Running Harmonie on Atos

Before you start

New Harmonie users will require membership of the accord user group at ECMWF. Please contact the HIRLAM System Manager, Daniel Santos, to make this request on your behalf. Furthermore, ECMWF will have to set up a virtual machine for you to run the ecFlow server on (see here). Finally, make sure that your login shell is set to /bin/bash.

Tip

To share your experiments with the members of the accord group do:

chmod 755 $HOME $SCRATCH $PERM $HPCPERM
chgrp -R accord $HOME/hm_home $SCRATCH/hm_home $HOME/HARMONIE $HPCPERM/hm_home
chmod g+s $HOME/hm_home $SCRATCH/hm_home $HOME/HARMONIE $HPCPERM/hm_home

The chmod g+s sets the SGID bit, which ensures that new experiments created in hm_home will automatically be in the accord group.
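
To confirm that the SGID bit is in place, the shell's -g file test can be used. The sketch below is illustrative only: sgid_status is a hypothetical helper name, not part of Harmonie, and the demonstration runs on a throwaway directory that stands in for hm_home.

```bash
# Hypothetical helper: report whether the SGID bit is set on a directory.
sgid_status() {
    if [ -g "$1" ]; then
        echo "set"
    else
        echo "missing"
    fi
}

# Demonstration on a temporary directory (a stand-in for $HOME/hm_home).
demo_dir=$(mktemp -d)
chmod g+s "$demo_dir"
sgid_status "$demo_dir"
```

On Atos you would call the helper on the real directories instead, for example sgid_status $HOME/hm_home.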

Configure your experiment (option 1)

  • Create an experiment directory under $HOME/hm_home and use the master script Harmonie to set up a minimum environment for your experiment.

    mkdir -p $HOME/hm_home/my_exp
    cd $HOME/hm_home/my_exp
    ln -sf /path/to/git/repository/config-sh/Harmonie
    ./Harmonie setup -r /path/to/git/repository -h ECMWF.atos

    where

    • -r specifies the path to the git repository. Make sure you have checked out the correct branch.
    • -h tells which configuration files to use. At ECMWF config.ECMWF.atos is the default one. For harmonie-43h2.2 use -h config.aa
      Tip

      Atos tagged versions of Harmonie are available in ~hlam/harmonie_release/git/tags/

      ln -sf ~hlam/harmonie_release/git/tags/<taggedversion>/config-sh/Harmonie
      ./Harmonie setup -r ~hlam/harmonie_release/git/tags/<taggedversion> -h ECMWF.atos
  • This gives you the default setup, which is currently AROME physics with CANARI+OI_MAIN surface assimilation and 3DVAR upper air assimilation with 3h cycling on a domain covering Denmark using 2.5km horizontal resolution and 65 levels in the vertical.
  • Now you can edit the basic configuration file ecf/config_exp.h to configure your experiment scenarios. Modify specifications for domain, data locations, settings for dynamics, physics, coupling host model etc. Read more about the options here. You can also use some of the predefined configurations by calling Harmonie with the -c option:
    ./Harmonie setup -r PATH_TO_HARMONIE -h YOURHOST -c CONFIG -d DOMAIN
    where CONFIG is one of the setups defined in scr/Harmonie_configurations.pm. If you give -c without an argument or with a non-existent configuration, a list of configurations will be printed.
  • In some cases you might have to edit the general system configuration file config-sh/config.ECMWF.atos. See here for further information.
  • The rules for how to submit jobs on Atos are defined in config-sh/submit.ECMWF.atos. See here for further information.
  • If you experiment with data assimilation you might also want to change settings in scr/include.ass.

Configure your experiment using github repo (option 2)

A disadvantage of option 1 for version control in git is that the code is located in two places. Instead you can:

  • Make a fork of the Harmonie repository. From now on we assume your fork is located at https://github.com/<user>/Harmonie.
  • Log in to ATOS as usual and perform the following commands:
    mkdir -p $PERM/hm_home && cd $PERM/hm_home/
    git clone -b <remote_branch> git@github.com:<user>/Harmonie.git <exp_name>
    cd <exp_name>
    git checkout -b <feature/branch_name>
    export PERL5LIB=$(pwd)
    config-sh/Harmonie setup -r $(pwd) -h ECMWF.atos

Here the git clone command clones a specific branch into a directory called <exp_name>. git checkout with the -b flag then creates a new branch for you to work on; call it something meaningful. The experiment is then set up as usual, but with your local repository acting as its own reference.

Then you do some work and when ready to commit something you do

git add <path to modified file>
git commit --author "Name <name@host>" -m "Commit message"
git push --set-upstream origin <feature/branch_name>

Specifying --set-upstream origin <feature/branch_name> to git push is only necessary the first time you push your branch to the remote. When ready you can now go to GitHub and make a pull-request to the Harmonie repository from your fork.
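
Whether the --set-upstream form is still needed can be checked from the shell. The following is a minimal sketch, not part of Harmonie: has_upstream and suggest_push are hypothetical helper names, and the check relies on git rev-parse failing when the current branch has no upstream configured.

```bash
# Hypothetical helper: succeeds only if the current branch tracks a remote branch.
has_upstream() {
    git rev-parse --abbrev-ref --symbolic-full-name '@{u}' >/dev/null 2>&1
}

# Hypothetical helper: print the push command that fits the current state.
# It only prints the command; nothing is pushed.
suggest_push() {
    if has_upstream; then
        echo "git push"
    else
        echo "git push --set-upstream origin $(git symbolic-ref --short HEAD 2>/dev/null)"
    fi
}
```

Run suggest_push inside your experiment clone; it prints the plain git push once the branch has been pushed with --set-upstream.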

Start your experiment

Launch the experiment by giving the start time, DTG, and the end time, DTGEND:

./Harmonie start DTG=YYYYMMDDHH DTGEND=YYYYMMDDHH
# e.g., ./Harmonie start DTG=2022122400 DTGEND=2022122406
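
A mistyped DTG is easy to catch before launching. The sketch below is only an illustration under stated assumptions: is_valid_dtg and check_dtg_range are hypothetical helper names, not Harmonie commands, and the check covers only the YYYYMMDDHH format and the ranges of month, day and hour, not the calendar (so a date such as 2022023100 would pass).

```bash
# Hypothetical helper: succeed if $1 looks like a YYYYMMDDHH date-time group.
is_valid_dtg() {
    case "$1" in
        [0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9]) ;;
        *) return 1 ;;
    esac
    _mm=$(echo "$1" | cut -c5-6)
    _dd=$(echo "$1" | cut -c7-8)
    _hh=$(echo "$1" | cut -c9-10)
    # Prefix each field with 1 so that values like 08 compare as plain decimals.
    [ "1$_mm" -ge 101 ] && [ "1$_mm" -le 112 ] &&
    [ "1$_dd" -ge 101 ] && [ "1$_dd" -le 131 ] &&
    [ "1$_hh" -ge 100 ] && [ "1$_hh" -le 123 ]
}

# Hypothetical helper: succeed if both DTGs are valid and DTGEND is not before DTG.
check_dtg_range() {
    is_valid_dtg "$1" && is_valid_dtg "$2" && [ "$2" -ge "$1" ]
}

# Print the start command only if the period is plausible.
if check_dtg_range 2022122400 2022122406; then
    echo "./Harmonie start DTG=2022122400 DTGEND=2022122406"
fi
```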

If successful, Harmonie will identify your experiment name and start building your binaries and run your forecast. If not, you need to examine the ECFLOW log file $HM_DATA/ECF.log. $HM_DATA is defined in your Env_system file. At ECMWF $HM_DATA=$SCRATCH/hm_home/$EXP where $EXP is your experiment name. Read more about where things happen further down.
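
When a start fails, the end of ECF.log is usually the first place to look. A minimal sketch follows, assuming the $HM_DATA layout described above; show_ecf_log is a hypothetical helper name and my_exp is a placeholder experiment name.

```bash
# Hypothetical helper: print the last lines of ECF.log.
# $1 is the experiment data directory ($HM_DATA); $2 is the number of lines (default 50).
show_ecf_log() {
    _log="$1/ECF.log"
    if [ -f "$_log" ]; then
        tail -n "${2:-50}" "$_log"
    else
        echo "No ECF.log found in $1" >&2
        return 1
    fi
}

# At ECMWF, assuming the experiment is called my_exp:
# show_ecf_log "$SCRATCH/hm_home/my_exp"
```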

Continue your experiment

If your experiment has completed successfully and you would like to continue for another period, you should write

./Harmonie prod DTGEND=YYYYMMDDHH

By using prod you tell the system that you are continuing the experiment and using the first guess from the previous cycle. The start date is taken from a file progress.log created in your $HOME/hm_home/my_exp directory. If you had used start instead, the initial data would have been interpolated from the boundaries, in other words a cold start.

Start/Restart of ecflow_ui

To start the graphical window for ECFLOW, run

./Harmonie mon

The graphical window runs independently of the experiment and can be closed and restarted again with the same command. With the graphical interface you can control and view logfiles of each task.

Making local changes

Very soon you will find that you need to make changes in a script or in the source code. Once you have identified which file to edit, you put it into the current $HOME/hm_home/my_exp directory, with exactly the same subdirectory structure as in the reference. For example, if you want to modify a namelist setting:

./Harmonie co nam/harmonie_namelists.pm   # retrieve default namelist harmonie_namelists.pm
vi nam/harmonie_namelists.pm              # modify the namelist

Next time you run your experiment the changed file will be used. You can also make changes in a running experiment. Make the change you wish and rerun the InitRun task from the viewer. The InitRun task copies all files from your local experiment directory to your working directory $HM_DATA. Once your InitRun task is complete you can rerun the task you are interested in. If you wish to recompile something you will also have to rerun the Build tasks.
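
After a while it is easy to lose track of which reference files you have overridden. The sketch below lists the files in an experiment directory that also exist at the same relative path in the reference. It is an illustration only: list_local_overrides is a hypothetical helper name, and the two paths in the usage line are the placeholders used earlier on this page.

```bash
# Hypothetical helper: list files in the experiment directory ($1) that
# also exist at the same relative path in the reference ($2).
# Symbolic links (such as the Harmonie script itself) are skipped by -type f.
list_local_overrides() {
    ( cd "$1" && find . -type f | sed 's|^\./||' | sort ) | while IFS= read -r _f; do
        if [ -f "$2/$_f" ]; then
            echo "$_f"
        fi
    done
}

# Usage, with placeholder paths:
# list_local_overrides "$HOME/hm_home/my_exp" /path/to/git/repository
```

Piping each reported file through diff against its reference counterpart then shows exactly what your experiment changes.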

Issues

Harmonie exp stop at ECMWF(Atos) due $PERM mounting problem https://github.com/Hirlam/Harmonie/issues/628

Account

In order to change the billing account, open Env_submit and find the definition of scalar_job. Then add a line like

'ACCOUNT' => $submit_type.' --account=account_name'

to the definition of the dictionary.

Directory structure

$SCRATCH

In $SCRATCH/hm_home/$EXP you will find

Directory    Content
bin          Binaries
lib          Source code synced from $HM_LIB and compiled code
lib/src      Object files and source code (if you build with makeup, set by MAKEUP_BUILD_DIR)
lib/util     Utilities such as makeup, gl_grib_api or oulan
climate      Climate files
YYYYMMDD_HH  Working directory for the current cycle. If an experiment fails it is useful to check the IFS log file, NODE.001_01, in the working directory of the current cycle. The failed job will be in a directory called something like Failed_this_job.
archive      Archived files. A YYYY/MM/DD/HH structure for per cycle data. ICMSHHARM+NNNN and ICMSHHARM+NNNN.sfx are atmospheric and surfex forecast output files.
extract      Verification input data. This is also stored on the permanent disk $HPCPERM/HARMONIE/archive/$EXP/parchive/archive/extract
ECF.log      Log of job submission
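
To locate the archived output of a given cycle, the DTG can be turned into the YYYY/MM/DD/HH sub-path used under archive. A small sketch, in which dtg_to_archive_subdir is a hypothetical helper name and not a Harmonie tool:

```bash
# Hypothetical helper: map a YYYYMMDDHH date-time group to the
# YYYY/MM/DD/HH sub-path used under archive/.
dtg_to_archive_subdir() {
    echo "$1" | sed 's|^\(....\)\(..\)\(..\)\(..\)$|\1/\2/\3/\4|'
}

# Sub-path, relative to $SCRATCH/hm_home/$EXP, for the 2022122400 cycle.
echo "archive/$(dtg_to_archive_subdir 2022122400)"
```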

ECFS

  • Since $SCRATCH is cleaned regularly we need to store data permanently on ECFS, the EC file system, as well. There are two options for ECFS, ectmp and ec. The latter is permanent storage and the former is cleaned after 90 days. Which one you use is defined by the ECFSLOC variable. To view your data, type e.g.

    els ectmp:/$USER/harmonie/my_exp
  • The level of archiving depends on ARSTRATEGY in ecf/config_exp.h. The default setting will give you a YYYY/MM/DD/HH structure of per cycle data containing:

    • Surface analysis, ICMSHANAL+0000[.sfx]
    • Atmospheric analysis result MXMIN1999+0000
    • Blending between surface/atmospheric analysis and cloud variable from the first guess LSMIXBCout
    • ICMSHHARM+NNNN and ICMSHHARM+NNNN.sfx are atmospheric and surfex forecast model state files
    • PFHARM* files produced by the inline postprocessing
    • ICMSHSELE+NNNN.sfx are surfex files with selected output
    • GRIB files for fullpos and surfex select files
    • Logfiles in a tar file logfiles.tar
    • Observation database and feedback information in odb_stuff.tar.
    • Extracted files for obsmon in sqlite.tar
  • Climate files are stored in the climate directory

  • One directory each for vfld and vobs data respectively for verification data
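
The ECFS location of an experiment follows the pattern shown in the els example above. The sketch below builds that location from its parts so that the same string can be reused with the ECFS commands. ecfs_exp_path is a hypothetical helper name, and the first argument is assumed to be one of the two options named above (ec or ectmp).

```bash
# Hypothetical helper: build the ECFS location of an experiment.
# $1 is the ECFS option (ec or ectmp), $2 the user name, $3 the experiment name.
ecfs_exp_path() {
    case "$1" in
        ec|ectmp) echo "$1:/$2/harmonie/$3" ;;
        *) echo "Unknown ECFS option: $1" >&2; return 1 ;;
    esac
}

# Example (els is only available on ECMWF systems):
# els "$(ecfs_exp_path ectmp "$USER" my_exp)"
```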

$PERM

Directory        Content
HARMONIE/$EXP    ecflow log and job files
hm_lib/$EXP/lib  Scripts, config files, ecf and suite, source code (not compiled, set by $HM_LIB). Reference with experiment's changes on top

$HPCPERM

In $HPCPERM/hm_home/$EXP

Directory                  Content
parchive/archive/extract/  Verification input data.

$HOME on ecflow-gen-${user}-001

Directory       Content
ecflow_server/  ecFlow checkpoint and log files

Cleanup of old experiments

Danger

These commands may not work properly in all versions. Do not run the removal before you are sure it is OK.

Once you have completed your experiment you may wish to remove code, scripts and data from the disks. Harmonie provides some simple tools to do this. First check the content of the different disks by

Harmonie CleanUp -ALL

Once you have convinced yourself that this is OK you can proceed with the removal.

Harmonie CleanUp -ALL -go 

If you would like to exclude the data stored on e.g. ECFS (at ECMWF), or in more general terms stored under HM_EXP (as defined in Env_system), you run

Harmonie CleanUp -d

to list the directories intended for cleaning. Again, convince yourself that this is OK and proceed with the cleaning by

Harmonie CleanUp -d -go

You can always remove the data from ECFS directly by running e.g.

erm -R ec:/YOUR_USER/harmonie/EXPERIMENT_NAME 

or

erm -R ectmp:/YOUR_USER/harmonie/EXPERIMENT_NAME 
  • For more information about cleaning with Harmonie read here
  • For more information about the ECFS commands read here

Debugging Harmonie with ARM DDT

Follow the instructions here. Use the option "Run DDT client on your Personal Computer or End User Device".