Ensemble mode in the Harmonie script system
Overview
- Purpose
- Prerequisites
- Option checking
- EPS in the tdf file
- Make member specific namelist changes
Purpose
The purpose of this document is to give more details about how ensemble mode works in the Harmonie script system than can easily be found in other pages. It is meant for system people and developers who need to understand or extend the functionality of HarmonEPS. Such extensions could be e.g. implementation of new initial perturbation techniques.
Prerequisites
You should also read the Howto to get acquainted with what is already implemented.
Having read the prerequisite pages, you know that an ensemble experiment is not very different from a deterministic one: you only need to set a few ensemble-related variables (ENSMSEL, ENSINIPERT, etc.) in ecf/config_exp.h and then make some member-specific exceptions in the Perl "module" suites/harmonie.pm. But there is more going on behind the scenes.
First of all, the ENSMSEL member selection variable exists for convenience; what suites/harmonie.tdf and other scripts actually use is an expanded version of it called ENSMSELX. The expansion is done in the script scr/Start by invoking scr/Ens_util.pl, which is also used to set a couple of other convenience variables:
# Compute derived EPS quantities, needed in harmonie.tdf
export ENSSIZE ENSMFIRST ENSMLAST
ENSSIZE=`perl -S Ens_util.pl ENSSIZE`
ENSMFIRST=`perl -S Ens_util.pl ENSMFIRST`
ENSMLAST=`perl -S Ens_util.pl ENSMLAST`
ENSCTL=`perl -S Ens_util.pl ENSCTL $ENSCTL`
For example, if ENSMSEL=0-8:2, then ENSMSELX=000:002:004:006:008, i.e., a colon-separated list of 3-digit numbers. In the same example, we will have ENSMFIRST=000 and ENSMLAST=008. ENSSIZE will be 5.
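The expansion described above can be sketched in plain shell. This is only an illustration: expand_ensmsel is a hypothetical helper, not the code in Ens_util.pl, and the accepted syntax (comma-separated items of the form N, N-M or N-M:S) is an assumption based on the single example given here.

```shell
# Hypothetical stand-in for the expansion done by Ens_util.pl (not the real code).
# Assumed syntax: comma-separated items N, N-M or N-M:S, without leading zeros.
expand_ensmsel() (
  IFS=','
  out=""
  for item in $1; do
    range=${item%%:*}                  # part before the optional ":step"
    step=1
    case $item in *:*) step=${item##*:} ;; esac
    first=${range%%-*}
    last=${range##*-}
    [ "$step" -ge 1 ] || exit 1        # guard against an endless loop
    m=$first
    while [ "$m" -le "$last" ]; do
      out="${out:+$out:}$(printf '%03d' "$m")"
      m=$((m + step))
    done
  done
  printf '%s\n' "$out"
)

# Derive the convenience variables from the expanded list
# (assumes the list is in ascending order).
ENSMSELX=$(expand_ensmsel "0-8:2")
ENSMFIRST=${ENSMSELX%%:*}
ENSMLAST=${ENSMSELX##*:}
ENSSIZE=$(printf '%s\n' "$ENSMSELX" | awk -F: '{ print NF }')
echo "ENSMSELX=$ENSMSELX ENSMFIRST=$ENSMFIRST ENSMLAST=$ENSMLAST ENSSIZE=$ENSSIZE"
```

For ENSMSEL=0-8:2 this reproduces the values quoted in the example: five members, 000 through 008.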
Option checking
As explained in the prerequisite documents, variables normally set in ecf/config_exp.h can be overridden for specific ensemble members in suites/harmonie.pm. But how is it verified that the chosen combinations make sense for each member?
The main script that checks for a sensible combination of options in Harmonie is scr/CheckOptions.pl. It is run from the Start script, before ecFlow is launched, in order to catch problems as early as possible. CheckOptions.pl reads ecf/config_exp.h and creates a new file ecf/config_updated.h (under the "merged" repository directory $HM_LIB rather than your experiment directory $HM_WD). In config_updated.h you will find environment variables that are derived from others; e.g., many domain-specific variables (NLON, NLAT, TSTEP, etc.) are derived from $DOMAIN. Every ECF task in the system includes first config_exp.h and then config_updated.h. At the very bottom of config_updated.h we find:
if [ ${ENSMBR--1} -ge 0 ]; then
  if [ -s $HM_LIB/ecf/config_mbr$ENSMBR.h ]; then
    . $HM_LIB/ecf/config_mbr$ENSMBR.h
  fi
fi
That is, in ensemble mode, if ecf/config_mbr$ENSMBR.h exists, it will also be sourced by every script. Finally, the way these member-specific config files are created can be seen from this passage in the Start script:
# Check options for individual ensemble members if relevant
if [ $ENSSIZE -gt 0 ]; then
  perl -S CheckMemberOptions.pl || exit
fi
scr/CheckMemberOptions.pl includes the member-specific harmonie.pm, of course, and then loops over all the selected members in turn, running CheckOptions.pl with the particular environment settings for the member in question. If a member ($ENSMBR) passes the tests, the file mentioned above, $HM_LIB/ecf/config_mbr$ENSMBR.h, is created. It will contain settings for those environment variables mentioned in harmonie.pm that differ from the default settings in config_exp.h. This makes the correct variables available to every script, without those scripts having to repeat the Perl &Env checking in harmonie.pm.
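The layering of the config files can be tried out in isolation. The following is a minimal sketch, assuming a throw-away directory in place of $HM_LIB/ecf and using FCINT as the overridden variable; the helper effective_fcint is hypothetical, and for brevity it sources the member file directly rather than from within config_updated.h.

```shell
# Throw-away directory standing in for $HM_LIB/ecf (assumption for this demo).
demo_ecf=$(mktemp -d)
trap 'rm -rf "$demo_ecf"' EXIT

printf 'FCINT=6\n'  > "$demo_ecf/config_exp.h"      # experiment default
: > "$demo_ecf/config_updated.h"                    # derived variables would go here
printf 'FCINT=12\n' > "$demo_ecf/config_mbr003.h"   # override for member 003 only

# Print the FCINT that a task for member $1 would end up with.
effective_fcint() (
  ENSMBR=$1
  . "$demo_ecf/config_exp.h"
  . "$demo_ecf/config_updated.h"
  if [ "${ENSMBR:--1}" -ge 0 ]; then
    if [ -s "$demo_ecf/config_mbr$ENSMBR.h" ]; then
      . "$demo_ecf/config_mbr$ENSMBR.h"
    fi
  fi
  echo "$FCINT"
)

effective_fcint 003   # member with an override file
effective_fcint 001   # member without one
effective_fcint -1    # non-ensemble run
```

Only member 003 picks up the overridden value; the other two calls fall back to the default from config_exp.h.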
EPS in the tdf file
In harmonie.tdf we have many loop constructs like the following example from the MakeCycleInput family:
family Cycle
task Prepare_cycle
loop(EEE,$ENV{ENSMFIRST},$ENV{ENSMLAST})
if( $ENV{ENSMSELX} =~ /@EEE@\b/ and '@EEE@' ne '-1' )
family Mbr@EEE@
trigger ( Prepare_cycle == complete )
complete ( (../../Hour:HH + 24 - $ENV{BeginHour}) % &Env('FCINT','@EEE@') )
edit ENSMBR @EEE@
task Prepare_cycle
endfamily
endif
endloop
...
As can be seen, every loop over ensemble members runs from the smallest number found in ENSMSEL (ENSMFIRST) to the largest (ENSMLAST) in steps of 1, but something is written to harmonie.def for a potential member only if its number (@EEE@) is present in the expanded list ENSMSELX. The Perl operator =~ is the pattern match operator, and \b means a word boundary (: or end of string in our case).
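The selection logic of such a loop can be mimicked in shell. This is a sketch only: selected_families is a hypothetical helper, and it tests membership by exact match between colons instead of the Perl pattern used in the tdf file.

```shell
# Print one "Mbr###" line per selected member, walking from the first to the
# last member in steps of 1 and skipping numbers absent from ENSMSELX.
# $1: ENSMSELX, a colon-separated list of 3-digit numbers in ascending order.
selected_families() (
  ensmselx=$1
  first=${ensmselx%%:*}
  last=${ensmselx##*:}
  # awk reads 008 as decimal 8, so leading zeros are harmless here
  awk -v a="$first" -v b="$last" \
      'BEGIN { for (i = a + 0; i <= b + 0; i++) printf "%03d\n", i }' |
  while read -r eee; do
    case ":$ensmselx:" in
      *":$eee:"*) echo "Mbr$eee" ;;   # member is selected
    esac
  done
)

selected_families "000:002:004:006:008"
```

With the list from the earlier example, the candidates 001, 003, 005 and 007 are visited but produce no family.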
Note also how every member has its own family Mbr@EEE@ (which will expand to Mbr000, Mbr002, etc. in the harmonie.def file). Another important thing to note is the setting
edit ENSMBR @EEE@
This will first create an ECF variable %ENSMBR% with a different value for each member, which is then turned into a shell variable $ENSMBR in ecf/head.h, which is included by all ECF tasks. From head.h:
ENSMBR=%ENSMBR%
if [ ${ENSMBR--1} -ge 0 ]; then
  ENSMBR=`echo %ENSMBR% | awk '{printf "%%3.3d",$1}'`
  CYCLEDIR=%YMD%_${HH}/mbr$ENSMBR
fi
export ENSMBR
The upshot is that to get the ensemble member number in a script you should use $ENSMBR. In non-ensemble runs ENSMBR will have the value -1.
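What the head.h fragment does to a member number can be checked outside ecFlow with a small sketch. The helper member_env is hypothetical and the date and hour are arbitrary; note that the doubled %% in head.h is the ecFlow escape for a literal percent sign, so plain shell uses a single %.

```shell
# member_env RAW YMD HH: print the ENSMBR and CYCLEDIR a task would see.
# RAW is the value substituted for the ECF variable, -1 for non-ensemble runs.
member_env() (
  CYCLEDIR=
  ENSMBR=${1:--1}
  if [ "$ENSMBR" -ge 0 ]; then
    # zero-pad to three digits, as head.h does with awk
    ENSMBR=$(echo "$ENSMBR" | awk '{ printf "%3.3d", $1 }')
    CYCLEDIR=${2}_${3}/mbr$ENSMBR
  fi
  echo "ENSMBR=$ENSMBR CYCLEDIR=${CYCLEDIR:-(not member specific)}"
)

member_env 2 20240101 06
member_env -1 20240101 06
```

The first call yields the padded member number and a member-specific cycle directory; the second leaves ENSMBR at -1 and does not touch CYCLEDIR.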
The statement
complete ( (../../Hour:HH + 24 - $ENV{BeginHour}) % &Env('FCINT','@EEE@') )
deserves more explanation. It is present to account for the fact that not all members need to have the same "forecast interval" FCINT. The Hour families in the tdf now look like this:
family Hour
repeat integer HH &Env('FirstHour','min') 23 &Env('FCINT','min')
i.e., the loop steps with the minimum FCINT value found among the members. Thus, if e.g. some members have FCINT=6 and some FCINT=12, the complete statement above sets the family immediately complete for members with FCINT=12 at those cycles whose offset from the first cycle (BeginHour) is not a multiple of 12. That is, if the run was started at a 06 or 18 cycle, members with FCINT=12 will be complete (not run) at the 00 and 12 cycles, and if the run was started at a 00 or 12 cycle, they will not run at the 06 and 18 cycles. This behaviour has confused many users and should perhaps be changed.
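The arithmetic behind the complete expression is easy to verify by hand or with a few lines of shell. The function member_runs below is a hypothetical helper that evaluates the same expression: a non-zero remainder means the family is set complete, i.e. the member is skipped at that cycle.

```shell
# Return 0 if the member runs at hour HH, 1 if its family is set complete.
# Arguments are plain integers without leading zeros (08 would be read as octal).
# $1 = BeginHour, $2 = HH, $3 = FCINT of the member
member_runs() {
  [ $(( ($2 + 24 - $1) % $3 )) -eq 0 ]
}

# A run started at the 06 cycle, for a member with FCINT=12
for hh in 0 6 12 18; do
  if member_runs 6 "$hh" 12; then
    echo "HH=$hh: run"
  else
    echo "HH=$hh: complete (skipped)"
  fi
done
```

For BeginHour=6 the remainders are 6, 0, 6 and 0, so the member runs at 06 and 18 only, in line with the description above.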
Make member specific namelist changes
In Harmonie most namelists are created on the fly from the namelist dictionary. This allows us to make member specific changes to the namelists used in e.g. the forecast. In the following we will describe two ways of doing this.
Through harmonie.pm
Assume we would like to change one parameter in the physics. First we make the namelist variable depend on an environment variable in nam/harmonie_namelists.pm:
NAMPHY0=>{
'ALMAV' => "$ENV{ALMAV}",
...
},
Second, we make sure in suites/harmonie.pm that this environment variable is specified for each member, in this case four:
'ALMAV' => [ '200.','100.','50.','300.'],
Finally, we have to make sure that the variable is exported in ecf/config_exp.h:
export ALMAV
Make changes to the namelist generation
Another way is to specify a set of namelist changes for each member in nam/harmonie_namelists.pm. We could simply add a definition for e.g. the first member like
%member_001 = (
  NAERAD=>{
    'LRRTM' => '.FALSE.,',
  },
  NAMPHY0=>{
    'BEDIFV' => '0.05,',
  },
);
To activate the change we also need to change scr/Get_namelist, the script that builds the namelist for us, so that it takes the member_$ENSMBR change into account:
...
forecast|dfi|traj4d)
NAMELIST_CONFIG="$DEFAULT dynamics $DYNAMICS $PHYSICS ${DYNAMICS}_${PHYSICS} $SURFACE $EXTRA_FORECAST_OPTIONS member_$ENSMBR"
...
Repeat this for all your members with the changes you would like to apply.
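Since ENSMBR is -1 in non-ensemble runs, the line above would then ask for a set called member_-1. How Get_namelist treats a set that is not defined is not covered here, so one cautious option is to append the member set only in ensemble mode. The sketch below illustrates the idea; the function name and the placeholder values are assumptions, not taken from Get_namelist.

```shell
# Placeholder values for illustration only; in Get_namelist these come from
# the experiment configuration.
DEFAULT=default DYNAMICS=nh PHYSICS=arome SURFACE=surfex
EXTRA_FORECAST_OPTIONS=""

# Build the forecast NAMELIST_CONFIG, adding member_### only for ensemble members.
# $1: ENSMBR as set by head.h (3 digits, or -1 for non-ensemble runs)
forecast_namelist_config() (
  ENSMBR=${1:--1}
  config="$DEFAULT dynamics $DYNAMICS $PHYSICS ${DYNAMICS}_${PHYSICS} $SURFACE $EXTRA_FORECAST_OPTIONS"
  if [ "$ENSMBR" -ge 0 ]; then
    config="$config member_$ENSMBR"
  fi
  echo "$config"
)

forecast_namelist_config 001
forecast_namelist_config -1
```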