Derivation of Structure Functions
General
For each new model domain, in order to carry out upper air data assimilation (3DVAR or 4DVAR) one needs to generate background error covariances (generally referred to as structure functions). The recommended procedure is to use a two step approach. In step one you generate background error statistics by downscaling (this is needed since you do not have have statistics for your domain setup for this forecast model version and physics options, so that you cannot run data-assimilation (unless you use statistics from old system possibly derived from a slighthly different domain and with a different model version, which is not recommended). In step 2 you then use the statistics derived in step 1 to generate the final background error statistics files by applying ensemble data assimilation within the HARMONIE-AROME modelling system.
In step 1 structure functions are generated from difference fields from ensemble members of HARMONIE-AROME forecast. These are obtained from downscaling of ECMWF EDA ensemble forecast. To alleviate spin-up issues, these phase 1 downscaled HARMONIE-AROME forecasts are run up to 6 hours, and differences are generated from these. Using the ECMWF LBC data, 6h HARMONIE ensemble forecasts are initiated from ECMWF 6h forecasts daily from 00 UTC and 12 UTC, with ECMWF forecasts as initial and lateral boundary conditions. To obtain stable statistics, it is recomended to run 4 ensembles for two chosen one-month episode (s). The episodes should sample different seasons. Therefore it is recommended to run for one winter month and one summer month, for example June 2016 and January 2017. These periods are chosen so as to benefit from the latest upgrade to ECMWF's EDA system. Thereby we sample both seasonal (January, July) and daily (00 UTC and 12 UTC) variations. After running of the ensembles the archived results (6h forecasts) are processed to generate structure functions by running a program called 'festat'. Festat will be run automatically within the SMS system when DTGEND is approached by an experiment and the statistics will be based on difference files generated by intermediate program femars and stored on ecfs in ec:/$uid/harmonie/$exp/femars
(software to generate binary GRIB files of forecasts differences after each cycle). This will mean that if you start by running a one month experiment for January the structure functions generated when you reach DTGEND will be for January. When you use the same experiment name and launch also an experiment for July you will when you reach DTGEND have background error statistics based on both January and July differences files (since both of those are now found in ec:/$uid/harmonie/$exp/femars
). These combined winter/summer background error statistics files from phase one are final product from step 1 and can are the intermediate background error statistics files to plug into the HARMONIE-AROME data assimilation of step 2. It should be mentioned that there is a possibility for the more advanced user to run festat off-line and with any combinations of January-July forecast difference files from ec:/$uid/harmonie/$exp/femars
. That will be described in ore detail further below and is something you might want to do with forecasts difference files generated from step 2 to produce monthly background error statistics files by combining in different ways.
In step 2 we run again two one-month ensemble experiments for the same Januar and July months again utilizing ECMWF EDA forecasts as lateral boundary conditions. Again you use 4 ensemble members. The important difference as compared to step 1 is that you now carry out ensemble-data assimilation also within the HARMONIE-AROME framework. You use the background error statistics from phase 1 and do the eda within a data assimilation cycle. This has the important advantage that you significantly reduce spinup caused by the HARMONIE-AROME model adjustments to ECMWF EDA starting initial states. Because of this we can in step 2 derive the statistics from +3h forecast difference (rather than +6 that is used in step 1).
Note that there are methods to circumvent step1 and to technically run 3/4DVAR using structure functions derived from another HARMONIE model domain. Such existing methods include aspects such as horizontal truncation or extrapolation of horizontal spectra and possibly vertical interpolation in between vertical level geometries. Since the recommended procedure is to use the two stp approach described above these alternative methods are not described in detail. Furthermore it should be noted that there are background error covariance related tuning coefficients REDNMC and REDZONE
. Settings of values of these ae not covered here. If you have a new domain you will use the default value 0.6 for REDNMC
and 100 for REDZONE
which are considered apropiate values for the derivation of structure functions. If you re-derive your statistics for an existing domain you will use the REDNMC
and REDZONE
values as assigned in scr/include.ass
.
There are various existing tools for investigating your newly derived structure functions and at the end of this page there are some documentation of existing tools and how to use them.
The procedure for generating structure functions from an ensemble of forecasts is described below for a AROME setup with 2.5 km horizontal resolution and 65 vertical levels. The experiment is run for a one mont winter-period of followed by a one month summer-period on the ECMWF computing system. Forecast differences are derived twice a day (00 forecasts from 12 UTC) from combinations of the four ensemble members. Besides the scientific recommendation to cover many different weather situations there is as well a matemathical constraint that the number of forecast difference files provided to festat needs to be larger than the number of vertical levels used in the forecast model integration. In the section below detailed instructions on how to generate the structure functions are given. The other sections deals with how to diagnose the structure functions recent and ongoing work and future development plans.
It is recommended for future coming enhancements regarding handling of B statistics and diagnostics of it to save all generated forecast difference files as well as stabal.cv, stabal.cvt and stabal.bal and generated .xy
and .y
files (.cvt
.xy
and .y
for diagnotical puroposes):
Generating background error statistics (using 43h2.2)
The following instructions are valid for trunk and any 43h2.2 tags that have been created. These instructions will only work at ECMWF. If you do have a new domain (or are not sure) you should follow that route in step 1 below. New domain creation is described in ModelDomain which links to the useful Domain Creation Tool
STEP 1 Downscaling
Create a new experiment on ECMWF:
In case you do have an existing domain setup do:
mkdir -p $HOME/hm_home/jbdownexp cd $HOME/hm_home/jbdownexp ~hlam/Harmonie setup -c JBDSC -r ~hlam/harmonie_release/git/tags/harmonie-43h2.2.1 -d DOMAIN # where domain is the name of your domain
In case you are creating structure functions for a new domain (or you are not sure):
mkdir -p $HOME/hm_home/jbdownexp cd $HOME/hm_home/jbdownexp ~hlam/Harmonie setup -c JBDSC -r ~hlam/harmonie_release/git/tags/harmonie-43h2.2.1 ~hlam/Harmonie co scr/Harmonie_domains.pm
Then edit scr/Harmonie_domains.pm and add your new domain definition.
The ensemble that will be used to generate the structure functions needs to be defined in suites/harmonie.pm. An edited ensemble configuration file should define a four member ensemble that only varies the boundary memeber input (ENSBDMBR) as follows:
%env = ( # 'ANAATMO' => { 0 => '3DVAR' }, # 'HWRITUPTIMES' => { 0 => '00-21:3,24-60:6' }, # 'SWRITUPTIMES' => { 0 => '00-06:3' }, # 'HH_LIST' => { 0 => '00-21:3' }, # 'LL_LIST' => { 0 => '36,3' }, # 'LSMIXBC' => { 0 => 'no' }, # 'ANASURF' => { 0 => 'CANARI_OI_MAIN' }, 'ENSCTL' => [ '001', '002', '003', '004'], # 'OBSMONITOR' => [ 'obstat'], # SLAFLAG: Forecast length to pick your perturbation end point from # SLAFDIFF: Hours difference to pick your perturbation start point from # SLAFLAG=24, SLAFDIFF=6 will use +24 - +18 # SLAFDIFF=SLAFLAG will retain the original SLAF construction # SLAFK should be tuned so that all members have the same perturbation size 'ENSBDMBR' => [ 1,2,3,4], # 'SLAFLAG' => [ 0, 6, 6, 12, 12, 18, 18, 24, 24, 30, 30], # 'SLAFDIFF' => [ 0, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6], # 'SLAFK' => ['0.0','1.75','-1.75','1.5','-1.5','1.2','-1.2','1.0','-1.0','0.9','-0.9'], # When using ECMWF ENS the members should be defined # # 'ENSBDMBR' => [ 0, 1..10], ### Normally NO NEED to change the settings below
Run for two one-month (30 day) periods:
cd $HOME/hm_home/jbdownexp ~hlam/Harmonie start DTG=2016060100 DTGEND=2016070100 # #~hlam/Harmonie start DTG=2017010100 DTGEND=2017013100
Generate the statistics using festat standalone:
Place yourself at
$TEMP
on ECMWFCopy Festat.standalone to
$TEMP
on ECMWFEdit the script to reflect your user and experiment details (in particular copy femars data
ec:/$uid/harmonie /jbdownexp/femars/
to your femars-directory on$TEMP
)submit with
qsub ./Festat.standalone
you will get a log-file
festat.log
on$TEMP
and results in directoryfestat_wrk
. when the program has finished do:cd festat_wrk emkdir ec:/$uid/jbdata gzip stab_your_exp.cv gzip stab_your_exp.bal ecp stab_your_exp.cv.gz ec:/$uid/jbdata/. (with your own filename and directory) ecp stab_your_exp.bal.gz ec:/$uid/jbdata/. (with your own filename and directory)
(also create a tar.file with all
*.xy
*.y
*.cv
,*.bal
and*.cvt
and put on ecfs for future diagnostical purposes)
STEP 2 Generating background error statistics with EDA cycling (instructions under testing)
Create a new experiment on ECMWF:
In case you do have an existing domain setup do:
mkdir -p $HOME/hm_home/jbedaexp cd $HOME/hm_home/jbedaexp ~hlam/Harmonie setup -c AROME_JBEDA -r ~hlam/harmonie_release/git/tags/harmonie-43h2.2.1 -d DOMAIN # where domain is the name of your domain
In case you are creating structure functions for a new domain (or you are not sure):
mkdir -p $HOME/hm_home/jbedanexp cd $HOME/hm_home/jbedaexp ~hlam/Harmonie setup -c AROME_JBEDA -r ~hlam/harmonie_release/git/tags/harmonie-43h2.2.1 ~hlam/Harmonie co scr/Harmonie_domains.pm
Then edit
scr/Harmonie_domains.pm
and add your new domain definition.The ensemble that will be used to generate the structure functions needs to be defined in suites/harmonie.pm. An edited ensemble configuration file should define a four member ensemble that only varies the boundary memeber input (
ENSBDMBR
) as follows:%env = ( # 'ANAATMO' => { 0 => '3DVAR' }, # 'HWRITUPTIMES' => { 0 => '00-21:3,24-60:6' }, # 'SWRITUPTIMES' => { 0 => '00-06:3' }, # 'HH_LIST' => { 0 => '00-21:3' }, # 'LL_LIST' => { 0 => '36,3' }, # 'LSMIXBC' => { 0 => 'no' }, # 'ANASURF' => { 0 => 'CANARI_OI_MAIN' }, 'ENSCTL' => [ '001', '002', '003', '004'], # 'OBSMONITOR' => [ 'obstat'], # SLAFLAG: Forecast length to pick your perturbation end point from # SLAFDIFF: Hours difference to pick your perturbation start point from # SLAFLAG=24, SLAFDIFF=6 will use +24 - +18 # SLAFDIFF=SLAFLAG will retain the original SLAF construction # SLAFK should be tuned so that all members have the same perturbation size 'ENSBDMBR' => [ 1,2,3,4], # 'SLAFLAG' => [ 0, 6, 6, 12, 12, 18, 18, 24, 24, 30, 30], # 'SLAFDIFF' => [ 0, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6], # 'SLAFK' => ['0.0','1.75','-1.75','1.5','-1.5','1.2','-1.2','1.0','-1.0','0.9','-0.9'], # When using ECMWF ENS the members should be defined # # 'ENSBDMBR' => [ 0, 1..10], ### Normally NO NEED to change the settings below
Link to your newly generated Jb statistics from STEP1 :
Edit in
$HOME/hm_home/jbedaexp/scr/include.ass
as follows (example forDOMAIN=METCOOP25D
): In the section for your relevant domain point to the structure function stored in STEP one as follows:elif [ "$DOMAIN" = YOUR DOMAIN]; then JBDIR=${JBDIR-"ec:/hirlam/harmonie_jbdata"} JBDIR=ec:/$uid/jbdata f_JBCV=stabfiltn_your_exp.cv_jbconv.cv (without .gz) f_JBBAL=stabfiltn_your_exp.bal_jbconv.bal (without.gz)
Run for two one-month (30 day) periods:
cd $HOME/hm_home/jbedaexp ~hlam/Harmonie start DTG=2016060100 DTGEND=2016070100 # #~hlam/Harmonie start DTG=2017010100 DTGEND=2017013100
Generate the statistics using festat standalone:
- Place yourself at
$TEMP
on ECMWF - Copy Festat.standalone to
$TEMP
at ECMWF - Edit the script to reflect your user and experiment details (in particular copy femars data
ec:/$uid/harmonie/jbdownexp/femars/
to femars-directory on$TEMP
) - Make sure you have removed old femars_wrk directory and only have forecast differences from you EDA experiment in your femars directory As well preferably name files differently than in STEP 1 downscaling.
- Place yourself at
submit with
```bash
qsub ./Festat.standalone
```
you will get a log-file `festat.log` on `$TEMP` and results in directory `festat_wrk`
when the program has finished do:
```bash
cd festat_wrk
emkdir ec:/smx/jbdata (with smx replaced with your own user id)
gzip stab_your_eda_exp.cv
gzip stab_your_eda_exp.bal
ecp stab_your_eda_exp.cv.gz ec:/smx/jbdata/. (with your own filename and directory)
ecp stab_your_eda_exp.bal.gz ec:/smx/jbdata/. (with your own filename and directory)
```
also create a tar-file with all `*.xy`, `*.y`, `*.cv`, `*.bal` and `*.cvt` and put on ecfs for future diagnostical purposes) These new files are you final background error statistics to be diagnosed (compared with STEP 1 ones perhaps) and inserted to your data assimilation by modyfying `include.ass` (as in bullet 3 above) to point to your new files.
Diagnosis of background error statistics
Diagnosis of background error statistics is a rather complicated task. To get an idea of what the correlations and covariances should look like take a look in the article: Berre, L., 2000: Estimation of synoptic and meso scale forecast error covariances in a limited area model. Mon. Wea. Rev., 128, 644-667. Software for investigating and graphically illustrate different aspects of the background error statistics has been developed and statistics generated for different domains has been investigated using the AccordDaTools package. With this software you can also compare your newly generated background error statistics with the one generated for other HARMONIE domains. This will give you and idea if your statistics seems reasonable. For diagnosing the newly derived background error statistics follow these instructions:
Get the code and scripts:
- Download and install AccordDaTools following instructions in the README
- Don't forget to add the package tools directory to your PATH:
export PATH=/path/to/da_tools:$PATH
Run Jb diagnostics script:
- For example for a new domain using horizontal grid-spacing of 2500 m and (Harmonie) 65 vertical levels:
jbdiagnose -b jb_data/stab_IRELAND25_064_480.bal -c jb_data/stab_IRELAND25_064_480.cv -g 2500 -l harmL65 -e jbdiag_IRELAND25_064
- The output will be made written to
jbdiag_IRELAND25_064
- For example for a new domain using horizontal grid-spacing of 2500 m and (Harmonie) 65 vertical levels:
The AccordDaTools package also provides two tools for plotting the data produced by
jbdiagnose
,plotjbbal
andplotjbdiag
.plotjbbal
plots Jb balances for different parameters.plotjbdiag
produces spectral density (spdens) and vertical correlation (vercor) diagnostic plots for your structure funtions. For example:plotjbbal:
plotjbbal -t stdv -p QQ -r jbdiag_ -e IRELAND25_064
plotjbdiag:
plotjbdiag -l 50 -t vercor -p QQ -r jbdiag_ -e IRELAND25_064
Run 3DVAR/4DVAR with the new background error statistics
create
hm_home/jb_da
. Thencd $HOME/hm_home/jb_da
.create experiment by typing
~hlam/Harmonie setup -r ~hlam/harmonie_release/git/tags/harmonie-43h2.2.1
In
scr/include.ass
setJBDIR=ec:/$uid/jbdata
(uid being your userid, in this example 'ec:/smx/jbdata') andf_JBCV
is name of your.cv
file inec:/$uid/jbdata
(without.gz
) andf_JBBAL
is 'name of your.bal
file inec:/$uid/jbdata
(without.gz
) (in this example,f_JBCV=stab_METCOOPD_65_20200601_360.cv
,stab_METCOOPD_65_20200601_360.bal
). Add these three lines instead of the three lines ininclude.ass
that follows right after the elif statement:elif [ "$DOMAIN" = METCOOP25D]; then
. If domain is other thanMETCOOP25D
one has to look for the alternative name of the domain.From
$HOME/hm_home/jb_da
launch experiment by typing~hlam/Harmonie start DTG=2021010100 DTGEND=2021010103
The resulting analysis file be found under
$TEMP/hm_home/jb_da/archive/2021/01/01/03
and it will be calledMXMIN1999+0000
and on andec:/$uid/harmonie/2021/01/01/03
. To diagnose the 3D-VAR analysis increments of thejb_da
-experiment, copy the filesMXMIN1999+0000
(analysis) andICMSHHARM+0003
(fg) to$SCRATCH
. The first guess (background) file can be found on$TEMP/hm_home/jb_da/archive/2021/01/01/00
andec:/$uid/harmonie/jb_da/2021/01/01/00
. Convert from FA-file format to GRIB with the gl-software ($SCRATCH/hm_home/jb_da/bin/gl
) by typing./gl -p MXMIN1999+0000
and./gl -p ICMSHANAL+0000
. Then plot the difference between files file with your favorite software. Plot horizontal and vertical cross-sections of temperature and other variables using your favourite software (epygram for example).Now you have managed to insert the newly generated background error statistics to the assimilation system and managed to carry out a full scale data assimilation system and plot the analysis increments. The next natural step to further diagnose the background error statistics is to carry out a single observation impact experiment, utilizing your newly generated background error statistics. Note the variables
REDNMC
andREDZONE
ininclude.ass
.REDNMC
is the scaling factor for the background error statistics (default value 0.6/0.9) forMETCOOP25D/NEW_DOMAIN
).REDZONE
described how far from the lateral boundaries (in km) the observations need to be located to be assimilated (default value 150/100) forMETCOOP25D/NEW_DOMAIN
.
In-line Interpolation and Extrapolation of Jb-statistics
In case you do not have existing background error statistics derived for your domain there is a built technical possibility to use Jb-files from another domain derived with the same number of vertical levels. From this host Jb-files background error statistics are then interpolated or extrapolated to the current domain configuration. The assumption is then (which is in general questionable) that the statistics derived derived on the host domain is as well valid for the current domain. If the longest side of the host domain is shorter than the longest side of the current domain an extrapolation of background error covariance spectra is needed. Such extrapolation should be avoided over a wide range of wavenumbers. Therefore it is recommended that the longest side of the host Jb-file is as long or longer than the longest side of the current domain.The interpolation is invoked by in ecf/config_exp.h
set JB_INTERPOL=yeś
and JB_REF_DOMAIN=$HOST_JB
, where $HOST_JB
is for example METCOOP25B
. These settings will activate runnning of script jbconv.sh
(in case no Jb files present for current domain), called from Fetch_assim_data
.
On-going work & future developments
Recent and on-going work as well as plans for future developments:
- Ongoing-work regarding structure functions concerns investigations of effects on B statistics and data assimilation of the upper-level relaxation towards ecmwf at upper boundary condition through
LUNBC=.true.
Longer term research is towards flow dependent background error statistics and close link between the data assimilation and the ensemble forecasting system. Plans for future work also include adopting towards use of cy46 Festat.standalone, reading FA-files rather than femars-files. Here is a newly developed stand-alone tool for interpolation in of Jb-statistics as well between different vertical levels (not recommended) not yet publicly available and documented. Finally it should be mentioned that there are alternative methods toEDA
for carrying out STEP 2 of teh background error statistics derivation. Such alternatives areBRAND
andBREND
and these have been tested and compared withEDA
in various contexts, such as in reanalysis frameworks. The conclusion is that there are both pros and cons withBRAND
as compared withEDA
. The main conclusion is that bothEDA
andBRAND
are hampered by the homogeneity and isotrophy assumptions in 3DVAR/4DVAR framework, so that differences are smaller than in hybrid DA frameworks. Therefore continuedEDA/BRAND
comparisons are carried out withing hybrid ensemble/da frameworks. Nevertherless we aim here to include as well instructions for optionally replacing STEP 2EDA
in procedure above with STEP 2BRAND
. As well we aim for introducing instructions for using extended complementary diagnosis tools for Jb statistics using fediacov tool and associated plotting scripts. Such tools do exist, but not yet publicly available and documented
References
- festat_guidelines, Ryad El Katib, Meteo France, 2014
- festatforfa_guidelines, Ryad El Katib, Meteo France, 2016