Derivation of Structure Functions

General

For each new model domain, in order to carry out upper air data assimilation (3DVAR or 4DVAR) one needs to generate background error covariances (generally referred to as structure functions). The recommended procedure is to use a two step approach. In step one you generate background error statistics by downscaling (this is needed since you do not have have statistics for your domain setup for this forecast model version and physics options, so that you cannot run data-assimilation (unless you use statistics from old system possibly derived from a slighthly different domain and with a different model version, which is not recommended). In step 2 you then use the statistics derived in step 1 to generate the final background error statistics files by applying ensemble data assimilation within the HARMONIE-AROME modelling system.

In step 1 structure functions are generated from difference fields from ensemble members of HARMONIE-AROME forecast. These are obtained from downscaling of ECMWF EDA ensemble forecast. To alleviate spin-up issues, these phase 1 downscaled HARMONIE-AROME forecasts are run up to 6 hours, and differences are generated from these. Using the ECMWF LBC data, 6h HARMONIE ensemble forecasts are initiated from ECMWF 6h forecasts daily from 00 UTC and 12 UTC, with ECMWF forecasts as initial and lateral boundary conditions. To obtain stable statistics, it is recomended to run 4 ensembles for two chosen one-month episode (s). The episodes should sample different seasons. Therefore it is recommended to run for one winter month and one summer month, for example June 2016 and January 2017. These periods are chosen so as to benefit from the latest upgrade to ECMWF's EDA system. Thereby we sample both seasonal (January, July) and daily (00 UTC and 12 UTC) variations. After running of the ensembles the archived results (6h forecasts) are processed to generate structure functions by running a program called 'festat'. Festat will be run automatically within the SMS system when DTGEND is approached by an experiment and the statistics will be based on difference files generated by intermediate program femars and stored on ecfs in ec:/$uid/harmonie/$exp/femars (software to generate binary GRIB files of forecasts differences after each cycle). This will mean that if you start by running a one month experiment for January the structure functions generated when you reach DTGEND will be for January. When you use the same experiment name and launch also an experiment for July you will when you reach DTGEND have background error statistics based on both January and July differences files (since both of those are now found in ec:/$uid/harmonie/$exp/femars). These combined winter/summer background error statistics files from phase one are final product from step 1 and can are the intermediate background error statistics files to plug into the HARMONIE-AROME data assimilation of step 2. It should be mentioned that there is a possibility for the more advanced user to run festat off-line and with any combinations of January-July forecast difference files from ec:/$uid/harmonie/$exp/femars. That will be described in ore detail further below and is something you might want to do with forecasts difference files generated from step 2 to produce monthly background error statistics files by combining in different ways.

In step 2 we run again two one-month ensemble experiments for the same Januar and July months again utilizing ECMWF EDA forecasts as lateral boundary conditions. Again you use 4 ensemble members. The important difference as compared to step 1 is that you now carry out ensemble-data assimilation also within the HARMONIE-AROME framework. You use the background error statistics from phase 1 and do the eda within a data assimilation cycle. This has the important advantage that you significantly reduce spinup caused by the HARMONIE-AROME model adjustments to ECMWF EDA starting initial states. Because of this we can in step 2 derive the statistics from +3h forecast difference (rather than +6 that is used in step 1).

Note that there are methods to circumvent step1 and to technically run 3/4DVAR using structure functions derived from another HARMONIE model domain. Such existing methods include aspects such as horizontal truncation or extrapolation of horizontal spectra and possibly vertical interpolation in between vertical level geometries. Since the recommended procedure is to use the two stp approach described above these alternative methods are not described in detail. Furthermore it should be noted that there are background error covariance related tuning coefficients REDNMC and REDZONE. Settings of values of these ae not covered here. If you have a new domain you will use the default value 0.6 for REDNMC and 100 for REDZONE which are considered apropiate values for the derivation of structure functions. If you re-derive your statistics for an existing domain you will use the REDNMC and REDZONE values as assigned in scr/include.ass.

There are various existing tools for investigating your newly derived structure functions and at the end of this page there are some documentation of existing tools and how to use them.

The procedure for generating structure functions from an ensemble of forecasts is described below for a AROME setup with 2.5 km horizontal resolution and 65 vertical levels. The experiment is run for a one mont winter-period of followed by a one month summer-period on the ECMWF computing system. Forecast differences are derived twice a day (00 forecasts from 12 UTC) from combinations of the four ensemble members. Besides the scientific recommendation to cover many different weather situations there is as well a matemathical constraint that the number of forecast difference files provided to festat needs to be larger than the number of vertical levels used in the forecast model integration. In the section below detailed instructions on how to generate the structure functions are given. The other sections deals with how to diagnose the structure functions recent and ongoing work and future development plans.

It is recommended for future coming enhancements regarding handling of B statistics and diagnostics of it to save all generated forecast difference files as well as stabal.cv, stabal.cvt and stabal.bal and generated .xy and .y files (.cvt .xy and .y for diagnotical puroposes):

Generating background error statistics (using 43h2.2)

The following instructions are valid for trunk and any 43h2.2 tags that have been created. These instructions will only work at ECMWF. If you do have a new domain (or are not sure) you should follow that route in step 1 below. New domain creation is described in ModelDomain which links to the useful Domain Creation Tool

STEP 1 Downscaling

  1. Create a new experiment on ECMWF:

    In case you do have an existing domain setup do:

    mkdir -p $HOME/hm_home/jbdownexp
    cd $HOME/hm_home/jbdownexp
    ~hlam/Harmonie setup -c JBDSC -r ~hlam/harmonie_release/git/tags/harmonie-43h2.2.1 -d DOMAIN # where domain is the    name of your domain

    In case you are creating structure functions for a new domain (or you are not sure):

    mkdir -p $HOME/hm_home/jbdownexp
    cd $HOME/hm_home/jbdownexp
    ~hlam/Harmonie setup -c JBDSC -r ~hlam/harmonie_release/git/tags/harmonie-43h2.2.1
    ~hlam/Harmonie co scr/Harmonie_domains.pm

    Then edit scr/Harmonie_domains.pm and add your new domain definition.

  2. The ensemble that will be used to generate the structure functions needs to be defined in suites/harmonie.pm. An edited ensemble configuration file should define a four member ensemble that only varies the boundary memeber input (ENSBDMBR) as follows:

    %env = (
    #   'ANAATMO'  => { 0 => '3DVAR' },
    #   'HWRITUPTIMES' => { 0 => '00-21:3,24-60:6' },
    #   'SWRITUPTIMES' => { 0 => '00-06:3' },
    #   'HH_LIST' => { 0 => '00-21:3' },
    #   'LL_LIST' => { 0 => '36,3' },
    #   'LSMIXBC'  => { 0 => 'no' },
    #   'ANASURF'  => { 0 => 'CANARI_OI_MAIN' },
       'ENSCTL'   => [ '001', '002', '003', '004'],
    #   'OBSMONITOR' => [ 'obstat'],
    # SLAFLAG: Forecast length to pick your perturbation end point from
    # SLAFDIFF: Hours difference to pick your perturbation start point from
    # SLAFLAG=24, SLAFDIFF=6 will use +24 - +18
    # SLAFDIFF=SLAFLAG will retain the original SLAF construction
    # SLAFK should be tuned so that all members have the same perturbation size
       'ENSBDMBR' => [ 1,2,3,4],
    #   'SLAFLAG'  => [    0,    6,     6,    12,    12,  18,     18,   24,    24,    30,    30],
    #   'SLAFDIFF' => [    0,    6,     6,     6,     6,    6,     6,    6,     6,     6,     6],
    #   'SLAFK'    => ['0.0','1.75','-1.75','1.5','-1.5','1.2','-1.2','1.0','-1.0','0.9','-0.9'],
    # When using ECMWF ENS the members should be defined
    #   # 'ENSBDMBR' => [ 0, 1..10],
    
    ### Normally NO NEED to change the settings below
  3. Run for two one-month (30 day) periods:

    cd $HOME/hm_home/jbdownexp
    ~hlam/Harmonie start DTG=2016060100 DTGEND=2016070100
    #
    #~hlam/Harmonie start DTG=2017010100 DTGEND=2017013100
  4. Generate the statistics using festat standalone:

    • Place yourself at $TEMP on ECMWF

    • Copy Festat.standalone to $TEMP on ECMWF

    • Edit the script to reflect your user and experiment details (in particular copy femars data ec:/$uid/harmonie /jbdownexp/femars/ to your femars-directory on $TEMP)

      submit with

      qsub ./Festat.standalone

      you will get a log-file festat.log on $TEMP and results in directory festat_wrk. when the program has finished do:

      cd festat_wrk
      emkdir ec:/$uid/jbdata
      gzip stab_your_exp.cv
      gzip stab_your_exp.bal
      ecp stab_your_exp.cv.gz ec:/$uid/jbdata/.  (with your own filename and directory)
      ecp stab_your_exp.bal.gz ec:/$uid/jbdata/. (with your own filename and directory)

    (also create a tar.file with all *.xy *.y *.cv, *.bal and *.cvt and put on ecfs for future diagnostical purposes)

STEP 2 Generating background error statistics with EDA cycling (instructions under testing)

  1. Create a new experiment on ECMWF:

    In case you do have an existing domain setup do:

    mkdir -p $HOME/hm_home/jbedaexp
    cd $HOME/hm_home/jbedaexp
    ~hlam/Harmonie setup -c AROME_JBEDA -r ~hlam/harmonie_release/git/tags/harmonie-43h2.2.1 -d DOMAIN # where domain    is the name of your domain

    In case you are creating structure functions for a new domain (or you are not sure):

    mkdir -p $HOME/hm_home/jbedanexp
    cd $HOME/hm_home/jbedaexp
    ~hlam/Harmonie setup -c AROME_JBEDA -r ~hlam/harmonie_release/git/tags/harmonie-43h2.2.1
    ~hlam/Harmonie co scr/Harmonie_domains.pm

    Then edit scr/Harmonie_domains.pm and add your new domain definition.

  2. The ensemble that will be used to generate the structure functions needs to be defined in suites/harmonie.pm. An edited ensemble configuration file should define a four member ensemble that only varies the boundary memeber input (ENSBDMBR) as follows:

    %env = (
    #   'ANAATMO'  => { 0 => '3DVAR' },
    #   'HWRITUPTIMES' => { 0 => '00-21:3,24-60:6' },
    #   'SWRITUPTIMES' => { 0 => '00-06:3' },
    #   'HH_LIST' => { 0 => '00-21:3' },
    #   'LL_LIST' => { 0 => '36,3' },
    #   'LSMIXBC'  => { 0 => 'no' },
    #   'ANASURF'  => { 0 => 'CANARI_OI_MAIN' },
       'ENSCTL'   => [ '001', '002', '003', '004'],
    #   'OBSMONITOR' => [ 'obstat'],   
    # SLAFLAG: Forecast length to pick your perturbation end point from
    # SLAFDIFF: Hours difference to pick your perturbation start point from
    # SLAFLAG=24, SLAFDIFF=6 will use +24 - +18
    # SLAFDIFF=SLAFLAG will retain the original SLAF construction
    # SLAFK should be tuned so that all members have the same perturbation size
       'ENSBDMBR' => [ 1,2,3,4],
    #   'SLAFLAG'  => [    0,    6,     6,    12,    12,  18,     18,   24,    24,    30,    30],
    #   'SLAFDIFF' => [    0,    6,     6,     6,     6,    6,     6,    6,     6,     6,     6],
    #   'SLAFK'    => ['0.0','1.75','-1.75','1.5','-1.5','1.2','-1.2','1.0','-1.0','0.9','-0.9'],
    # When using ECMWF ENS the members should be defined
    #   # 'ENSBDMBR' => [ 0, 1..10],
    
    ### Normally NO NEED to change the settings below
  3. Link to your newly generated Jb statistics from STEP1 :

    Edit in $HOME/hm_home/jbedaexp/scr/include.ass as follows (example for DOMAIN=METCOOP25D): In the section for your relevant domain point to the structure function stored in STEP one as follows:

     elif [ "$DOMAIN" = YOUR DOMAIN]; then
       JBDIR=${JBDIR-"ec:/hirlam/harmonie_jbdata"}
       JBDIR=ec:/$uid/jbdata
       f_JBCV=stabfiltn_your_exp.cv_jbconv.cv (without .gz)
       f_JBBAL=stabfiltn_your_exp.bal_jbconv.bal (without.gz)
  4. Run for two one-month (30 day) periods:

    cd $HOME/hm_home/jbedaexp
    ~hlam/Harmonie start DTG=2016060100 DTGEND=2016070100
    #
    #~hlam/Harmonie start DTG=2017010100 DTGEND=2017013100
  5. Generate the statistics using festat standalone:

    • Place yourself at $TEMP on ECMWF
    • Copy Festat.standalone to $TEMP at ECMWF
    • Edit the script to reflect your user and experiment details (in particular copy femars data ec:/$uid/harmonie/jbdownexp/femars/ to femars-directory on $TEMP)
    • Make sure you have removed old femars_wrk directory and only have forecast differences from you EDA experiment in your femars directory As well preferably name files differently than in STEP 1 downscaling.
 submit with 

 ```bash
 qsub ./Festat.standalone
 ```
 
 you will get a log-file `festat.log` on `$TEMP` and results in directory `festat_wrk`
 when the program has finished do:
 
 ```bash
 cd festat_wrk
 emkdir ec:/smx/jbdata (with smx replaced with your own user id) 
 gzip stab_your_eda_exp.cv
 gzip stab_your_eda_exp.bal
 ecp stab_your_eda_exp.cv.gz ec:/smx/jbdata/.  (with your own filename and directory)
 ecp stab_your_eda_exp.bal.gz ec:/smx/jbdata/. (with your own filename and directory)
 ```

 also create a tar-file with all `*.xy`, `*.y`, `*.cv`, `*.bal` and `*.cvt` and put on ecfs for future diagnostical purposes) These new files are you final background error statistics to be diagnosed (compared with STEP 1 ones perhaps) and inserted to your data assimilation by modyfying `include.ass` (as in bullet 3 above) to point to your new files.

Diagnosis of background error statistics

  1. Diagnosis of background error statistics is a rather complicated task. To get an idea of what the correlations and covariances should look like take a look in the article: Berre, L., 2000: Estimation of synoptic and meso scale forecast error covariances in a limited area model. Mon. Wea. Rev., 128, 644-667. Software for investigating and graphically illustrate different aspects of the background error statistics has been developed and statistics generated for different domains has been investigated using the AccordDaTools package. With this software you can also compare your newly generated background error statistics with the one generated for other HARMONIE domains. This will give you and idea if your statistics seems reasonable. For diagnosing the newly derived background error statistics follow these instructions:

  2. Get the code and scripts:

    • Download and install AccordDaTools following instructions in the README
    • Don't forget to add the package tools directory to your PATH:
    • export PATH=/path/to/da_tools:$PATH
  3. Run Jb diagnostics script:

    • For example for a new domain using horizontal grid-spacing of 2500 m and (Harmonie) 65 vertical levels:
      jbdiagnose -b jb_data/stab_IRELAND25_064_480.bal -c jb_data/stab_IRELAND25_064_480.cv -g 2500 -l harmL65 -e jbdiag_IRELAND25_064
    • The output will be made written to jbdiag_IRELAND25_064
  1. The AccordDaTools package also provides two tools for plotting the data produced by jbdiagnose, plotjbbal and plotjbdiag. plotjbbal plots Jb balances for different parameters. plotjbdiag produces spectral density (spdens) and vertical correlation (vercor) diagnostic plots for your structure funtions. For example:

    • plotjbbal:

      plotjbbal -t stdv -p QQ -r jbdiag_ -e IRELAND25_064
    • plotjbdiag:

      plotjbdiag -l 50 -t vercor -p QQ -r jbdiag_ -e IRELAND25_064

Run 3DVAR/4DVAR with the new background error statistics

  1. create hm_home/jb_da. Then cd $HOME/hm_home/jb_da.

  2. create experiment by typing

    ~hlam/Harmonie setup -r ~hlam/harmonie_release/git/tags/harmonie-43h2.2.1
  3. In scr/include.ass set JBDIR=ec:/$uid/jbdata (uid being your userid, in this example 'ec:/smx/jbdata') and f_JBCV is name of your .cv file in ec:/$uid/jbdata (without .gz) and f_JBBAL is 'name of your .bal file in ec:/$uid/jbdata (without .gz) (in this example, f_JBCV=stab_METCOOPD_65_20200601_360.cv, stab_METCOOPD_65_20200601_360.bal). Add these three lines instead of the three lines in include.ass that follows right after the elif statement: elif [ "$DOMAIN" = METCOOP25D]; then. If domain is other than METCOOP25D one has to look for the alternative name of the domain.

  4. From $HOME/hm_home/jb_da launch experiment by typing

    ~hlam/Harmonie start DTG=2021010100 DTGEND=2021010103
  5. The resulting analysis file be found under $TEMP/hm_home/jb_da/archive/2021/01/01/03 and it will be called MXMIN1999+0000 and on and ec:/$uid/harmonie/2021/01/01/03. To diagnose the 3D-VAR analysis increments of the jb_da-experiment, copy the files MXMIN1999+0000 (analysis) and ICMSHHARM+0003 (fg) to $SCRATCH. The first guess (background) file can be found on $TEMP/hm_home/jb_da/archive/2021/01/01/00 and ec:/$uid/harmonie/jb_da/2021/01/01/00. Convert from FA-file format to GRIB with the gl-software ($SCRATCH/hm_home/jb_da/bin/gl) by typing ./gl -p MXMIN1999+0000 and ./gl -p ICMSHANAL+0000. Then plot the difference between files file with your favorite software. Plot horizontal and vertical cross-sections of temperature and other variables using your favourite software (epygram for example).

  6. Now you have managed to insert the newly generated background error statistics to the assimilation system and managed to carry out a full scale data assimilation system and plot the analysis increments. The next natural step to further diagnose the background error statistics is to carry out a single observation impact experiment, utilizing your newly generated background error statistics. Note the variables REDNMC and REDZONE in include.ass. REDNMC is the scaling factor for the background error statistics (default value 0.6/0.9) for METCOOP25D/NEW_DOMAIN). REDZONE described how far from the lateral boundaries (in km) the observations need to be located to be assimilated (default value 150/100) for METCOOP25D/NEW_DOMAIN.

In-line Interpolation and Extrapolation of Jb-statistics

In case you do not have existing background error statistics derived for your domain there is a built technical possibility to use Jb-files from another domain derived with the same number of vertical levels. From this host Jb-files background error statistics are then interpolated or extrapolated to the current domain configuration. The assumption is then (which is in general questionable) that the statistics derived derived on the host domain is as well valid for the current domain. If the longest side of the host domain is shorter than the longest side of the current domain an extrapolation of background error covariance spectra is needed. Such extrapolation should be avoided over a wide range of wavenumbers. Therefore it is recommended that the longest side of the host Jb-file is as long or longer than the longest side of the current domain.The interpolation is invoked by in ecf/config_exp.h set JB_INTERPOL=yeś and JB_REF_DOMAIN=$HOST_JB, where $HOST_JB is for example METCOOP25B. These settings will activate runnning of script jbconv.sh (in case no Jb files present for current domain), called from Fetch_assim_data.

On-going work & future developments

Recent and on-going work as well as plans for future developments:

  • Ongoing-work regarding structure functions concerns investigations of effects on B statistics and data assimilation of the upper-level relaxation towards ecmwf at upper boundary condition through LUNBC=.true. Longer term research is towards flow dependent background error statistics and close link between the data assimilation and the ensemble forecasting system. Plans for future work also include adopting towards use of cy46 Festat.standalone, reading FA-files rather than femars-files. Here is a newly developed stand-alone tool for interpolation in of Jb-statistics as well between different vertical levels (not recommended) not yet publicly available and documented. Finally it should be mentioned that there are alternative methods to EDA for carrying out STEP 2 of teh background error statistics derivation. Such alternatives are BRAND and BREND and these have been tested and compared with EDA in various contexts, such as in reanalysis frameworks. The conclusion is that there are both pros and cons with BRAND as compared with EDA. The main conclusion is that both EDAand BRAND are hampered by the homogeneity and isotrophy assumptions in 3DVAR/4DVAR framework, so that differences are smaller than in hybrid DA frameworks. Therefore continued EDA/BRAND comparisons are carried out withing hybrid ensemble/da frameworks. Nevertherless we aim here to include as well instructions for optionally replacing STEP 2 EDA in procedure above with STEP 2 BRAND. As well we aim for introducing instructions for using extended complementary diagnosis tools for Jb statistics using fediacov tool and associated plotting scripts. Such tools do exist, but not yet publicly available and documented

References