5.2.3.1. Grid-Stat: Compute Anomaly Correlation using Climatology

Run the Grid-Stat and Stat-Analysis tools to compute anomaly statistics. (GFS:GFS:NCEP:Grib)

Scientific Objective

To provide useful statistical information on the relationship between gridded observation data and a gridded forecast. These values can be used to help correct model deviations from observed values.
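
As a rough illustration of the kind of statistic this use case targets, the following is a minimal sketch of a centered anomaly correlation. It assumes plain Python lists of matched grid-point values and applies no grid weighting; the function name and sample numbers are hypothetical, and MET's own computation (from its anomaly partial sums, with the configured grid weighting) may differ in detail.

```python
import math


def anomaly_correlation(fcst, obs, climo):
    """Centered correlation between forecast and observed anomalies.

    Hypothetical helper for illustration only; not part of MET or METplus.
    """
    if not fcst or not (len(fcst) == len(obs) == len(climo)):
        raise ValueError("inputs must be non-empty and the same length")
    # Anomalies are departures from the climatological mean at each point.
    fa = [f - c for f, c in zip(fcst, climo)]
    oa = [o - c for o, c in zip(obs, climo)]
    n = len(fa)
    fa_bar = sum(fa) / n
    oa_bar = sum(oa) / n
    cov = sum((a - fa_bar) * (b - oa_bar) for a, b in zip(fa, oa))
    var_f = sum((a - fa_bar) ** 2 for a in fa)
    var_o = sum((b - oa_bar) ** 2 for b in oa)
    # Undefined (division by zero) if either anomaly field is constant.
    return cov / math.sqrt(var_f * var_o)


# Made-up temperatures in kelvin, used only to exercise the function.
climo = [280.0, 281.0, 282.0, 283.0]
obs = [281.0, 280.0, 284.0, 283.5]
fcst = [281.5, 280.5, 283.0, 284.0]
print(round(anomaly_correlation(fcst, obs, climo), 3))
```

A value near 1 means the forecast departs from climatology in the same places and direction as the analysis; a value near 0 means the anomalies are unrelated.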

Datasets

Forecast: GFS
Observation: GFS
Climatology: NCEP
Location: All of the input data required for this use case can be found in the sample data tarball. Click here to download: https://github.com/NCAR/METplus/releases/download/v3.0/sample_data-medium_range-3.0.tgz
This tarball should be unpacked into the directory that you will set as the value of INPUT_BASE. See the ‘Running METplus’ section for more information.
Data Source: Unknown

METplus Components

This use case utilizes the METplus GridStat wrapper to search for files that are valid at a given run time and generate a command to run the MET tool grid_stat if all required files are found. Then StatAnalysis is run on the GridStat output.

METplus Workflow

GridStat and StatAnalysis are the tools called in this example. The example processes the following run times:

Valid: 2017-06-13 0Z
Forecast lead: 24 hour
Valid: 2017-06-13 0Z
Forecast lead: 48 hour
Valid: 2017-06-13 6Z
Forecast lead: 24 hour
Valid: 2017-06-13 6Z
Forecast lead: 48 hour
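
The run times listed above follow from the VALID_BEG, VALID_END, VALID_INCREMENT, and LEAD_SEQ settings in this use case's configuration file. The sketch below is not METplus code; it is a small standalone illustration (the function name `run_times` is hypothetical) of how looping by valid time and lead yields four runs, each with an implied initialization time.

```python
from datetime import datetime, timedelta

# Values copied from the use case configuration file.
VALID_TIME_FMT = "%Y%m%d%H"
VALID_BEG = "2017061300"
VALID_END = "2017061306"
VALID_INCREMENT = 21600  # seconds
LEAD_SEQ = [24, 48]      # hours


def run_times(beg, end, increment_s, leads_h, fmt=VALID_TIME_FMT):
    """Return (valid, lead_hours, init) tuples for every run time."""
    valid = datetime.strptime(beg, fmt)
    last = datetime.strptime(end, fmt)
    out = []
    while valid <= last:
        for lead in leads_h:
            # Initialization time is the valid time minus the forecast lead.
            out.append((valid, lead, valid - timedelta(hours=lead)))
        valid += timedelta(seconds=increment_s)
    return out


for valid, lead, init in run_times(VALID_BEG, VALID_END, VALID_INCREMENT, LEAD_SEQ):
    print(f"valid={valid:%Y-%m-%d %HZ} lead={lead:03d}h init={init:%Y-%m-%d %HZ}")
```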

METplus Configuration

METplus first loads all of the configuration files found in parm/metplus_config, then it loads any configuration files passed to METplus via the command line with the -c option, i.e. -c parm/use_cases/model_applications/medium_range/GridStat_fcstGFS_obsGFS_climoNCEP_MultiField.conf
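
The practical consequence of this load order is that a value set in a later file replaces the same setting from an earlier one. The sketch below illustrates that with Python's standard `configparser`; it is not the METplus loader, and the two configuration snippets are hypothetical stand-ins for a file in parm/metplus_config and a file passed with -c.

```python
import configparser

# Hypothetical stand-in for a default file in parm/metplus_config.
DEFAULT_CONF = """
[dir]
OUTPUT_BASE = /path/to/default/output
[config]
LOOP_BY = INIT
"""

# Hypothetical stand-in for a use case file passed with -c.
USE_CASE_CONF = """
[config]
LOOP_BY = VALID
LEAD_SEQ = 24, 48
"""


def load_in_order(*conf_texts):
    """Read configuration texts in order; later values replace earlier ones."""
    # interpolation=None so that '%' in values such as %Y%m%d%H is left alone.
    parser = configparser.ConfigParser(interpolation=None)
    parser.optionxform = str  # keep option names in upper case
    for text in conf_texts:
        parser.read_string(text)
    return parser


conf = load_in_order(DEFAULT_CONF, USE_CASE_CONF)
print(conf["config"]["LOOP_BY"])   # set in both files
print(conf["dir"]["OUTPUT_BASE"])  # set only in the first file
```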

# Grid to Grid Anomaly Example

[config]
# List of applications to run
PROCESS_LIST = GridStat, StatAnalysis

# time looping - options are INIT, VALID, RETRO, and REALTIME
LOOP_BY = VALID

# Format of VALID_BEG and VALID_END
VALID_TIME_FMT = %Y%m%d%H

# Start time for METplus run
VALID_BEG = 2017061300

# End time for METplus run
VALID_END = 2017061306

# Increment between METplus runs in seconds. Must be >= 60
VALID_INCREMENT = 21600

# list of forecast leads to process
LEAD_SEQ = 24, 48

# Options are times, processes
# times = run all items in the PROCESS_LIST for a single initialization
# time, then repeat until all times have been evaluated.
# processes = run each item in the PROCESS_LIST for all times
#   specified, then repeat for the next item in the PROCESS_LIST.
LOOP_ORDER = times

# list of variables to compare
BOTH_VAR1_NAME = TMP
BOTH_VAR1_LEVELS = P850, P500, P250

BOTH_VAR2_NAME = UGRD
BOTH_VAR2_LEVELS = P850, P500, P250

BOTH_VAR3_NAME = VGRD
BOTH_VAR3_LEVELS = P850, P500, P250

BOTH_VAR4_NAME = PRMSL
BOTH_VAR4_LEVELS = Z0

# description of data to be processed
# used in output file path
MODEL = GFS
OBTYPE = ANLYS

# location of grid_stat MET config file
GRID_STAT_CONFIG_FILE = {CONFIG_DIR}/GridStatConfig_anom

GRID_STAT_OUTPUT_PREFIX = {MODEL}_vs_{OBTYPE}

# variables to describe format of forecast data
FCST_IS_PROB = false

# variables to describe format of observation data
#  none needed

# StatAnalysis configuration
MODEL1 = GFS
MODEL1_OBTYPE = ANLYS

# configuration file to use with StatAnalysis
STAT_ANALYSIS_CONFIG_FILE = {PARM_BASE}/met_config/STATAnalysisConfig_wrapped

# stat_analysis job info
STAT_ANALYSIS_JOB_NAME = filter

# if using -dump_row, put in JOBS_ARGS "-dump_row [dump_row_file]"
# if using -out_stat, put in JOBS_ARGS "-out_stat [out_stat_file]"
# METplus will fill in filename
STAT_ANALYSIS_JOB_ARGS = -dump_row [dump_row_file]

# Optional variables for further filtering
# can be blank, single, or multiple values
# if more than one use comma separated list
#
# (FCST)(OBS)_(VALID)(INIT)_HOUR_LIST: HH format (ex. 00, 06, 12)
# (FCST)(OBS)_LEAD_LIST: HH[H][MMSS] format (ex. 00, 06, 120)
MODEL_LIST = {MODEL1}
DESC_LIST =
FCST_LEAD_LIST =
OBS_LEAD_LIST =
FCST_VALID_HOUR_LIST = 00, 06
FCST_INIT_HOUR_LIST = 00, 06
OBS_VALID_HOUR_LIST =
OBS_INIT_HOUR_LIST =
FCST_VAR_LIST =
OBS_VAR_LIST =
FCST_UNITS_LIST =
OBS_UNITS_LIST =
FCST_LEVEL_LIST =
OBS_LEVEL_LIST =
VX_MASK_LIST =
INTERP_MTHD_LIST =
INTERP_PNTS_LIST =
FCST_THRESH_LIST =
OBS_THRESH_LIST =
COV_THRESH_LIST =
ALPHA_LIST =
LINE_TYPE_LIST =
# how to treat items listed in above _LIST variables
# GROUP_LIST_ITEMS: items listed in a given _LIST variable
#                   will be grouped together
# LOOP_LIST_ITEMS:  items listed in a given _LIST variable
#                   will be looped over
# if not listed METplus will treat the list as a group
GROUP_LIST_ITEMS = FCST_INIT_HOUR_LIST
LOOP_LIST_ITEMS = FCST_VALID_HOUR_LIST, MODEL_LIST


[dir]
# location of configuration files used by MET applications
CONFIG_DIR={PARM_BASE}/use_cases/model_applications/medium_range

# directory containing climatology data
GRID_STAT_CLIMO_MEAN_INPUT_DIR = {INPUT_BASE}/model_applications/medium_range/grid_to_grid/nwprod/fix

# input and output data directories
FCST_GRID_STAT_INPUT_DIR = {INPUT_BASE}/model_applications/medium_range/grid_to_grid/gfs/fcst
OBS_GRID_STAT_INPUT_DIR = {INPUT_BASE}/model_applications/medium_range/grid_to_grid/gfs/obs
GRID_STAT_OUTPUT_DIR = {OUTPUT_BASE}/met_out/{MODEL}/anom

# directory to look for input for StatAnalysis
MODEL1_STAT_ANALYSIS_LOOKIN_DIR = {OUTPUT_BASE}/met_out/{MODEL1}/anom/*/grid_stat

# Output data directory
STAT_ANALYSIS_OUTPUT_DIR = {OUTPUT_BASE}/gather_by_date/stat_analysis/grid2grid/anom

[filename_templates]
# format of filenames

# Climatology mean
GRID_STAT_CLIMO_MEAN_INPUT_TEMPLATE = cmean_1d.1959{valid?fmt=%m%d}

# GFS
FCST_GRID_STAT_INPUT_TEMPLATE = pgbf{lead?fmt=%.3H}.gfs.{init?fmt=%Y%m%d%H}

# ANLYS
OBS_GRID_STAT_INPUT_TEMPLATE = pgbanl.gfs.{valid?fmt=%Y%m%d%H}

GRID_STAT_OUTPUT_TEMPLATE = {valid?fmt=%Y%m%d%H%M}/grid_stat

# Optional settings to create templated directory and file name information
# used to name the stat_analysis output files; this is appended to STAT_ANALYSIS_OUTPUT_DIR
# if no template is provided a default filename set in the code will be used
# Use:
# string templates can be set for all the lists being looped over, just
# use a lower case version of the list, ex. {fcst_valid_hour?fmt=%H}
# or {fcst_var?fmt=%s}
# For looping over models:
# can set MODELn_STAT_ANALYSIS_[DUMP_ROW/OUT_STAT]_TEMPLATE for individual models
# or STAT_ANALYSIS_[DUMP_ROW/OUT_STAT] with {model?fmt=%s}
MODEL1_STAT_ANALYSIS_DUMP_ROW_TEMPLATE = {fcst_valid_hour?fmt=%H}Z/{MODEL1}/{MODEL1}_{valid?fmt=%Y%m%d}.stat
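
The filename templates in the [filename_templates] section above are filled in for each run time. The following is a simplified, hypothetical expander, not the METplus implementation. It handles only the `init`, `valid`, and `lead` tags used here, and it assumes that `%.3H` means the lead in hours zero-padded to three digits.

```python
import re
from datetime import datetime, timedelta

# Matches tags such as {init?fmt=%Y%m%d%H} or {lead?fmt=%.3H}.
TAG = re.compile(r"\{(init|valid|lead)\?fmt=([^}]+)\}")


def expand(template, valid, lead_hours):
    """Fill init, valid and lead tags for one run time (illustrative only)."""
    init = valid - timedelta(hours=lead_hours)

    def fill(match):
        name, fmt = match.groups()
        if name == "lead":
            # Assumption: only the zero-padded hour form %.<width>H is handled.
            width = int(re.fullmatch(r"%\.(\d+)H", fmt).group(1))
            return f"{lead_hours:0{width}d}"
        return (init if name == "init" else valid).strftime(fmt)

    return TAG.sub(fill, template)


valid = datetime(2017, 6, 13, 0)
print(expand("pgbf{lead?fmt=%.3H}.gfs.{init?fmt=%Y%m%d%H}", valid, 24))
print(expand("pgbanl.gfs.{valid?fmt=%Y%m%d%H}", valid, 24))
print(expand("cmean_1d.1959{valid?fmt=%m%d}", valid, 24))
```

Under those assumptions, the forecast template is keyed on the initialization time, while the analysis and climatology templates are keyed on the valid time.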

MET Configuration

METplus sets environment variables based on the values in the METplus configuration file. These variables are referenced in the MET configuration file. YOU SHOULD NOT SET ANY OF THESE ENVIRONMENT VARIABLES YOURSELF! THEY WILL BE OVERWRITTEN BY METPLUS WHEN IT CALLS THE MET TOOLS! If there is a setting in the MET configuration file that is not controlled by an environment variable, you can add additional environment variables to be set only within the METplus environment using the [user_env_vars] section of the METplus configuration files. See the ‘User Defined Config’ section on the ‘System Configuration’ page of the METplus User’s Guide for more information.
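
To picture how an environment variable reference in a MET configuration file resolves, the sketch below uses Python's `string.Template`, which happens to share the `${NAME}` syntax. This only mimics the effect; MET performs its own substitution when it reads the file, and the `env` dictionary here is a hypothetical stand-in for the environment that METplus sets.

```python
from string import Template

# Two lines taken from the Grid-Stat configuration file in this section.
MET_CONFIG_LINES = 'model = "${MODEL}";\nobtype = "${OBTYPE}";'

# Stand-in for the environment; the values come from MODEL and OBTYPE
# in the METplus configuration file for this use case.
env = {"MODEL": "GFS", "OBTYPE": "ANLYS"}

print(Template(MET_CONFIG_LINES).substitute(env))
```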

////////////////////////////////////////////////////////////////////////////////
//
// Grid-Stat configuration file.
//
// For additional information, see the MET_BASE/config/README file.
//
////////////////////////////////////////////////////////////////////////////////

//
// Output model name to be written
//
model = "${MODEL}";

//
// Output description to be written
// May be set separately in each "obs.field" entry
//
desc = "NA";

//
// Output observation type to be written
//
obtype = "${OBTYPE}";

////////////////////////////////////////////////////////////////////////////////

//
// Verification grid
//
regrid = {
   to_grid    = "G002";
   method     = BILIN;
   width      = 2;
   vld_thresh = 0.5;
   shape      = SQUARE;
}

////////////////////////////////////////////////////////////////////////////////

//
// May be set separately in each "field" entry
//
censor_thresh = [];
censor_val    = [];
cat_thresh    = [];
cnt_thresh    = [ NA ];
cnt_logic     = UNION;
wind_thresh   = [ NA ];
wind_logic    = UNION;
eclv_points   = 0.05;
nc_pairs_var_suffix = "";
nc_pairs_var_name = "";
rank_corr_flag   = FALSE;

//
// Forecast and observation fields to be verified
//
fcst = {
    field = [ ${FCST_FIELD} ];
    };

obs = {
    field = [ ${OBS_FIELD} ];
    };

////////////////////////////////////////////////////////////////////////////////

//
// Climatology data
//
climo_mean = fcst;
climo_mean = {
   file_name = [ ${CLIMO_MEAN_FILE} ];

   regrid = {
      method     = BILIN;
      width      = 2;
      vld_thresh = 0.5;
      shape      = SQUARE;
   }

   time_interp_method = NEAREST;
   match_month        = TRUE;
   match_day          = TRUE;
   time_step          = 21600;
}

climo_stdev = climo_mean;
climo_stdev = {
    file_name = [ ${CLIMO_STDEV_FILE} ];
}

climo_cdf_bins = 1;
write_cdf_bins = FALSE;

////////////////////////////////////////////////////////////////////////////////

//
// Verification masking regions
//
mask = {
   grid = [ "FULL" ];
   poly = [ "${INPUT_BASE}/model_applications/medium_range/poly/NHX.nc",
            "${INPUT_BASE}/model_applications/medium_range/poly/SHX.nc",
            "${INPUT_BASE}/model_applications/medium_range/poly/TRO.nc",
            "${INPUT_BASE}/model_applications/medium_range/poly/PNA.nc" ];
}

////////////////////////////////////////////////////////////////////////////////

//
// Confidence interval settings
//
ci_alpha  = [ 0.05 ];

boot = {
   interval = PCTILE;
   rep_prop = 1.0;
   n_rep    = 0;
   rng      = "mt19937";
   seed     = "";
}

////////////////////////////////////////////////////////////////////////////////

//
// Data smoothing methods
//
interp = {
   field      = BOTH;
   vld_thresh = 1.0;
   shape      = SQUARE;

   type = [
      {
         method = NEAREST;
         width  = 1;
      }
   ];
}

////////////////////////////////////////////////////////////////////////////////

//
// Neighborhood methods
//
nbrhd = {
   field      = BOTH;
   width      = [ 1 ];
   cov_thresh = [ >=0.5 ];
   vld_thresh = 1.0;
   shape      = SQUARE;
}

////////////////////////////////////////////////////////////////////////////////

//
// Fourier decomposition
//
fourier = {
   wave_1d_beg = [];
   wave_1d_end = [];
}

////////////////////////////////////////////////////////////////////////////////

//
// Gradient statistics
// May be set separately in each "obs.field" entry
//
gradient = {
   dx = [ 1 ];
   dy = [ 1 ];
}

////////////////////////////////////////////////////////////////////////////////

//
// Statistical output types
//
output_flag = {
   fho    = NONE;
   ctc    = NONE;
   cts    = NONE;
   mctc   = NONE;
   mcts   = NONE;
   cnt    = NONE;
   sl1l2  = NONE;
   sal1l2 = STAT;
   vl1l2  = NONE;
   val1l2 = STAT;
   vcnt   = NONE;
   pct    = NONE;
   pstd   = NONE;
   pjc    = NONE;
   prc    = NONE;
   eclv   = NONE;
   nbrctc = NONE;
   nbrcts = NONE;
   nbrcnt = NONE;
   grad   = NONE;
}

//
// NetCDF matched pairs output file
//
nc_pairs_flag   = {
   latlon     = FALSE;
   raw        = FALSE;
   diff       = FALSE;
   climo      = FALSE;
   weight     = FALSE;
   nbrhd      = FALSE;
   fourier    = FALSE;
   gradient   = FALSE;
   apply_mask = FALSE;
}

////////////////////////////////////////////////////////////////////////////////

grid_weight_flag = COS_LAT;
tmp_dir          = "/tmp";
output_prefix    = "${OUTPUT_PREFIX}";
//version		 = "V9.0";

////////////////////////////////////////////////////////////////////////////////
////////////////////////////////////////////////////////////////////////////////
//
// STAT-Analysis configuration file.
//
// For additional information, see the MET_BASE/config/README file.
//
////////////////////////////////////////////////////////////////////////////////

//
// Filtering input STAT lines by the contents of each column
//
model = [${MODEL}];
desc  = [${DESC}];

fcst_lead = [${FCST_LEAD}];
obs_lead  = [${OBS_LEAD}];

fcst_valid_beg  = "${FCST_VALID_BEG}";
fcst_valid_end  = "${FCST_VALID_END}";
fcst_valid_hour = [${FCST_VALID_HOUR}];

obs_valid_beg   = "${OBS_VALID_BEG}";
obs_valid_end   = "${OBS_VALID_END}";
obs_valid_hour  = [${OBS_VALID_HOUR}];

fcst_init_beg   = "${FCST_INIT_BEG}";
fcst_init_end   = "${FCST_INIT_END}";
fcst_init_hour  = [${FCST_INIT_HOUR}];

obs_init_beg    = "${OBS_INIT_BEG}";
obs_init_end    = "${OBS_INIT_END}";
obs_init_hour   = [${OBS_INIT_HOUR}];

fcst_var = [${FCST_VAR}];
obs_var  = [${OBS_VAR}];

fcst_units = [${FCST_UNITS}];
obs_units  = [${OBS_UNITS}];

fcst_lev = [${FCST_LEVEL}];
obs_lev  = [${OBS_LEVEL}];

obtype = [${OBTYPE}];

vx_mask = [${VX_MASK}];

interp_mthd = [${INTERP_MTHD}];

interp_pnts = [${INTERP_PNTS}];

fcst_thresh = [${FCST_THRESH}];
obs_thresh  = [${OBS_THRESH}];
cov_thresh  = [${COV_THRESH}];

alpha = [${ALPHA}];

line_type = [${LINE_TYPE}];

column = [];

weight = [];

////////////////////////////////////////////////////////////////////////////////

//
// Array of STAT-Analysis jobs to be performed on the filtered data
//
jobs = [
   "${JOB}"
   ];

////////////////////////////////////////////////////////////////////////////////

//
// Confidence interval settings
//
out_alpha = 0.05;

boot = {
   interval = PCTILE;
   rep_prop = 1.0;
   n_rep    = 0;
   rng      = "mt19937";
   seed     = "";
}

////////////////////////////////////////////////////////////////////////////////

//
// WMO mean computation logic
//
wmo_sqrt_stats   = [ "CNT:FSTDEV",  "CNT:OSTDEV",  "CNT:ESTDEV",
                     "CNT:RMSE",    "CNT:RMSFA",   "CNT:RMSOA",
                     "VCNT:FS_RMS", "VCNT:OS_RMS", "VCNT:RMSVE",
                     "VCNT:FSTDEV", "VCNT:OSTDEV" ];

wmo_fisher_stats = [ "CNT:PR_CORR", "CNT:SP_CORR",
                     "CNT:KT_CORR", "CNT:ANOM_CORR" ];

////////////////////////////////////////////////////////////////////////////////

rank_corr_flag = FALSE;
vif_flag       = FALSE;
tmp_dir        = "/tmp";
version        = "V9.0";

Note that the following variables are referenced in the MET configuration file.

  • ${MODEL} - Name of forecast input. Corresponds to MODEL in the METplus configuration file.

  • ${OBTYPE} - Name of observation input. Corresponds to OBTYPE in the METplus configuration file.

  • ${FCST_FIELD} - Formatted forecast field information. Generated from [FCST/BOTH]_VAR<n>_[NAME/LEVEL/THRESH/OPTIONS] in the METplus configuration file.

  • ${OBS_FIELD} - Formatted observation field information. Generated from [OBS/BOTH]_VAR<n>_[NAME/LEVEL/THRESH/OPTIONS] in the METplus configuration file.

  • ${REGRID_TO_GRID} - Grid to remap data. Corresponds to GRID_STAT_REGRID_TO_GRID in the METplus configuration file.

  • ${VERIF_MASK} - Optional verification mask file or list of files. Corresponds to GRID_STAT_VERIFICATION_MASK_TEMPLATE in the METplus configuration file.

  • ${CLIMO_MEAN_FILE} - Optional path to climatology mean file. Corresponds to GRID_STAT_CLIMO_MEAN_INPUT_[DIR/TEMPLATE] in the METplus configuration file.

  • ${CLIMO_STDEV_FILE} - Optional path to climatology standard deviation file. Corresponds to GRID_STAT_CLIMO_STDEV_INPUT_[DIR/TEMPLATE] in the METplus configuration file.

  • ${NEIGHBORHOOD_SHAPE} - Shape of the neighborhood method applied. Corresponds to GRID_STAT_NEIGHBORHOOD_SHAPE in the METplus configuration file. Default value is SQUARE if not set.

  • ${NEIGHBORHOOD_WIDTH} - Width of the neighborhood method applied. Corresponds to GRID_STAT_NEIGHBORHOOD_WIDTH in the METplus configuration file. Default value is 1 if not set.
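
As an illustration of how a field variable such as ${FCST_FIELD} could be assembled from the BOTH_VAR<n>_NAME and BOTH_VAR<n>_LEVELS settings, the sketch below builds one dictionary entry per name and level pair. The helper `format_fields` is hypothetical, and the exact text METplus generates is an assumption here; the point is only that four variables with their levels become ten field entries.

```python
def format_fields(var_list):
    """Build a MET-style field list from (name, levels) pairs.

    Illustrative only; the exact formatting used by METplus is assumed.
    """
    entries = []
    for name, levels in var_list:
        for level in levels:
            entries.append(f'{{ name="{name}"; level="{level}"; }}')
    return ", ".join(entries)


# Names and levels copied from the BOTH_VAR<n> settings of this use case.
BOTH_VARS = [
    ("TMP", ["P850", "P500", "P250"]),
    ("UGRD", ["P850", "P500", "P250"]),
    ("VGRD", ["P850", "P500", "P250"]),
    ("PRMSL", ["Z0"]),
]

fcst_field = format_fields(BOTH_VARS)
print(fcst_field.count("name="))
print(fcst_field.split(", ")[0])
```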

TODO: Add StatAnalysis environment variables

Running METplus

This use case can be run two ways:

  1. Passing in GridStat_fcstGFS_obsGFS_climoNCEP_MultiField.conf then a user-specific system configuration file:

    master_metplus.py -c /path/to/METplus/parm/use_cases/model_applications/medium_range/GridStat_fcstGFS_obsGFS_climoNCEP_MultiField.conf -c /path/to/user_system.conf
    
  2. Modifying the configurations in parm/metplus_config, then passing in GridStat_fcstGFS_obsGFS_climoNCEP_MultiField.conf:

    master_metplus.py -c /path/to/METplus/parm/use_cases/model_applications/medium_range/GridStat_fcstGFS_obsGFS_climoNCEP_MultiField.conf
    

The former method is recommended. Whether you add them to a user-specific configuration file or modify the metplus_config files, the following variables must be set correctly:

  • INPUT_BASE - Path to the directory where sample data tarballs are unpacked (see the Datasets section to obtain tarballs). This is not required to run METplus, but it is required to run the examples in parm/use_cases.

  • OUTPUT_BASE - Path where METplus output will be written. This must be in a location where you have write permissions.

  • MET_INSTALL_DIR - Path to the location where MET is installed locally.

Example User Configuration File:

[dir]
INPUT_BASE = /path/to/sample/input/data
OUTPUT_BASE = /path/to/output/dir
MET_INSTALL_DIR = /path/to/met-X.Y

NOTE: All of these items must be found under the [dir] section.
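
Before launching a run, it can help to confirm that a user configuration file actually defines these three settings under [dir]. The following is a small standalone check written with Python's `configparser`; the function name `missing_dir_settings` is hypothetical and this is not a METplus utility.

```python
import configparser

REQUIRED = ("INPUT_BASE", "OUTPUT_BASE", "MET_INSTALL_DIR")


def missing_dir_settings(conf_text):
    """Return required names that are absent or empty in the [dir] section."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.optionxform = str  # keep option names in upper case
    parser.read_string(conf_text)
    if not parser.has_section("dir"):
        return list(REQUIRED)
    return [name for name in REQUIRED if not parser["dir"].get(name, "").strip()]


# Same content as the example user configuration file above.
USER_CONF = """
[dir]
INPUT_BASE = /path/to/sample/input/data
OUTPUT_BASE = /path/to/output/dir
MET_INSTALL_DIR = /path/to/met-X.Y
"""

print(missing_dir_settings(USER_CONF))
```

An empty list means all three settings are present; any names returned are either missing or were placed in a section other than [dir].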

Expected Output

A successful run will output the following both to the screen and to the logfile:

INFO: METplus has successfully finished running.

Refer to the value set for OUTPUT_BASE to find where the output data was generated. Output for this use case will be found in gather_by_date/stat_analysis/grid2grid/anom (relative to OUTPUT_BASE) and will contain the following files:

  • 00Z/GFS/GFS_20170613.stat

  • 06Z/GFS/GFS_20170613.stat
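
The two files correspond to the items StatAnalysis loops over in this use case: FCST_VALID_HOUR_LIST (00, 06) and MODEL_LIST (GFS). The sketch below, a hypothetical helper rather than part of METplus, rebuilds those paths from the dump row template pattern and reports whether each file exists under a given OUTPUT_BASE.

```python
from itertools import product
from pathlib import Path

# Values copied from the use case configuration file.
FCST_VALID_HOUR_LIST = ["00", "06"]
MODEL_LIST = ["GFS"]
VALID_DATE = "20170613"
SUBDIR = "gather_by_date/stat_analysis/grid2grid/anom"


def expected_files(output_base):
    """Paths following the <hour>Z/<model>/<model>_<date>.stat pattern."""
    base = Path(output_base) / SUBDIR
    return [
        base / f"{hour}Z" / model / f"{model}_{VALID_DATE}.stat"
        for hour, model in product(FCST_VALID_HOUR_LIST, MODEL_LIST)
    ]


# Replace the placeholder path with the value set for OUTPUT_BASE.
for path in expected_files("/path/to/output/dir"):
    print(path, "found" if path.is_file() else "missing")
```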