GridStat: Read Zarr Files and Compute Statistics

model_applications/short_range/GridStat_fcstHRRRCast_obsHRRRanal_zarr.conf

Scientific Objective

This use case demonstrates how to read Zarr format data from the HRRRCast model. Four fields (2 m temperature, reflectivity, and the 200 mb U and V wind components) are read into the Grid-Stat tool, and continuous, categorical, and vector statistics are computed. The purpose of this use case is to show how Zarr format data from Artificial Intelligence Weather Prediction models can be used with METplus.

Version Added

METplus version 13.0

Datasets

Forecast: HRRRCast

Observation: HRRR Analysis

Climatology: None

Location: All of the input data required for this use case can be found in a sample data tarball. Each use case category has one or more sample data tarballs. It is only necessary to download the tarball containing this use case's dataset, not the entire collection of sample data. Sample data for the appropriate release can be downloaded from the METplus releases page (https://github.com/dtcenter/METplus/releases). Unpack the tarball into the directory that INPUT_BASE will be set to. See the Running METplus section for more information.

METplus Components

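This use case runs only the GridStat wrapper, which calls Grid-Stat once for each combination of initialization time and forecast lead.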

METplus Workflow

Beginning time (INIT_BEG): 2024-05-02 00 UTC

End time (INIT_END): 2024-05-02 01 UTC

Increment between beginning and end times (INIT_INCREMENT): 1 hour

Sequence of forecast leads to process (LEAD_SEQ): 1 - 9 using 1 hour increments, and 12 - 18 using 3 hour increments

Starting with the 00 UTC initialization on 2024-05-02, 2 model initializations are processed ending with the run initialized at 01 UTC on 2024-05-02. For each initialization, 12 lead times are processed, for a total of 24 Grid-Stat runs.
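
The run count above can be checked with a short sketch. It is not part of METplus. It only enumerates the initialization and lead time combinations from the timing values listed above, assuming initializations are looped first and leads second. The helper name `enumerate_runs` is made up for illustration.

```python
from datetime import datetime, timedelta

# Timing values taken from the workflow description above.
INIT_BEG = datetime(2024, 5, 2, 0)
INIT_END = datetime(2024, 5, 2, 1)
INIT_INCREMENT = timedelta(hours=1)
LEAD_SEQ = [1, 2, 3, 4, 5, 6, 7, 8, 9, 12, 15, 18]


def enumerate_runs(init_beg, init_end, increment, leads):
    """Return (init, lead_hours, valid) tuples, one per Grid-Stat run."""
    runs = []
    init = init_beg
    while init <= init_end:
        for lead in leads:
            runs.append((init, lead, init + timedelta(hours=lead)))
        init += increment
    return runs


runs = enumerate_runs(INIT_BEG, INIT_END, INIT_INCREMENT, LEAD_SEQ)
print(len(runs))  # 24
```

The last tuple corresponds to the 01 UTC initialization with an 18 hour lead, valid at 19 UTC on 2024-05-02.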

METplus Configuration

METplus first loads all of the configuration files found in parm/metplus_config. It then loads any configuration files passed to METplus on the command line, e.g. parm/use_cases/model_applications/short_range/GridStat_fcstHRRRCast_obsHRRRanal_zarr.conf

[config]

# Documentation for this use case can be found at
# https://metplus.readthedocs.io/en/latest/generated/model_applications/short_range/GridStat_fcstHRRRCast_obsHRRRanal_zarr.html

# For additional information, please see the METplus Users Guide.
# https://metplus.readthedocs.io/en/latest/Users_Guide

###
# Processes to run
# https://metplus.readthedocs.io/en/latest/Users_Guide/systemconfiguration.html#process-list
###

PROCESS_LIST = GridStat


###
# Time Info
# LOOP_BY options are INIT, VALID, RETRO, and REALTIME
# If set to INIT or RETRO:
#   INIT_TIME_FMT, INIT_BEG, INIT_END, and INIT_INCREMENT must also be set
# If set to VALID or REALTIME:
#   VALID_TIME_FMT, VALID_BEG, VALID_END, and VALID_INCREMENT must also be set
# LEAD_SEQ is the list of forecast leads to process
# https://metplus.readthedocs.io/en/latest/Users_Guide/systemconfiguration.html#timing-control
###

LOOP_BY = INIT
INIT_TIME_FMT = %Y%m%d%H
INIT_BEG = 2024050200
INIT_END = 2024050201
INIT_INCREMENT = 1H

LEAD_SEQ = 1, 2, 3, 4, 5, 6, 7, 8, 9, 12, 15, 18


###
# File I/O
# https://metplus.readthedocs.io/en/latest/Users_Guide/systemconfiguration.html#directory-and-filename-template-info
###


FCST_GRID_STAT_INPUT_DIR = {INPUT_BASE}/model_applications/short_range/GridStat_fcstHRRRCast_obsHRRRanal_zarr/data-10days/reshrrr_predictions_mem0.zarr
FCST_GRID_STAT_INPUT_TEMPLATE = PYTHON_NUMPY

OBS_GRID_STAT_INPUT_DIR = {INPUT_BASE}/model_applications/short_range/GridStat_fcstHRRRCast_obsHRRRanal_zarr/HRRR_obs
OBS_GRID_STAT_INPUT_TEMPLATE = {valid?fmt=%Y%m%d}/hrrr.t{valid?fmt=%H}z.wrfprsf00.grib2


GRID_STAT_OUTPUT_DIR = {OUTPUT_BASE}/model_applications/short_range/GridStat_fcstHRRRCast_obsHRRRanal_zarr/grid_stat
GRID_STAT_OUTPUT_TEMPLATE = {init?fmt=%Y%m%d%H}


###
# Field Info
# https://metplus.readthedocs.io/en/latest/Users_Guide/systemconfiguration.html#field-info
###

MODEL = HRRRCast
OBTYPE = HRRR

FCST_GRID_STAT_VAR1_NAME = {PARM_BASE}/use_cases/model_applications/short_range/GridStat_fcstHRRRCast_obsHRRRanal_zarr/read_zarr_HRRRCast.py {FCST_GRID_STAT_INPUT_DIR} {init?fmt=%Y%m%d_%H%M%S} {lead?fmt=%HH} T2M Z2

FCST_GRID_STAT_VAR2_NAME = {PARM_BASE}/use_cases/model_applications/short_range/GridStat_fcstHRRRCast_obsHRRRanal_zarr/read_zarr_HRRRCast.py {FCST_GRID_STAT_INPUT_DIR} {init?fmt=%Y%m%d_%H%M%S} {lead?fmt=%HH} REFC L0
FCST_GRID_STAT_VAR2_THRESH = >=20, >=30, >=40

FCST_GRID_STAT_VAR3_NAME = {PARM_BASE}/use_cases/model_applications/short_range/GridStat_fcstHRRRCast_obsHRRRanal_zarr/read_zarr_HRRRCast.py {FCST_GRID_STAT_INPUT_DIR} {init?fmt=%Y%m%d_%H%M%S} {lead?fmt=%HH} UGRD P200
FCST_GRID_STAT_VAR3_OPTIONS = is_u_wind = TRUE;

FCST_GRID_STAT_VAR4_NAME = {PARM_BASE}/use_cases/model_applications/short_range/GridStat_fcstHRRRCast_obsHRRRanal_zarr/read_zarr_HRRRCast.py {FCST_GRID_STAT_INPUT_DIR} {init?fmt=%Y%m%d_%H%M%S} {lead?fmt=%HH} VGRD P200
FCST_GRID_STAT_VAR4_OPTIONS = is_v_wind = TRUE;

FCST_IS_PROB = false

OBS_GRID_STAT_VAR1_NAME = TMP
OBS_GRID_STAT_VAR1_LEVELS = Z2

OBS_GRID_STAT_VAR2_NAME = REFC
OBS_GRID_STAT_VAR2_LEVELS = L0
OBS_GRID_STAT_VAR2_THRESH = >=20, >=30, >=40

OBS_GRID_STAT_VAR3_NAME = UGRD
OBS_GRID_STAT_VAR3_LEVELS = P200
OBS_GRID_STAT_VAR3_OPTIONS = is_u_wind = TRUE;

OBS_GRID_STAT_VAR4_NAME = VGRD
OBS_GRID_STAT_VAR4_LEVELS = P200
OBS_GRID_STAT_VAR4_OPTIONS = is_v_wind = TRUE;


GRID_STAT_ONCE_PER_FIELD = False


###
# GridStat Settings (optional)
# https://metplus.readthedocs.io/en/latest/Users_Guide/wrappers.html#gridstat
###

#LOG_GRID_STAT_VERBOSITY = 2

GRID_STAT_CONFIG_FILE = {PARM_BASE}/met_config/GridStatConfig_wrapped

GRID_STAT_REGRID_TO_GRID = FCST

GRID_STAT_DESC = NA

GRID_STAT_NEIGHBORHOOD_WIDTH = 1
GRID_STAT_NEIGHBORHOOD_SHAPE = SQUARE

GRID_STAT_NEIGHBORHOOD_COV_THRESH = >=0.5

GRID_STAT_OUTPUT_PREFIX = {MODEL}_vs_{OBTYPE}

GRID_STAT_OUTPUT_FLAG_CTC = STAT
GRID_STAT_OUTPUT_FLAG_CTS = STAT
GRID_STAT_OUTPUT_FLAG_SL1L2 = STAT
GRID_STAT_OUTPUT_FLAG_CNT = STAT
GRID_STAT_OUTPUT_FLAG_VL1L2 = STAT
GRID_STAT_OUTPUT_FLAG_VCNT = STAT

GRID_STAT_NC_PAIRS_FLAG_LATLON = FALSE
GRID_STAT_NC_PAIRS_FLAG_RAW = FALSE
GRID_STAT_NC_PAIRS_FLAG_DIFF = FALSE
GRID_STAT_NC_PAIRS_FLAG_CLIMO = FALSE
GRID_STAT_NC_PAIRS_FLAG_APPLY_MASK = FALSE

GRID_STAT_VERIFICATION_MASK_TEMPLATE =
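
As a rough illustration of how the time tags in OBS_GRID_STAT_INPUT_TEMPLATE resolve to a file path, the sketch below expands `{name?fmt=...}` tags with Python's `strftime`. This is not the METplus template engine, which supports more features, such as the `%HH` lead format used in the forecast field names. The function `expand_time_template` is a hypothetical helper written only for this example.

```python
import re
from datetime import datetime

# Template copied from OBS_GRID_STAT_INPUT_TEMPLATE above.
OBS_TEMPLATE = "{valid?fmt=%Y%m%d}/hrrr.t{valid?fmt=%H}z.wrfprsf00.grib2"


def expand_time_template(template, **times):
    """Replace {name?fmt=...} tags with strftime-formatted datetimes.

    Simplified stand-in for METplus template handling: only datetime
    tags with a plain strftime format are supported.
    """
    def _substitute(match):
        name, fmt = match.group(1), match.group(2)
        return times[name].strftime(fmt)

    return re.sub(r"\{(\w+)\?fmt=([^}]+)\}", _substitute, template)


path = expand_time_template(OBS_TEMPLATE, valid=datetime(2024, 5, 2, 3))
print(path)  # 20240502/hrrr.t03z.wrfprsf00.grib2
```

The resulting path is relative to OBS_GRID_STAT_INPUT_DIR, so each forecast is matched with the HRRR analysis file for its valid time.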

MET Configuration

METplus sets environment variables based on user settings in the METplus configuration file. See How METplus controls MET config file settings for more details.

YOU SHOULD NOT SET ANY OF THESE ENVIRONMENT VARIABLES YOURSELF! THEY WILL BE OVERWRITTEN BY METPLUS WHEN IT CALLS THE MET TOOLS!

To control a MET configuration file setting that METplus does not currently support, please refer to: Overriding Unsupported MET config file settings

GridStatConfig_wrapped
////////////////////////////////////////////////////////////////////////////////
//
// Grid-Stat configuration file.
//
// For additional information, see the MET_BASE/config/README file.
//
////////////////////////////////////////////////////////////////////////////////

//
// Output model name to be written
//
// model =
${METPLUS_MODEL}

//
// Output description to be written
// May be set separately in each "obs.field" entry
//
// desc =
${METPLUS_DESC}

//
// Output observation type to be written
//
// obtype =
${METPLUS_OBTYPE}

////////////////////////////////////////////////////////////////////////////////

//
// Verification grid
//
// regrid = {
${METPLUS_REGRID_DICT}

////////////////////////////////////////////////////////////////////////////////

//censor_thresh =
${METPLUS_CENSOR_THRESH}
//censor_val =
${METPLUS_CENSOR_VAL}
//cat_thresh =
${METPLUS_CAT_THRESH}
cnt_thresh  	 = [ NA ];
cnt_logic   	 = UNION;
wind_thresh 	 = [ NA ];
wind_logic  	 = UNION;
eclv_points      = 0.05;
//nc_pairs_var_name =
${METPLUS_NC_PAIRS_VAR_NAME}
nc_pairs_var_suffix = "";
//hss_ec_value =
${METPLUS_HSS_EC_VALUE}

rank_corr_flag   = FALSE;

//
// Forecast and observation fields to be verified
//
fcst = {
  ${METPLUS_FCST_FILE_TYPE}
  ${METPLUS_FCST_FIELD}
  ${METPLUS_FCST_CLIMO_MEAN_DICT}
  ${METPLUS_FCST_CLIMO_STDEV_DICT}
}
obs = {
  ${METPLUS_OBS_FILE_TYPE}
  ${METPLUS_OBS_FIELD}
  ${METPLUS_OBS_CLIMO_MEAN_DICT}
  ${METPLUS_OBS_CLIMO_STDEV_DICT}
}

////////////////////////////////////////////////////////////////////////////////

//
// Climatology mean data
//
//climo_mean = {
${METPLUS_CLIMO_MEAN_DICT}


//climo_stdev = {
${METPLUS_CLIMO_STDEV_DICT}

//
// May be set separately in each "obs.field" entry
//
//climo_cdf = {
${METPLUS_CLIMO_CDF_DICT}

////////////////////////////////////////////////////////////////////////////////

//
// Verification masking regions
//
// mask = {
${METPLUS_MASK_DICT}

////////////////////////////////////////////////////////////////////////////////

//
// Confidence interval settings
//
ci_alpha  = [ 0.05 ];

boot = {
   interval = PCTILE;
   rep_prop = 1.0;
   n_rep    = 0;
   rng      = "mt19937";
   seed     = "";
}

////////////////////////////////////////////////////////////////////////////////

//
// Data smoothing methods
//
//interp = {
${METPLUS_INTERP_DICT}

////////////////////////////////////////////////////////////////////////////////

//
// Neighborhood methods
//
nbrhd = {
   field      = BOTH;
   // shape =
   ${METPLUS_NBRHD_SHAPE}
   // width =
   ${METPLUS_NBRHD_WIDTH}
   // cov_thresh =
   ${METPLUS_NBRHD_COV_THRESH}
   vld_thresh = 1.0;
}

////////////////////////////////////////////////////////////////////////////////

//
// Fourier decomposition
// May be set separately in each "obs.field" entry
//
//fourier = {
${METPLUS_FOURIER_DICT}

////////////////////////////////////////////////////////////////////////////////

//
// Gradient statistics
// May be set separately in each "obs.field" entry
//
//gradient = {
${METPLUS_GRADIENT_DICT}

////////////////////////////////////////////////////////////////////////////////

//
// Distance Map statistics
// May be set separately in each "obs.field" entry
//
//distance_map = {
${METPLUS_DISTANCE_MAP_DICT}


////////////////////////////////////////////////////////////////////////////////
// Threshold for SEEPS p1 (Probability of being dry)

//seeps_p1_thresh =
${METPLUS_SEEPS_P1_THRESH}

////////////////////////////////////////////////////////////////////////////////

//
// Statistical output types
//
//output_flag = {
${METPLUS_OUTPUT_FLAG_DICT}

//
// NetCDF matched pairs output file
// May be set separately in each "obs.field" entry
//
// nc_pairs_flag = {
${METPLUS_NC_PAIRS_FLAG_DICT}

////////////////////////////////////////////////////////////////////////////////

//ugrid_dataset =
${METPLUS_UGRID_DATASET}

//ugrid_max_distance_km =
${METPLUS_UGRID_MAX_DISTANCE_KM}

//ugrid_coordinates_file =
${METPLUS_UGRID_COORDINATES_FILE}

////////////////////////////////////////////////////////////////////////////////

//grid_weight_flag =
${METPLUS_GRID_WEIGHT_FLAG}

tmp_dir = "${MET_TMP_DIR}";

// output_prefix =
${METPLUS_OUTPUT_PREFIX}

////////////////////////////////////////////////////////////////////////////////

${METPLUS_TIME_OFFSET_WARNING}
${METPLUS_MET_CONFIG_OVERRIDES}
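
The `${METPLUS_...}` placeholders in the wrapped configuration file are filled from the environment variables that METplus sets before calling Grid-Stat. The sketch below mimics that substitution with Python's `string.Template`, purely as an illustration. The environment values shown are assumed example strings derived from MODEL, GRID_STAT_DESC, and OBTYPE, and the exact text METplus generates may differ.

```python
from string import Template

# A few lines in the style of GridStatConfig_wrapped.
wrapped_snippet = """\
${METPLUS_MODEL}
${METPLUS_DESC}
${METPLUS_OBTYPE}
"""

# Assumed, illustrative values; METplus builds the real ones itself
# and users should not set these variables by hand.
env = {
    "METPLUS_MODEL": 'model = "HRRRCast";',
    "METPLUS_DESC": 'desc = "NA";',
    "METPLUS_OBTYPE": 'obtype = "HRRR";',
}

# Replace each ${NAME} placeholder with its value.
rendered = Template(wrapped_snippet).substitute(env)
print(rendered)
```

Because each placeholder expands to a complete setting, the commented line above it in the wrapped file (for example `// model =`) serves only as a label for the reader.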

Python Embedding

This use case calls a Python Embedding script to read Zarr format data into Grid-Stat. The script takes 5 inputs: the directory where the Zarr files are located, the model initialization time, the lead time, the variable to be read in, and the level. The script reads in the Zarr format data, selects the desired variable and level, and sets up the grid attributes.

parm/use_cases/model_applications/short_range/GridStat_fcstHRRRCast_obsHRRRanal_zarr/read_zarr_HRRRCast.py

import xarray as xr
import datetime as dt
import sys

# Check and get input arguments
if len(sys.argv) != 6:
    print("ERROR: Must supply input file, init time, lead time, variable, and level to script")
    sys.exit(1)

input_file = sys.argv[1]
init_time_in = sys.argv[2]
lead_time_in = sys.argv[3]
var = sys.argv[4]
varlevel = sys.argv[5]

# Read the zarr file
ds = xr.open_zarr(input_file)

# Get/Calculate init, valid, and lead time
init_time = dt.datetime.strptime(init_time_in,'%Y%m%d_%H%M%S')
lead_time = dt.timedelta(hours=float(lead_time_in))
valid_time_dt = init_time + lead_time

# Select the time and variable from the data
try:
    ds_time = ds.sel(time=init_time)
    ds_time_lt = ds_time.sel(lead_time=lead_time)
    met_data_var = ds_time_lt[var]
except Exception:
    print('Error: Init Time '+init_time_in+', lead time '+lead_time_in+', or variable '+var+' not present in zarr file.')
    print('Please select a variable or time that is in the file.')
    sys.exit('Exiting')

# Get level if needed
if varlevel[0] == 'P':
    levnum = varlevel[1:]
    # Check to make sure level is a dimension in our variable array
    if 'level' in met_data_var.dims:
        level_in_array = (met_data_var.level == float(levnum)).any().item()
        if level_in_array:
            met_data_var_lvl = met_data_var.sel(level=float(levnum))
        else:
            print('Error: Level '+str(levnum)+' not found in array')
            sys.exit('Exiting')
    else:
        # Without this branch, met_data_var_lvl would be undefined below
        print('Error: Level '+varlevel+' requested, but variable '+var+' has no level dimension')
        sys.exit('Exiting')
else:
    met_data_var_lvl = met_data_var

# Set up MET data
latsize = int(met_data_var.sizes['latitude'])
lonsize = int(met_data_var.sizes['longitude'])
met_data = met_data_var_lvl.values
met_data = met_data[::-1]

# Set up units and long names for the supported variables
if var == 'T2M' or var == 'TMP':
    varunits = 'K'
    var_lonname = 'Temperature'
elif var == 'REFC':
    varunits = 'dBZ'
    var_lonname = 'Reflectivity'
elif var == 'HGT':
    varunits = 'gpm'
    var_lonname = 'Height'
elif var == 'UGRD' or var == 'VGRD':
    varunits = 'm/s'
    var_lonname = var[0]+' Wind'
elif var == 'SPFH':
    varunits = 'kg/kg'
    var_lonname = 'Specific Humidity'
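elif var == 'VVEL':
    varunits = 'Pa/s'
    var_lonname = 'Vertical Velocity'
else:
    # Avoid an undefined varunits/var_lonname when building attrs below
    print('Error: No units or long name defined for variable '+var)
    sys.exit('Exiting')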

# Set up MET attributes
attrs = {
   'valid': valid_time_dt.strftime('%Y%m%d_%H%M%S'),
   'init':  init_time_in,
   'lead':  lead_time_in+'0000',
   'accum': '00',

   'name':      var,
   'long_name': var_lonname,
   'level':     varlevel,
   'units':     varunits,

   'grid': {
       'type': 'Lambert Conformal',
       'hemisphere': 'N',
       'name': var,
       'nx':lonsize,
       'ny':latsize,
       'lat_pin': 38.5,
       'lon_pin': 262.5,
       'x_pin': float(lonsize)/2.0,
       'y_pin': float(latsize)/2.0,
       'lon_orient': 262.5,
       'd_km': 6.0,
       'r_km': 6371.229,
       'scale_lat_1': 38.5,
       'scale_lat_2': 38.5
    }
}

print(attrs)

For more information on the basic requirements to utilize Python Embedding in METplus, please refer to the MET User’s Guide section on Python embedding.
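
The time handling in the script above can be tried in isolation without xarray or a Zarr store. The sketch below repeats the same `strptime` and `timedelta` arithmetic and the `P`-level parsing. The helpers `build_time_attrs` and `parse_level` are hypothetical names introduced here and do not appear in the script. It assumes the lead time arrives as a zero-padded two-digit hour string, as produced by `{lead?fmt=%HH}` in the configuration.

```python
from datetime import datetime, timedelta


def build_time_attrs(init_str, lead_str):
    """Build the time-related MET attributes from the script's arguments."""
    init_time = datetime.strptime(init_str, "%Y%m%d_%H%M%S")
    valid_time = init_time + timedelta(hours=float(lead_str))
    return {
        "valid": valid_time.strftime("%Y%m%d_%H%M%S"),
        "init": init_str,
        "lead": lead_str + "0000",  # HH -> HHMMSS
        "accum": "00",
    }


def parse_level(level_str):
    """Return the numeric pressure level for 'P###' strings, else None."""
    if level_str.startswith("P"):
        return float(level_str[1:])
    return None


attrs = build_time_attrs("20240502_010000", "12")
print(attrs["valid"])  # 20240502_130000
print(attrs["lead"])   # 120000
print(parse_level("P200"), parse_level("Z2"))  # 200.0 None
```

Levels that do not start with `P`, such as `Z2` and `L0`, need no vertical selection, which is why the script passes those variables through unchanged.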

User Scripting

This use case does not use additional scripts.

Running METplus

Pass the use case configuration file to the run_metplus.py script along with any user-specific system configuration files if desired:

run_metplus.py /path/to/METplus/parm/use_cases/model_applications/short_range/GridStat_fcstHRRRCast_obsHRRRanal_zarr.conf /path/to/user_system.conf

See Running METplus for more information.

Expected Output

A successful run will output the following both to the screen and to the logfile:

INFO: METplus has successfully finished running.

Refer to the value set for OUTPUT_BASE to find where the output data was generated. Output for this use case can be found in {OUTPUT_BASE}/model_applications/short_range/GridStat_fcstHRRRCast_obsHRRRanal_zarr/grid_stat and will contain 2 directories, one for each model initialization:

* 2024050200
* 2024050201

Inside each directory, there will be 12 .stat files of the format grid_stat_HRRRCast_vs_HRRR_HHMMSSL_YYYYMMDD_HHMMSSV.stat, where HHMMSSL is the hour, minute, and second of the lead time, YYYYMMDD is the year, month, and day of the valid time, and HHMMSSV is the hour, minute, and second of the valid time. The 2024050200 directory contains the following files:

* grid_stat_HRRRCast_vs_HRRR_010000L_20240502_010000V.stat
* grid_stat_HRRRCast_vs_HRRR_020000L_20240502_020000V.stat
* grid_stat_HRRRCast_vs_HRRR_030000L_20240502_030000V.stat
* grid_stat_HRRRCast_vs_HRRR_040000L_20240502_040000V.stat
* grid_stat_HRRRCast_vs_HRRR_050000L_20240502_050000V.stat
* grid_stat_HRRRCast_vs_HRRR_060000L_20240502_060000V.stat
* grid_stat_HRRRCast_vs_HRRR_070000L_20240502_070000V.stat
* grid_stat_HRRRCast_vs_HRRR_080000L_20240502_080000V.stat
* grid_stat_HRRRCast_vs_HRRR_090000L_20240502_090000V.stat
* grid_stat_HRRRCast_vs_HRRR_120000L_20240502_120000V.stat
* grid_stat_HRRRCast_vs_HRRR_150000L_20240502_150000V.stat
* grid_stat_HRRRCast_vs_HRRR_180000L_20240502_180000V.stat

The 2024050201 directory contains the following files:

* grid_stat_HRRRCast_vs_HRRR_010000L_20240502_020000V.stat
* grid_stat_HRRRCast_vs_HRRR_020000L_20240502_030000V.stat
* grid_stat_HRRRCast_vs_HRRR_030000L_20240502_040000V.stat
* grid_stat_HRRRCast_vs_HRRR_040000L_20240502_050000V.stat
* grid_stat_HRRRCast_vs_HRRR_050000L_20240502_060000V.stat
* grid_stat_HRRRCast_vs_HRRR_060000L_20240502_070000V.stat
* grid_stat_HRRRCast_vs_HRRR_070000L_20240502_080000V.stat
* grid_stat_HRRRCast_vs_HRRR_080000L_20240502_090000V.stat
* grid_stat_HRRRCast_vs_HRRR_090000L_20240502_100000V.stat
* grid_stat_HRRRCast_vs_HRRR_120000L_20240502_130000V.stat
* grid_stat_HRRRCast_vs_HRRR_150000L_20240502_160000V.stat
* grid_stat_HRRRCast_vs_HRRR_180000L_20240502_190000V.stat
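
To check a run against the lists above, the expected file names can be generated from the initialization time and lead sequence. This is a minimal sketch, assuming the naming pattern described above and the output prefix HRRRCast_vs_HRRR from GRID_STAT_OUTPUT_PREFIX. The function `expected_stat_file` is a hypothetical helper, not a METplus utility.

```python
from datetime import datetime, timedelta

LEAD_SEQ = [1, 2, 3, 4, 5, 6, 7, 8, 9, 12, 15, 18]


def expected_stat_file(init, lead_hours, prefix="HRRRCast_vs_HRRR"):
    """Return the .stat file name expected for one init/lead combination."""
    valid = init + timedelta(hours=lead_hours)
    return (f"grid_stat_{prefix}_{lead_hours:02d}0000L_"
            f"{valid:%Y%m%d_%H%M%S}V.stat")


# File names expected in the 2024050201 directory.
init = datetime(2024, 5, 2, 1)
names = [expected_stat_file(init, lead) for lead in LEAD_SEQ]
print(names[0])   # grid_stat_HRRRCast_vs_HRRR_010000L_20240502_020000V.stat
print(names[-1])  # grid_stat_HRRRCast_vs_HRRR_180000L_20240502_190000V.stat
```

Comparing this list with the contents of each output directory is a quick way to spot a lead time that failed to produce output.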

Keywords

  • GridStatToolUseCase

  • PythonEmbeddingFileUseCase

  • ZarrFileUseCase

  • GRIB2FileUseCase

  • ShortRangeAppUseCase

  • AIUseCase

Navigate to the METplus Quick Search for Use Cases page to discover other similar use cases.
