Grid-Stat: Verification of TC forecasts against merged TDR data

model_applications/tc_and_extra_tc/GridStat_fcstHAFS_obsTDR_NetCDF.conf

Scientific Objective

To provide useful statistical information on the relationship between merged Tail Doppler Radar (TDR) data in NetCDF format and a gridded forecast. These values can be used to assess the skill of the prediction. The TDR data are available every 0.5 km above ground level (AGL), so the TC forecasts need to be in height coordinates to be compared with the TDR data.

Datasets

Forecast: HAFS zonal wind
Observation: HRD TDR merged_zonal_wind
Location of model forecast and TDR files: All of the input data required for this use case can be found in the sample data tarball. Click here to download.
This tarball should be unpacked into the directory that you will set as the value of INPUT_BASE. See the ‘Running METplus’ section for more information.
TDR Data Source: Hurricane Research Division. Contact: Paul Reasor. Email: paul.reasor@noaa.gov
The dataset used in the use case is a subset of the Merged Analysis (v2d_combined_xy_rel_merged_ships.nc).
Thanks to HRD for providing the dataset.

METplus Components

The observations in the use case contain data mapped onto Cartesian grids with a horizontal grid spacing of 2 km and a vertical grid spacing of 0.5 km. Hence the model output needs to be on height levels (km) instead of pressure levels. Both the observations and the model output are available with the release. The instructions below describe how the input to the use case was prepared.

The Hurricane Analysis and Forecast System (HAFS) output (on pressure levels, in GRIB2 format) is converted to height levels (in NetCDF4 format) using the METcalcpy vertical interpolation routine. Under the METcalcpy/examples directory, users can modify vertical_interp_hwrf.sh or create a similar file for their own output. $DATA_DIR is the top-level output directory where the pressure-level data reside. The --input and --output arguments should point to the input and output file names, respectively. The --config argument points to a YAML file, which users should edit if needed. Users need to run the shell script to get the height-level output in NetCDF4 format.

For this use case, only zonal wind (u) at four vertical levels (200 m, 2000 m, 4000 m, and 6000 m) is provided. The use case compares the HAFS 2 km zonal wind (u) data against the TDR merged_zonal_wind at 2 km. This use case utilizes METplus Python embedding to read the TDR data and compare them to gridded forecast data using GridStat.
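As a rough illustration of the pressure-to-height conversion described above, the sketch below linearly interpolates one column of zonal wind from the heights of its pressure levels to fixed target heights. It is not the METcalcpy routine: the function name interp_to_height and the sample heights and wind values are made up for illustration, and only the four target heights come from this use case.

```python
from bisect import bisect_right

def interp_to_height(level_heights_m, values, target_m):
    """Linearly interpolate a column of values to one target height.

    level_heights_m must be sorted in increasing order. Returns None when
    the target lies outside the height range covered by the column.
    """
    if target_m < level_heights_m[0] or target_m > level_heights_m[-1]:
        return None
    # Index of the first level strictly above the target height
    upper = bisect_right(level_heights_m, target_m)
    if upper == len(level_heights_m):
        return values[-1]  # target sits exactly on the top level
    lower = upper - 1
    z0, z1 = level_heights_m[lower], level_heights_m[upper]
    weight = (target_m - z0) / (z1 - z0)
    return values[lower] + weight * (values[upper] - values[lower])

# Made-up column: heights (m) of four pressure levels and u (m/s) on each
heights_m = [100.0, 1500.0, 3000.0, 5500.0]
u_ms = [5.0, 12.0, 18.0, 25.0]

# The four height levels provided with this use case
u_on_heights = [interp_to_height(heights_m, u_ms, z)
                for z in (200.0, 2000.0, 4000.0, 6000.0)]
print(u_on_heights)
```

In this made-up column the 6000 m target lies above the highest level, so the sketch returns None for it rather than extrapolating.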

METplus Workflow

The use case runs the Python embedding script (GridStat_fcstHAFS_obsTDR_NetCDF/read_tdr.py, which reads the TDR data) and then runs Grid-Stat, the only tool called in this example, to compute statistics against the HAFS model output in height coordinates.

It processes the following run times: Valid at 2019-08-29 12Z

Forecast lead times: 0, 6, 12, and 18 hours

The mission number (e.g., CUSTOM_LOOP_LIST = 190829H1)

Height level (for TDR: OBS_VERT_LEVEL_KM = 2, HAFS: FCST_VAR1_LEVELS = "(0,1,*,*)")
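Because the use case loops by valid time, each lead time corresponds to a different forecast initialization time. The sketch below works out those initialization times and fills in the forecast file name pattern from FCST_GRID_STAT_INPUT_TEMPLATE. It only mimics the template substitution with standard Python formatting and is not METplus code; the function name forecast_file_name is hypothetical.

```python
from datetime import datetime, timedelta

valid_time = datetime(2019, 8, 29, 12)   # VALID_BEG and VALID_END
lead_hours = [0, 6, 12, 18]              # LEAD_SEQ

def forecast_file_name(valid, lead_hr):
    """Fill the forecast file name pattern for one valid time and lead."""
    init = valid - timedelta(hours=lead_hr)  # init time = valid time - lead
    return (f"dorian05l.{init:%Y%m%d%H}.hafsprs.synoptic.0p03."
            f"f{lead_hr:03d}.nc4")

for lead_hr in lead_hours:
    print(forecast_file_name(valid_time, lead_hr))
```

The same four initialization times (2019082912, 2019082906, 2019082900, and 2019082818) are the values that {init?fmt=%Y%m%d%H} takes in GRID_STAT_OUTPUT_TEMPLATE.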

METplus Configuration

METplus first loads all of the configuration files found in parm/metplus_config, then it loads any configuration files passed to METplus via the command line with the -c option, i.e. -c parm/use_cases/model_applications/tc_and_extra_tc/GridStat_fcstHAFS_obsTDR_NetCDF.conf

[config]

# Documentation for this use case can be found at
# https://metplus.readthedocs.io/en/latest/generated/model_applications/tc_and_extra_tc/GridStat_fcstHAFS_obsTDR_NetCDF.html

# For additional information, please see the METplus Users Guide.
# https://metplus.readthedocs.io/en/latest/Users_Guide

###
# Processes to run
# https://metplus.readthedocs.io/en/latest/Users_Guide/systemconfiguration.html#process-list
###

PROCESS_LIST = GridStat


###
# Time Info
# LOOP_BY options are INIT, VALID, RETRO, and REALTIME
# If set to INIT or RETRO:
#   INIT_TIME_FMT, INIT_BEG, INIT_END, and INIT_INCREMENT must also be set
# If set to VALID or REALTIME:
#   VALID_TIME_FMT, VALID_BEG, VALID_END, and VALID_INCREMENT must also be set
# LEAD_SEQ is the list of forecast leads to process
# https://metplus.readthedocs.io/en/latest/Users_Guide/systemconfiguration.html#timing-control
###

LOOP_BY = VALID
VALID_TIME_FMT = %Y%m%d%H
VALID_BEG = 2019082912
VALID_END = 2019082912
VALID_INCREMENT = 21600

LEAD_SEQ = 0,6,12,18

CUSTOM_LOOP_LIST = 190829H1


###
# File I/O
# https://metplus.readthedocs.io/en/latest/Users_Guide/systemconfiguration.html#directory-and-filename-template-info
###

FCST_GRID_STAT_INPUT_DIR = {INPUT_BASE}/model_applications/tc_and_extra_tc/GridStat_fcstHAFS_obsTDR_NetCDF/hafs_height
FCST_GRID_STAT_INPUT_TEMPLATE = dorian05l.{init?fmt=%Y%m%d%H}.hafsprs.synoptic.0p03.f{lead?fmt=%HHH}.nc4

OBS_GRID_STAT_INPUT_DIR = {INPUT_BASE}/model_applications/tc_and_extra_tc/GridStat_fcstHAFS_obsTDR_NetCDF/obs
OBS_GRID_STAT_INPUT_TEMPLATE = PYTHON_NUMPY

GRID_STAT_OUTPUT_DIR = {OUTPUT_BASE}/model_applications/tc_and_extra_tc/tdr
GRID_STAT_OUTPUT_TEMPLATE = {init?fmt=%Y%m%d%H}


###
# Field Info
# https://metplus.readthedocs.io/en/latest/Users_Guide/systemconfiguration.html#field-info
###

# Location of the TDR file
TC_RADAR_FILE = {OBS_GRID_STAT_INPUT_DIR}/merged_zonal_wind_tdr.nc

# Obs vertical level in km
OBS_VERT_LEVEL_KM = 2

MODEL = HAFS
OBTYPE = TDR

FCST_VAR1_NAME =  u
FCST_VAR1_LEVELS =  "(0,1,*,*)"
FCST_VAR1_THRESH = gt10.0, gt20.0, lt-10.0, lt-20.0
FCST_VAR1_OPTIONS = set_attr_init="{init?fmt=%Y%m%d_%H%M%S}"; set_attr_valid="{valid?fmt=%Y%m%d_%H%M%S}"; set_attr_lead="{lead?fmt=%H}";
FCST_GRID_STAT_INPUT_DATATYPE = NETCDF_NCCF

OBS_VAR1_NAME = {PARM_BASE}/use_cases/model_applications/tc_and_extra_tc/GridStat_fcstHAFS_obsTDR_NetCDF/read_tdr.py {TC_RADAR_FILE} merged_zonal_wind {custom?fmt=%s} {OBS_VERT_LEVEL_KM}
OBS_VAR1_THRESH = gt10.0, gt20.0, lt-10.0, lt-20.0


###
# GridStat Settings
# https://metplus.readthedocs.io/en/latest/Users_Guide/wrappers.html#gridstat
###

GRID_STAT_OUTPUT_FLAG_FHO = BOTH
GRID_STAT_OUTPUT_FLAG_CTC = STAT
GRID_STAT_OUTPUT_FLAG_CTS = STAT
GRID_STAT_OUTPUT_FLAG_CNT = STAT
GRID_STAT_OUTPUT_FLAG_SL1L2 = STAT
GRID_STAT_OUTPUT_FLAG_ECLV = NONE

GRID_STAT_REGRID_TO_GRID = OBS

GRID_STAT_NEIGHBORHOOD_WIDTH = 1
GRID_STAT_NEIGHBORHOOD_SHAPE = SQUARE
GRID_STAT_NEIGHBORHOOD_COV_THRESH = >=0.5

GRID_STAT_ONCE_PER_FIELD = False

GRID_STAT_OUTPUT_PREFIX = {MODEL}_vs_{OBTYPE}
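The thresholds in FCST_VAR1_THRESH and OBS_VAR1_THRESH turn the continuous wind fields into yes/no events, which is what the FHO, CTC, and CTS outputs summarize. The sketch below counts the four contingency table cells for one threshold on a handful of made-up forecast and observation pairs. It illustrates the idea only and is not how MET implements it; the helper names and sample values are hypothetical.

```python
import operator

# Comparison operators for the threshold prefixes used in this use case
OPS = {"gt": operator.gt, "lt": operator.lt}

def parse_threshold(text):
    """Split a threshold such as 'gt10.0' into (comparison function, value)."""
    return OPS[text[:2]], float(text[2:])

def contingency_counts(fcst, obs, threshold, missing=-9999.0):
    """Count forecast/observation event combinations for one threshold."""
    compare, value = parse_threshold(threshold)
    counts = {"FY_OY": 0, "FY_ON": 0, "FN_OY": 0, "FN_ON": 0}
    for f, o in zip(fcst, obs):
        if f == missing or o == missing:
            continue  # skip pairs that contain missing data
        fcst_key = "FY" if compare(f, value) else "FN"
        obs_key = "OY" if compare(o, value) else "ON"
        counts[fcst_key + "_" + obs_key] += 1
    return counts

# Made-up zonal wind pairs (m/s); the last forecast value is missing
fcst_u = [12.0, 25.0, -15.0, 3.0, 11.0, -9999.0]
obs_u = [15.0, 8.0, -12.0, -2.0, 9.5, 4.0]

for thresh in ("gt10.0", "gt20.0", "lt-10.0", "lt-20.0"):
    print(thresh, contingency_counts(fcst_u, obs_u, thresh))
```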

MET Configuration

METplus sets environment variables based on the values in the METplus configuration file. These variables are referenced in the MET configuration file. YOU SHOULD NOT SET ANY OF THESE ENVIRONMENT VARIABLES YOURSELF! THEY WILL BE OVERWRITTEN BY METPLUS WHEN IT CALLS THE MET TOOLS! If there is a setting in the MET configuration file that is not controlled by an environment variable, you can add additional environment variables to be set only within the METplus environment using the [user_env_vars] section of the METplus configuration files. See the ‘User Defined Config’ section on the ‘System Configuration’ page of the METplus User’s Guide for more information.

////////////////////////////////////////////////////////////////////////////////
//
// Grid-Stat configuration file.
//
// For additional information, see the MET_BASE/config/README file.
//
////////////////////////////////////////////////////////////////////////////////

//
// Output model name to be written
//
// model =
${METPLUS_MODEL}

//
// Output description to be written
// May be set separately in each "obs.field" entry
//
// desc =
${METPLUS_DESC}

//
// Output observation type to be written
//
// obtype =
${METPLUS_OBTYPE}

////////////////////////////////////////////////////////////////////////////////

//
// Verification grid
//
// regrid = {
${METPLUS_REGRID_DICT}

////////////////////////////////////////////////////////////////////////////////

//censor_thresh =
${METPLUS_CENSOR_THRESH}
//censor_val =
${METPLUS_CENSOR_VAL}
cat_thresh  	 = [];
cnt_thresh  	 = [ NA ];
cnt_logic   	 = UNION;
wind_thresh 	 = [ NA ];
wind_logic  	 = UNION;
eclv_points      = 0.05;
//nc_pairs_var_name =
${METPLUS_NC_PAIRS_VAR_NAME}
nc_pairs_var_suffix = "";
//hss_ec_value =
${METPLUS_HSS_EC_VALUE}

rank_corr_flag   = FALSE;

//
// Forecast and observation fields to be verified
//
fcst = {
  ${METPLUS_FCST_FILE_TYPE}
  ${METPLUS_FCST_FIELD}
}
obs = {
  ${METPLUS_OBS_FILE_TYPE}
  ${METPLUS_OBS_FIELD}
}

////////////////////////////////////////////////////////////////////////////////

//
// Climatology mean data
//
//climo_mean = {
${METPLUS_CLIMO_MEAN_DICT}


//climo_stdev = {
${METPLUS_CLIMO_STDEV_DICT}

//
// May be set separately in each "obs.field" entry
//
//climo_cdf = {
${METPLUS_CLIMO_CDF_DICT}

////////////////////////////////////////////////////////////////////////////////

//
// Verification masking regions
//
// mask = {
${METPLUS_MASK_DICT}

////////////////////////////////////////////////////////////////////////////////

//
// Confidence interval settings
//
ci_alpha  = [ 0.05 ];

boot = {
   interval = PCTILE;
   rep_prop = 1.0;
   n_rep    = 0;
   rng      = "mt19937";
   seed     = "";
}

////////////////////////////////////////////////////////////////////////////////

//
// Data smoothing methods
//
//interp = {
${METPLUS_INTERP_DICT}

////////////////////////////////////////////////////////////////////////////////

//
// Neighborhood methods
//
nbrhd = {
   field      = BOTH;
   // shape =
   ${METPLUS_NBRHD_SHAPE}
   // width =
   ${METPLUS_NBRHD_WIDTH}
   // cov_thresh =
   ${METPLUS_NBRHD_COV_THRESH}
   vld_thresh = 1.0;
}

////////////////////////////////////////////////////////////////////////////////

//
// Fourier decomposition
// May be set separately in each "obs.field" entry
//
//fourier = {
${METPLUS_FOURIER_DICT}

////////////////////////////////////////////////////////////////////////////////

//
// Gradient statistics
// May be set separately in each "obs.field" entry
//
gradient = {
   dx = [ 1 ];
   dy = [ 1 ];
}

////////////////////////////////////////////////////////////////////////////////

//
// Distance Map statistics
// May be set separately in each "obs.field" entry
//
//distance_map = {
${METPLUS_DISTANCE_MAP_DICT}

////////////////////////////////////////////////////////////////////////////////

//
// Statistical output types
//
//output_flag = {
${METPLUS_OUTPUT_FLAG_DICT}

//
// NetCDF matched pairs output file
// May be set separately in each "obs.field" entry
//
// nc_pairs_flag = {
${METPLUS_NC_PAIRS_FLAG_DICT}

////////////////////////////////////////////////////////////////////////////////
// Threshold for SEEPS p1 (Probability of being dry)

//seeps_p1_thresh =
${METPLUS_SEEPS_P1_THRESH}

////////////////////////////////////////////////////////////////////////////////

//grid_weight_flag =
${METPLUS_GRID_WEIGHT_FLAG}

tmp_dir = "${MET_TMP_DIR}";

// output_prefix =
${METPLUS_OUTPUT_PREFIX}

////////////////////////////////////////////////////////////////////////////////

${METPLUS_MET_CONFIG_OVERRIDES}

Note the following variables are referenced in the MET configuration file.
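To make the role of the ${METPLUS_...} placeholders concrete, the sketch below expands two of them in a small fragment of the configuration file. The values assigned to the environment variables are assumptions chosen to match MODEL and OBTYPE in this use case. In a real run METplus sets these variables itself and MET resolves them when it reads the file; os.path.expandvars is used here only to imitate that substitution.

```python
import os

# Assumed example values; in a real run METplus sets these, not the user
os.environ["METPLUS_MODEL"] = 'model = "HAFS";'
os.environ["METPLUS_OBTYPE"] = 'obtype = "TDR";'

config_fragment = """\
// model =
${METPLUS_MODEL}

// obtype =
${METPLUS_OBTYPE}
"""

# Replace each ${NAME} with the value of the matching environment variable
resolved = os.path.expandvars(config_fragment)
print(resolved)
```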

Python Embedding

This use case uses a Python embedding script to read input data:

parm/use_cases/model_applications/tc_and_extra_tc/GridStat_fcstHAFS_obsTDR_NetCDF/read_tdr.py

import os
import sys

sys.path.insert(0, os.path.abspath(os.path.dirname(__file__)))

import tdr_utils

if len(sys.argv) < 5:
    print("Must specify exactly one input file, variable name, mission ID (YYMMDDID), level (in km)")
    sys.exit(1)

# Read the input file as the first argument
input_file   = os.path.expandvars(sys.argv[1])
var_name     = sys.argv[2]
mission_name = sys.argv[3]
level_km     = float(sys.argv[4])

met_data, attrs = tdr_utils.main(input_file, var_name, mission_name, level_km)

The above script imports another script called tdr_utils.py in the same directory:

parm/use_cases/model_applications/tc_and_extra_tc/GridStat_fcstHAFS_obsTDR_NetCDF/tdr_utils.py

from netCDF4 import Dataset
import numpy as np
import datetime as dt
import os
import sys
from time import gmtime, strftime

# Return valid time
def get_valid_time(input_file, mission_name):
    f = Dataset(input_file, 'r')
    mid = f.variables['mission_ID'][:].tolist().index(mission_name)
    valid_time = calculate_valid_time(f, mid)
    valid_time_mid = valid_time.strftime("%Y%m%d%H%M") 
    return valid_time_mid

def calculate_valid_time(f, mid):
  merge_year_np  = np.array(f.variables['merge_year'][mid])
  merge_month_np = np.array(f.variables['merge_month'][mid])
  merge_day_np   = np.array(f.variables['merge_day'][mid])
  merge_hour_np  = np.array(f.variables['merge_hour'][mid])
  merge_min_np   = np.array(f.variables['merge_min'][mid])
  # Convert the 0-d arrays to plain ints, which datetime requires
  valid_time     = dt.datetime(int(merge_year_np), int(merge_month_np), int(merge_day_np), int(merge_hour_np), int(merge_min_np), 0)
  return valid_time

def read_inputs():
    # Read the input file as the first argument
    input_file   = os.path.expandvars(sys.argv[1])
    var_name     = sys.argv[2]
    mission_name = sys.argv[3]
    level_km     = float(sys.argv[4])
    return input_file, var_name, mission_name, level_km

def main(input_file, var_name, mission_name, level_km):
  ###########################################

  ##
  ##  input file specified on the command line
  ##  load the data into the numpy array
  ##


    try:
      # Print some output to verify that this script ran
      print("Input File:      " + repr(input_file))
      print("Variable Name:   " + repr(var_name))

      # Read input file
      f = Dataset(input_file, 'r')

      # Find the requested mission name 
      mid = f.variables['mission_ID'][:].tolist().index(mission_name)

      # Find the requested level value 
      lid = f.variables['level'][:].tolist().index(level_km)

      # Read the requested variable
      data = np.float64(f.variables[var_name][mid,:,:,lid])

      # Expect that dimensions are ordered (lat, lon)
      # If (lon, lat), transpose the data
      if(f.variables[var_name].dimensions[0] == 'lon'):
         data = data.transpose()

      print("Mission (index): " + repr(mission_name) + " (" + repr(mid) + ")")
      print("Level (index):   " + repr(level_km) + " (" + repr(lid) + ")")
      print("Data Range:      " + repr(np.nanmin(data)) + " to " + repr(np.nanmax(data)))

      # Reset NaN values to missing data (-9999 in MET)
      data[np.isnan(data)] = -9999

      # Flip the data along the first (latitude) axis, reversing the row order
      data = data[::-1]

      # Store a deep copy of the data for MET
      met_data = data.reshape(200,200).copy()

      print("Data Shape:      " + repr(met_data.shape))
      print("Data Type:       " + repr(met_data.dtype))

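    except (OSError, KeyError, ValueError) as err:
      # Stop here: the code below needs the open file and the mission index
      print("Trouble reading input file: " + input_file + " (" + str(err) + ")")
      sys.exit(1)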


    ###############################################################################

    # Determine LatLon grid information

    # Read in coordinate data
    merged_lon  = np.array(f.variables['merged_longitudes'][mid,0,:])
    merged_lat  = np.array(f.variables['merged_latitudes'][mid,:,0])

    # Time data:
    valid_time = calculate_valid_time(f, mid)
    init_time = valid_time

    ###########################################

    ##
    ##  create the metadata dictionary
    ##

    ###########################################
    attrs = {
      'valid': valid_time.strftime("%Y%m%d_%H%M%S"),
      'init' : valid_time.strftime("%Y%m%d_%H%M%S"),
      'lead':  '00',
      'accum': '06',
      'mission_id': mission_name,

      'name':      var_name,
      'long_name': var_name,
      'level':     str(level_km) + "km",
      'units':     str(getattr(f.variables[var_name], "units")),

      'grid': {
          'name':       var_name,
          'type' :      'LatLon',
          'lat_ll' :    float(min(merged_lat)),
          'lon_ll' :    float(min(merged_lon)),
          'delta_lat' : float(merged_lat[1]-merged_lat[0]),
          'delta_lon' : float(merged_lon[1]-merged_lon[0]),
          'Nlat' :      len(merged_lat),
          'Nlon' :      len(merged_lon),
      }
    }

    print("Attributes:      " + repr(attrs))
    return met_data, attrs

if __name__ == '__main__':
    if len(sys.argv) < 5:
        print("Must specify exactly one input file, variable name, mission ID (YYMMDDID), level (in km)")
        sys.exit(1)

    input_file, var_name, mission_name, level_km = read_inputs()

    met_data, attrs = main(input_file, var_name, mission_name, level_km)
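The level lookup in tdr_utils.py uses list.index, which needs the requested level to match a stored value exactly. As a hedged alternative, the sketch below finds the level index with a small tolerance, which can help if a file stores levels with rounding error. The function name find_level_index, the tolerance, and the sample level axis are illustrative assumptions and are not part of the use case scripts.

```python
import math

def find_level_index(levels_km, level_km, tol=1e-6):
    """Return the index of the first level within tol of level_km."""
    for index, value in enumerate(levels_km):
        if math.isclose(value, level_km, abs_tol=tol):
            return index
    raise ValueError(f"Level {level_km} km not found in {levels_km}")

# Made-up level axis every 0.5 km with a tiny rounding error added
levels = [0.5 * i + 1e-9 for i in range(5)]

# An exact-match lookup, levels.index(2.0), would raise ValueError here
print(find_level_index(levels, 2.0))
```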

Running METplus

This use case can be run two ways:

  1. Passing in GridStat_fcstHAFS_obsTDR_NetCDF.conf then a user-specific system configuration file:

    run_metplus.py -c /path/to/METplus/parm/use_cases/model_applications/tc_and_extra_tc/GridStat_fcstHAFS_obsTDR_NetCDF.conf -c /path/to/user_system.conf
    
  2. Modifying the configurations in parm/metplus_config, then passing in GridStat_fcstHAFS_obsTDR_NetCDF.conf:

    run_metplus.py -c /path/to/METplus/parm/use_cases/model_applications/tc_and_extra_tc/GridStat_fcstHAFS_obsTDR_NetCDF.conf
    

The former method is recommended. Whether you add them to a user-specific configuration file or modify the metplus_config files, the following variables must be set correctly:

  • INPUT_BASE - Path to the directory where sample data tarballs are unpacked (see the Datasets section to obtain tarballs). This is not required to run METplus, but it is required to run the examples in parm/use_cases.

  • OUTPUT_BASE - Path where METplus output will be written. This must be in a location where you have write permissions.

  • MET_INSTALL_DIR - Path to the location where MET is installed locally.

Example User Configuration File:

[dir]
INPUT_BASE = /path/to/sample/input/data
OUTPUT_BASE = /path/to/output/dir
MET_INSTALL_DIR = /path/to/met-X.Y

NOTE: All of these items must be found under the [dir] section.
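Before launching a run, it can be useful to confirm that the three settings listed above are present under [dir]. The sketch below does this with Python's standard configparser on a sample configuration string. It is only a pre-check written for illustration; METplus has its own configuration handling, and the function name missing_dir_settings is hypothetical.

```python
import configparser

REQUIRED = ("INPUT_BASE", "OUTPUT_BASE", "MET_INSTALL_DIR")

def missing_dir_settings(conf_text):
    """Return the required [dir] settings that are absent or empty."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.optionxform = str  # keep variable names in upper case
    parser.read_string(conf_text)
    if not parser.has_section("dir"):
        return list(REQUIRED)
    return [name for name in REQUIRED
            if not parser.get("dir", name, fallback="").strip()]

# Sample user configuration that forgets MET_INSTALL_DIR
example = """\
[dir]
INPUT_BASE = /path/to/sample/input/data
OUTPUT_BASE = /path/to/output/dir
"""

print(missing_dir_settings(example))
```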

Expected Output

A successful run will output the following both to the screen and to the logfile:

INFO: METplus has successfully finished running.

Refer to the value set for OUTPUT_BASE to find where the output data was generated. Output for this use case will be found in model_applications/tc_and_extra_tc/tdr (relative to OUTPUT_BASE) and will contain the following files:

  • grid_stat_HAFS_vs_TDR_000000L_20190829_120000V_fho.txt

  • grid_stat_HAFS_vs_TDR_000000L_20190829_120000V_pairs.nc

  • grid_stat_HAFS_vs_TDR_000000L_20190829_120000V.stat

  • The use case is run for four lead times valid at 2019082912, so four directories will be generated, each containing files similar to those above.

Keywords

Note

  • TCandExtraTCAppUseCase

  • GridStatToolUseCase

  • TropicalCycloneUseCase

Navigate to the METplus Quick Search for Use Cases page to discover other similar use cases.
