UserScript, ASCII2NC, and Point-Stat: HAFS Forecast Verification against HRD Dropsonde Observations

model_applications/tc_and_extra_tc/UserScript_ASCII2NC_PointStat_fcstHAFS_obsFRD_NetCDF.conf

Scientific Objective

To provide useful statistical information on the relationship between observation data in point format and a gridded forecast. These values can be used to assess the skill of the prediction. Statistics are stored as partial sums to save space, and Stat-Analysis must be used to compute continuous statistics.
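
As an illustration of what Stat-Analysis computes from those partial sums, the short sketch below (not part of the use case) derives two common continuous statistics, mean error and RMSE, from the columns of a single SL1L2 line; Stat-Analysis performs the equivalent aggregation over many lines and matched-pair counts.

import math

def cnt_from_sl1l2(fbar, obar, fobar, ffbar, oobar):
    """Derive continuous statistics from SL1L2 partial sums.

    fbar = mean forecast, obar = mean observation, fobar = mean of f*o,
    ffbar = mean of f*f, oobar = mean of o*o (all over the matched pairs).
    """
    me = fbar - obar                      # mean error (bias)
    mse = ffbar - 2.0 * fobar + oobar     # mean squared error
    rmse = math.sqrt(max(mse, 0.0))       # guard against round-off
    return {"ME": me, "RMSE": rmse}

# Example with made-up partial sums for temperature in Kelvin
print(cnt_from_sl1l2(fbar=285.0, obar=284.5, fobar=81082.4,
                     ffbar=81225.5, oobar=80940.6))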

Datasets

Forecast: HAFS temperature
Observation: HRD Dropsonde data
Location of Model forecast and Dropsonde files: All of the input data required for this use case can be found in the sample data tarball. Click here to download.
This tarball should be unpacked into the directory that you will set as the value of INPUT_BASE. See the Running METplus section for more information.

METplus Components

This use case utilizes the METplus ASCII2NC wrapper to convert full-resolution data (FRD) dropsonde point observations to NetCDF format, which are then compared to gridded forecast data using PointStat.

METplus Workflow

The use case runs the UserScript wrapper (which untars the dropsonde file and extracts its contents to a directory), ASCII2NC (which converts the ASCII files to NetCDF format), and PointStat (which computes statistics against HAFS model output). It processes the following run times:

Valid: 2019-08-29 12Z
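
Because the use case loops by valid time with the forecast leads configured below (LEAD_SEQ = 0, 6, 12, 18), METplus selects each HAFS file by subtracting the lead from the valid time to obtain the initialization time used in the filename template. The sketch below (illustrative only, not part of the use case) lists the resulting init times:

from datetime import datetime, timedelta

valid = datetime.strptime("2019082912", "%Y%m%d%H")
for lead_hours in (0, 6, 12, 18):
    init = valid - timedelta(hours=lead_hours)
    # e.g. lead 18 -> init 2019082818, filling
    # dorian05l.{init?fmt=%Y%m%d%H}...f{lead?fmt=%3H}.grb2
    print(f"lead {lead_hours:03d}h  init {init:%Y%m%d%H}")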

METplus Configuration

METplus first loads all of the configuration files found in parm/metplus_config, then it loads any configuration files passed to METplus via the command line with the -c option, i.e. -c parm/use_cases/model_applications/tc_and_extra_tc/UserScript_ASCII2NC_PointStat_fcstHAFS_obsFRD_NetCDF.conf

[config]

# Documentation for this use case can be found at
# https://metplus.readthedocs.io/en/latest/generated/model_applications/tc_and_extra_tc/UserScript_ASCII2NC_PointStat_fcstHAFS_obsFRD_NetCDF.html

# For additional information, please see the METplus Users Guide.
# https://metplus.readthedocs.io/en/latest/Users_Guide

###
# Processes to run
# https://metplus.readthedocs.io/en/latest/Users_Guide/systemconfiguration.html#process-list
###

PROCESS_LIST = UserScript(untar_drop_file), Ascii2nc, PointStat


###
# Time Info
# LOOP_BY options are INIT, VALID, RETRO, and REALTIME
# If set to INIT or RETRO:
#   INIT_TIME_FMT, INIT_BEG, INIT_END, and INIT_INCREMENT must also be set
# If set to VALID or REALTIME:
#   VALID_TIME_FMT, VALID_BEG, VALID_END, and VALID_INCREMENT must also be set
# LEAD_SEQ is the list of forecast leads to process
# https://metplus.readthedocs.io/en/latest/Users_Guide/systemconfiguration.html#timing-control
###

LOOP_BY = VALID
VALID_TIME_FMT = %Y%m%d%H
VALID_BEG = 2019082912
VALID_END = 2019082912
VALID_INCREMENT = 21600

LEAD_SEQ = 0,6,12,18

USER_SCRIPT_RUNTIME_FREQ = RUN_ONCE_PER_INIT_OR_VALID


###
# File I/O
# https://metplus.readthedocs.io/en/latest/Users_Guide/systemconfiguration.html#directory-and-filename-template-info
###

# UserScript

USER_SCRIPT_INPUT_DIR = {INPUT_BASE}/model_applications/tc_and_extra_tc/dropsonde/obs
USER_SCRIPT_OUTPUT_DIR = {OUTPUT_BASE}/model_applications/tc_and_extra_tc/dropsonde/obs


# ASCII2NC

ASCII2NC_INPUT_TEMPLATE = "{PARM_BASE}/use_cases/model_applications/tc_and_extra_tc/UserScript_ASCII2NC_PointStat_fcstHAFS_obsFRD_NetCDF/hrd_frd_sonde_for_ascii2nc.py {USER_SCRIPT_OUTPUT_DIR}/{valid?fmt=%Y%m%d}"

ASCII2NC_OUTPUT_DIR = {OUTPUT_BASE}/model_applications/tc_and_extra_tc/dropsonde/ascii2nc
ASCII2NC_OUTPUT_TEMPLATE = drop{valid?fmt=%Y%m%d}.nc

# PointStat

FCST_POINT_STAT_INPUT_DIR = {INPUT_BASE}/model_applications/tc_and_extra_tc/dropsonde
FCST_POINT_STAT_INPUT_TEMPLATE = hafs.{valid?fmt=%Y%m%d%H}/dorian05l.{init?fmt=%Y%m%d%H}.hafsprs.synoptic.TMP600-900.0p03.f{lead?fmt=%3H}.grb2

OBS_POINT_STAT_INPUT_DIR = {OUTPUT_BASE}/model_applications/tc_and_extra_tc/dropsonde/ascii2nc
OBS_POINT_STAT_INPUT_TEMPLATE = {ASCII2NC_OUTPUT_TEMPLATE}

POINT_STAT_OUTPUT_DIR = {OUTPUT_BASE}/{OBTYPE}


###
# Field Info
# https://metplus.readthedocs.io/en/latest/Users_Guide/systemconfiguration.html#field-info
###

MODEL = HAFS
OBTYPE = drop

BOTH_VAR1_NAME = TMP
BOTH_VAR1_LEVELS = P925-950, P850-800, P700-650

OBS_WINDOW_BEGIN = -5400
OBS_WINDOW_END = 5400


###
# UserScript Settings
# https://metplus.readthedocs.io/en/latest/Users_Guide/wrappers.html#userscript
###

USER_SCRIPT_ARGUMENTS = {USER_SCRIPT_INPUT_DIR} {valid?fmt=%Y%m%d} {USER_SCRIPT_OUTPUT_DIR}
USER_SCRIPT_COMMAND = {PARM_BASE}/use_cases/model_applications/tc_and_extra_tc/UserScript_ASCII2NC_PointStat_fcstHAFS_obsFRD_NetCDF/hrd_frd_sonde_find_tar.py {USER_SCRIPT_ARGUMENTS}


###
# ASCII2NC Settings
# https://metplus.readthedocs.io/en/latest/Users_Guide/wrappers.html#ascii2nc
###

ASCII2NC_INPUT_FORMAT = python
ASCII2NC_TIME_SUMMARY_FLAG = False
ASCII2NC_TIME_SUMMARY_RAW_DATA = False
ASCII2NC_TIME_SUMMARY_BEG = 000000
ASCII2NC_TIME_SUMMARY_END = 235959
ASCII2NC_TIME_SUMMARY_STEP = 300
ASCII2NC_TIME_SUMMARY_WIDTH = 600
ASCII2NC_TIME_SUMMARY_GRIB_CODES = 11, 204, 211
ASCII2NC_TIME_SUMMARY_VAR_NAMES =
ASCII2NC_TIME_SUMMARY_TYPES = min, max, range, mean, stdev, median, p80
ASCII2NC_TIME_SUMMARY_VALID_FREQ = 0
ASCII2NC_TIME_SUMMARY_VALID_THRESH = 0.0


###
# PointStat Settings
# https://metplus.readthedocs.io/en/latest/Users_Guide/wrappers.html#pointstat
###

POINT_STAT_MESSAGE_TYPE = ADPUPA

POINT_STAT_GRID = FULL

POINT_STAT_CLIMO_MEAN_TIME_INTERP_METHOD = NEAREST

POINT_STAT_INTERP_TYPE_METHOD = BILIN
POINT_STAT_INTERP_TYPE_WIDTH = 2

POINT_STAT_OUTPUT_FLAG_SL1L2 = STAT
POINT_STAT_OUTPUT_FLAG_VL1L2 = STAT
POINT_STAT_OUTPUT_FLAG_FHO = BOTH
POINT_STAT_OUTPUT_FLAG_CTC = BOTH
POINT_STAT_OUTPUT_FLAG_CTS = STAT
POINT_STAT_OUTPUT_FLAG_CNT = BOTH
POINT_STAT_OUTPUT_FLAG_ECLV = BOTH
POINT_STAT_OUTPUT_FLAG_MPR = BOTH

POINT_STAT_REGRID_TO_GRID = NONE

Notes for USER_SCRIPT* METplus conf items for this use case:

  • ${USER_SCRIPT_RUNTIME_FREQ} - Corresponds to USER_SCRIPT_RUNTIME_FREQ in the METplus configuration file.

  • ${USER_SCRIPT_INPUT_DIR} - Corresponds to USER_SCRIPT_INPUT_DIR in the METplus configuration file.

  • ${USER_SCRIPT_OUTPUT_DIR} - Corresponds to USER_SCRIPT_OUTPUT_DIR in the METplus configuration file.

  • ${USER_SCRIPT_COMMAND} - The full command that UserScript runs: the path to hrd_frd_sonde_find_tar.py followed by its arguments. Corresponds to USER_SCRIPT_COMMAND in the METplus configuration file; the expanded command for this use case is shown after this list.

  • ${USER_SCRIPT_ARGUMENTS} - The arguments passed to hrd_frd_sonde_find_tar.py: USER_SCRIPT_INPUT_DIR, the valid date (%Y%m%d), and USER_SCRIPT_OUTPUT_DIR.
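
For the single valid time in this use case (2019-08-29), the command expands to the following (with PARM_BASE, INPUT_BASE, and OUTPUT_BASE left symbolic; shown here for illustration only):

  {PARM_BASE}/use_cases/model_applications/tc_and_extra_tc/UserScript_ASCII2NC_PointStat_fcstHAFS_obsFRD_NetCDF/hrd_frd_sonde_find_tar.py {INPUT_BASE}/model_applications/tc_and_extra_tc/dropsonde/obs 20190829 {OUTPUT_BASE}/model_applications/tc_and_extra_tc/dropsonde/obs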

MET Configuration

METplus sets environment variables based on user settings in the METplus configuration file. See How METplus controls MET config file settings for more details.

YOU SHOULD NOT SET ANY OF THESE ENVIRONMENT VARIABLES YOURSELF! THEY WILL BE OVERWRITTEN BY METPLUS WHEN IT CALLS THE MET TOOLS!

If there is a setting in the MET configuration file that is currently not supported by METplus you’d like to control, please refer to: Overriding Unsupported MET config file settings

Ascii2NcConfig_wrapped

Note

See the ASCII2NC MET Configuration section of the User’s Guide for more information on the environment variables used in the file below:

////////////////////////////////////////////////////////////////////////////////
//
// Default ascii2nc configuration file
//
////////////////////////////////////////////////////////////////////////////////

//
// The parameters listed below are used to summarize the ASCII data read in
//

//
// Time periods for the summarization
// obs_var (string array) is added and works like grib_code (int array)
// when the obs name is given instead of grib_code
//
${METPLUS_TIME_SUMMARY_DICT}

//
// Mapping of input little_r report types to output message types
//
message_type_map = [
   { key = "FM-12 SYNOP";  val = "ADPSFC"; },
   { key = "FM-13 SHIP";   val = "SFCSHP"; },
   { key = "FM-15 METAR";  val = "ADPSFC"; },
   { key = "FM-18 BUOY";   val = "SFCSHP"; },
   { key = "FM-281 QSCAT"; val = "ASCATW"; },
   { key = "FM-32 PILOT";  val = "ADPUPA"; },
   { key = "FM-35 TEMP";   val = "ADPUPA"; },
   { key = "FM-88 SATOB";  val = "SATWND"; },
   { key = "FM-97 ACARS";  val = "AIRCFT"; }
];

//
// Indicate a version number for the contents of this configuration file.
// The value should generally not be modified.
//
//version = "V10.0";

tmp_dir = "${MET_TMP_DIR}";

${METPLUS_MET_CONFIG_OVERRIDES}

PointStatConfig_wrapped

Note

See the PointStat MET Configuration section of the User’s Guide for more information on the environment variables used in the file below:

////////////////////////////////////////////////////////////////////////////////
//
// Point-Stat configuration file.
//
// For additional information, see the MET_BASE/config/README file.
//
////////////////////////////////////////////////////////////////////////////////

//
// Output model name to be written
//
// model =
${METPLUS_MODEL}

//
// Output description to be written
// May be set separately in each "obs.field" entry
//
// desc =
${METPLUS_DESC}

////////////////////////////////////////////////////////////////////////////////

//
// Verification grid
//
// regrid = {
${METPLUS_REGRID_DICT}

////////////////////////////////////////////////////////////////////////////////

//
// May be set separately in each "field" entry
//
censor_thresh = [];
censor_val    = [];
cat_thresh    = [ NA ];
cnt_thresh    = [ NA ];
cnt_logic     = UNION;
wind_thresh   = [ NA ];
wind_logic    = UNION;
eclv_points   = 0.05;
//hss_ec_value =
${METPLUS_HSS_EC_VALUE}
rank_corr_flag = FALSE;

//
// Forecast and observation fields to be verified
//
fcst = {
  ${METPLUS_FCST_FILE_TYPE}
  ${METPLUS_FCST_FIELD}
}

obs = {
  ${METPLUS_OBS_FILE_TYPE}
  ${METPLUS_OBS_FIELD}
}
////////////////////////////////////////////////////////////////////////////////

//
// Point observation filtering options
// May be set separately in each "obs.field" entry
//
// message_type =
${METPLUS_MESSAGE_TYPE}
sid_exc        = [];

//obs_quality_inc =
${METPLUS_OBS_QUALITY_INC}

//obs_quality_exc =
${METPLUS_OBS_QUALITY_EXC}

duplicate_flag = NONE;
obs_summary    = NONE;
obs_perc_value = 50;

//
// Mapping of message type group name to comma-separated list of values.
//
//message_type_group_map =
${METPLUS_MESSAGE_TYPE_GROUP_MAP}

////////////////////////////////////////////////////////////////////////////////

//
// Climatology data
//
//climo_mean = {
${METPLUS_CLIMO_MEAN_DICT}


//climo_stdev = {
${METPLUS_CLIMO_STDEV_DICT}

//
// May be set separately in each "obs.field" entry
//
//climo_cdf = {
${METPLUS_CLIMO_CDF_DICT}

////////////////////////////////////////////////////////////////////////////////

//
// Point observation time window
//
// obs_window = {
${METPLUS_OBS_WINDOW_DICT}

////////////////////////////////////////////////////////////////////////////////

//
// Verification masking regions
//
//mask = {
${METPLUS_MASK_DICT}

////////////////////////////////////////////////////////////////////////////////

//
// Confidence interval settings
//
ci_alpha  = [ 0.05 ];

boot = {
   interval = PCTILE;
   rep_prop = 1.0;
   n_rep    = 0;
   rng      = "mt19937";
   seed     = "";
}

////////////////////////////////////////////////////////////////////////////////

//
// Interpolation methods
//
//interp = {
${METPLUS_INTERP_DICT}

////////////////////////////////////////////////////////////////////////////////

//
// HiRA verification method
//
//hira = {
${METPLUS_HIRA_DICT}

////////////////////////////////////////////////////////////////////////////////

//
// Statistical output types
//
//output_flag = {
${METPLUS_OUTPUT_FLAG_DICT}

////////////////////////////////////////////////////////////////////////////////
// Threshold for SEEPS p1 (Probability of being dry)

//seeps_p1_thresh =
${METPLUS_SEEPS_P1_THRESH}

////////////////////////////////////////////////////////////////////////////////

tmp_dir = "${MET_TMP_DIR}";

// output_prefix =
${METPLUS_OUTPUT_PREFIX}
//version        = "V10.0.0";

////////////////////////////////////////////////////////////////////////////////

${METPLUS_MET_CONFIG_OVERRIDES}

Python Embedding

This use case uses two Python embedding scripts: one to untar and stage the dropsonde data (hrd_frd_sonde_find_tar.py) and the other to reformat it for ASCII2NC (hrd_frd_sonde_for_ascii2nc.py).
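
The reformatting script hands its observations to ASCII2NC through a variable named point_data: a list of 11-element records in the MET point observation format. The record below is an illustrative example only (the values are made up); the column order matches what the script builds:

# Columns: message_type, station_id, valid_time (YYYYMMDD_HHMMSS), lat, lon,
#          elevation, variable_name, level (pressure, mb), height (m),
#          qc_string, observation_value
point_data = [
    ["ADPUPA", "123456789", "20190829_120145", "27.5", "-76.2", "-9999",
     "TMP", "850.0", "1457", "-9999", "288.65"],
]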

parm/use_cases/model_applications/tc_and_extra_tc/UserScript_ASCII2NC_PointStat_fcstHAFS_obsFRD_NetCDF/hrd_frd_sonde_find_tar.py

#! /usr/bin/env python3
#####################################################################
#  This script will untar the FRD formatted dropsonde tar files from
#  https://www.aoml.noaa.gov/hrd/data_sub/dropsonde.html
#  The extracted files are placed in a directory
#  under USER_SCRIPT_OUTPUT_DIR. Arguments to the script include the
#  directory where the tar files exist, the user-specified
#  date in YYYYMMDD, and the output directory.
#  Author: biswas@ucar.edu
#####################################################################

import sys
import os
import glob
import tarfile

if len(sys.argv) == 4:
  path = sys.argv[1]
  date = sys.argv[2]
  outdir = sys.argv[3]

  if os.path.exists(path):
     print("Directory exists: "+ path)

     for name in glob.glob(path+'/'+str(date)+'*FRD.tar.gz'):
       print (name)

       drop_tar = tarfile.open(name)
       drop_tar.extractall(outdir + '/'+str(date))
       drop_files = os.listdir(outdir + '/'+str(date))
       print(drop_files)
       drop_tar.close()

  else: 
     print("Directory not present" + path)

else:
  print("ERROR : Must specify exactly one input data directory, date (YYYYMMDD), and output directory.")
  sys.exit(1)
  
####################################################################
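
For testing outside of METplus, the untar script can also be run directly; the paths below are placeholders:

python3 hrd_frd_sonde_find_tar.py /path/to/dropsonde/obs 20190829 /path/to/output/dir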

parm/use_cases/model_applications/tc_and_extra_tc/UserScript_ASCII2NC_PointStat_fcstHAFS_obsFRD_NetCDF/hrd_frd_sonde_for_ascii2nc.py

########################################################################
#
# Description:
#   Prepare HRD FRD (full-resolution data) dropsonde files for further
#   processing by the ascii2nc tool in MET.
#   Source: https://www.aoml.noaa.gov/hrd/data_sub/dropsonde.html
#
# Date:
#   December 2020
#
########################################################################

import re
import os
import sys
import numpy as np
import itertools
import datetime as dt
from datetime import datetime, timedelta
import pandas as pd

# Check arguments
if len(sys.argv) == 2:
  input_dir = os.path.expandvars(sys.argv[1])
  print("Input Dir:\t" + repr(input_dir))
else:
  print("ERROR:", sys.argv[0],
        "-> Must specify exactly one input file.")
  sys.exit(1)

# Empty object
my_data = pd.DataFrame()

for filename in sorted(os.listdir(input_dir)):
   input_file = os.path.join(input_dir, filename)

   # Open file
   with open(input_file, 'r') as file_handle:
      lines = file_handle.read().splitlines()
   readdata = False
   for idx, line in enumerate(lines):

    # Extract date, time and sonde info
       match_date = re.match(r'^ Date:(.*)', line)
       match_time = re.match(r'^ Time:(.*)', line)
       match_sonde = re.match(r'^ SID:(.*)', line)

       if match_date:
         date_items = match_date.group(1).split()[:1]
         lat = match_date.group(1).split()[:4]
       if match_time:
         time_items = match_time.group(1).split()[:1]
         lon = match_time.group(1).split()[:4]
       if match_sonde:
         sonde = match_sonde.group(1).split()[0]

         # Format the date and time
         date_formatted = \
           f"{date_items[0][:2]}{date_items[0][2:4]}{date_items[0][4:6]}_" +\
           f"{time_items[0][:2]}:{time_items[0][2:4]}:{time_items[0][4:6]}"
         valid_time = \
           dt.datetime.strptime(date_formatted, "%y%m%d_%H:%M:%S")
         print(f"Valid Time:\t{valid_time}")
       if line.startswith("IX"):
           readdata = True
           continue
       if not readdata:
           continue
       line    = line.strip()
       columns = line.split()
       dsec    = str(columns[1])     # time elasp (s)
       pres    = float(columns[2])   # pressure (mb)
       temp    = float(columns[3])   # temperature (C)
       temp    = temp + 273.15        # convert deg C to K
       relh    = float(columns[4])   # relative humidity (%)
       geop    = int(columns[5])     # geopotential mass height (m)
       wind_dir = int(columns[6])    # wind direction (E)
       wind_spd = float(columns[7])  # wind speed (m/s)
       wind_z   = float(columns[8])  # zonal wind (m/s)
       wind_m   = float(columns[9])  # meridional wind (m/s)
       wind_w   = float(columns[11]) # vertical velocity (m/s)
       zw       = int(columns[12])   # geopotential wind height (m)
       lat      = float(columns[17]) # lat (N)
       lon      = float(columns[18]) # lon (E)
       vld      = valid_time + dt.timedelta(seconds=float(dsec))

       # Skip line if dsec, lat, or lon are missing.
       # Or if pres and geop are missing.
       if dsec == -999.0 or lat == -999.0 or lon == -999.0 or \
           (pres == -999.0 and geop == -999):
          continue

       # Store valid time in YYYYMMDD_HHMMSS format
       t_vld = vld.strftime('%Y%m%d_%H%M%S')

       # Flag values for the station elevation and qc
       elv = "-9999"
       qc  = "-9999"

       # Append observations for this line
       # Name variable using GRIB conventions:
       #   https://www.nco.ncep.noaa.gov/pmb/docs/on388/table2.html
       if temp != -999.0:
          my_data = my_data.append(pd.DataFrame(np.array(
            [["ADPUPA", str(sonde), t_vld, lat, lon, elv, \
              "TMP", pres, geop, qc, temp]])))

       if relh != -999.0:
          my_data = my_data.append(pd.DataFrame(np.array(
            [["ADPUPA", str(sonde), t_vld, lat, lon, elv, \
              "RH", pres, geop, qc, relh]])))

       if geop != -999.0 and pres != -999.0:
          my_data = my_data.append(pd.DataFrame(np.array(
            [["ADPUPA", str(sonde), t_vld, lat, lon, elv, \
              "HGT", pres, geop, qc, geop]])))

       if wind_dir != -999.0:
          my_data = my_data.append(pd.DataFrame(np.array(
            [["ADPUPA", str(sonde), t_vld, lat, lon, elv, \
              "WDIR", pres, zw, qc, wind_dir]])))

       if wind_spd != -999.0:
          my_data = my_data.append(pd.DataFrame(np.array(
            [["ADPUPA", str(sonde), t_vld, lat, lon, elv, \
              "WIND", pres, zw, qc, wind_spd]])))

       if wind_z != -999.0:
          my_data = my_data.append(pd.DataFrame(np.array(
            [["ADPUPA", str(sonde), t_vld, lat, lon, elv, \
              "UGRD", pres, zw, qc, wind_z]])))

       if wind_m != -999.0:
          my_data = my_data.append(pd.DataFrame(np.array(
            [["ADPUPA", str(sonde), t_vld, lat, lon, elv, \
              "VGRD", pres, zw, qc, wind_m]])))

       if wind_w != -999.0:
          my_data = my_data.append(pd.DataFrame(np.array(
            [["ADPUPA", str(sonde), t_vld, lat, lon, elv, \
              "DZDT", pres, zw, qc, wind_w]])))

# Prepare point_data object for ascii2nc
point_data = my_data.values.tolist()
print("Data Length:\t" + repr(len(point_data)))
print("Data Type:\t" + repr(type(point_data)))

Running METplus

This use case can be run two ways:

  1. Passing in UserScript_ASCII2NC_PointStat_fcstHAFS_obsFRD_NetCDF.conf then a user-specific system configuration file:

    run_metplus.py -c /path/to/METplus/parm/use_cases/model_applications/tc_and_extra_tc/UserScript_ASCII2NC_PointStat_fcstHAFS_obsFRD_NetCDF.conf -c /path/to/user_system.conf
    
  2. Modifying the configurations in parm/metplus_config, then passing in UserScript_ASCII2NC_PointStat_fcstHAFS_obsFRD_NetCDF.conf:

    run_metplus.py -c /path/to/METplus/parm/use_cases/model_applications/tc_and_extra_tc/UserScript_ASCII2NC_PointStat_fcstHAFS_obsFRD_NetCDF.conf

The former method is recommended. Whether you add them to a user-specific configuration file or modify the metplus_config files, the following variables must be set correctly:

  • INPUT_BASE - Path to directory where sample data tarballs are unpacked (See Datasets section to obtain tarballs). This is not required to run METplus, but it is required to run the examples in parm/use_cases

  • OUTPUT_BASE - Path where METplus output will be written. This must be in a location where you have write permissions

  • MET_INSTALL_DIR - Path to location where MET is installed locally

Example User Configuration File:

[dir]
INPUT_BASE = /path/to/sample/input/data
OUTPUT_BASE = /path/to/output/dir
MET_INSTALL_DIR = /path/to/met-X.Y

NOTE: All of these items must be found under the [dir] section.

Expected Output

A successful run will output the following both to the screen and to the logfile:

INFO: METplus has successfully finished running.

Refer to the value set for OUTPUT_BASE to find where the output data was generated. Output for this use case will be found in drop (relative to OUTPUT_BASE, matching the OBTYPE setting) and will contain the following files; a short example of reading them follows the list:

  • point_stat_180000L_20190829_120000V.stat

  • point_stat_180000L_20190829_120000V_fho.txt

  • point_stat_180000L_20190829_120000V_eclv.txt

  • point_stat_180000L_20190829_120000V_ctc.txt

  • point_stat_180000L_20190829_120000V_cnt.txt

  • point_stat_180000L_20190829_120000V_mpr.txt
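
These .txt outputs are whitespace-delimited with a single header row, so they can be loaded for quick inspection, for example with pandas (a sketch only; the file name assumes the 18-hour lead listed above):

import pandas as pd

# Read the continuous statistics (CNT) line type written by Point-Stat
cnt = pd.read_csv("point_stat_180000L_20190829_120000V_cnt.txt", sep=r"\s+")

# Show bias and RMSE for each verified variable/level
print(cnt[["FCST_VAR", "FCST_LEV", "ME", "RMSE"]])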

Keywords

Note

  • TCandExtraTCAppUseCase

  • UserScriptUseCase

  • PointStatToolUseCase

  • ASCII2NCToolUseCase

  • TropicalCycloneUseCase

Navigate to the METplus Quick Search for Use Cases page to discover other similar use cases.
