5.2.11.4. ASCII2NC, Point-Stat: Verify HAFS Forecasts against HRD Dropsonde Observations

model_applications/tc_and_extra_tc/UserScript_ASCII2NC_PointStat_fcstHAFS_obsFRD_NetCDF.conf

Scientific Objective

To provide useful statistical information on the relationship between point-format observation data and a gridded forecast. These values can be used to assess the skill of the prediction. Statistics are stored as partial sums to save space, and Stat-Analysis must be used to compute continuous statistics.
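The partial sums mentioned above (the SL1L2 line type stores FBAR, OBAR, FFBAR, OOBAR, and FOBAR) are enough to recover continuous statistics later, which is the aggregation Stat-Analysis performs. A standalone sketch of that arithmetic with made-up sums (none of these numbers come from the use case):

```python
import math

def me_rmse(fbar, obar, ffbar, oobar, fobar):
    """Recover mean error and RMSE from SL1L2 partial sums."""
    me = fbar - obar
    rmse = math.sqrt(ffbar - 2.0 * fobar + oobar)
    return me, rmse

# A constant +0.4 K forecast error: both statistics come out to ~0.4
me, rmse = me_rmse(fbar=300.2, obar=299.8,
                   ffbar=90120.04, oobar=89880.04, fobar=89999.96)
print(me, rmse)
```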

Datasets

Forecast: HAFS temperature
Observation: HRD Dropsonde data
Location of Model forecast and Dropsonde files: All of the input data required for this use case can be found in the sample data tarball. Click here to download.
This tarball should be unpacked into the directory that you will set as the value of INPUT_BASE. See the Running METplus section for more information.

METplus Components

This use case utilizes the METplus ASCII2NC wrapper to convert full-resolution data (FRD) dropsonde point observations to NetCDF format and then compare them to gridded forecast data using PointStat.

METplus Workflow

The use case runs the UserScript wrapper (to untar the dropsonde file and extract its contents to a directory), ASCII2NC (to convert the ASCII files to NetCDF format), and PointStat (to compute statistics against HAFS model output). It processes the following run times:

Valid: 2019-08-29 12Z

METplus Configuration

METplus first loads all of the configuration files found in parm/metplus_config, then it loads any configuration files passed to METplus via the command line with the -c option, i.e. -c parm/use_cases/model_applications/tc_and_extra_tc/UserScript_ASCII2NC_PointStat_fcstHAFS_obsFRD_NetCDF.conf

[config]

## Configuration-related settings such as the process list, begin and end times, etc.
PROCESS_LIST = UserScript(untar_drop_file), Ascii2nc, PointStat

USER_SCRIPT_RUNTIME_FREQ = RUN_ONCE_PER_INIT_OR_VALID
USER_SCRIPT_INPUT_DIR = {INPUT_BASE}/model_applications/tc_and_extra_tc/dropsonde/obs
USER_SCRIPT_OUTPUT_DIR = {OUTPUT_BASE}/model_applications/tc_and_extra_tc/dropsonde/obs
USER_SCRIPT_COMMAND = {PARM_BASE}/use_cases/model_applications/tc_and_extra_tc/UserScript_ASCII2NC_PointStat_fcstHAFS_obsFRD_NetCDF/hrd_frd_sonde_find_tar.py {USER_SCRIPT_INPUT_TEMPLATE} 

ASCII2NC_INPUT_FORMAT = python
ASCII2NC_TIME_SUMMARY_FLAG = False
ASCII2NC_TIME_SUMMARY_RAW_DATA = False
ASCII2NC_TIME_SUMMARY_BEG = 000000
ASCII2NC_TIME_SUMMARY_END = 235959
ASCII2NC_TIME_SUMMARY_STEP = 300
ASCII2NC_TIME_SUMMARY_WIDTH = 600
ASCII2NC_TIME_SUMMARY_GRIB_CODES = 11, 204, 211
ASCII2NC_TIME_SUMMARY_VAR_NAMES =
ASCII2NC_TIME_SUMMARY_TYPES = min, max, range, mean, stdev, median, p80
ASCII2NC_TIME_SUMMARY_VALID_FREQ = 0
ASCII2NC_TIME_SUMMARY_VALID_THRESH = 0.0

## LOOP_ORDER
## Options are: processes, times
## Looping by time- runs all items in the PROCESS_LIST for each
## initialization time and repeats until all times have been evaluated.
## Looping by processes- run each item in the PROCESS_LIST for all
## specified initialization times then repeat for the next item in the
## PROCESS_LIST.
LOOP_ORDER = processes

LOOP_BY = VALID
VALID_TIME_FMT = %Y%m%d%H
VALID_BEG = 2019082912
VALID_END = 2019082912
VALID_INCREMENT = 21600

LEAD_SEQ = 0,6,12,18
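With one valid time and four forecast leads, each forecast file comes from a different initialization (init = valid - lead). A standalone sketch of that arithmetic, not METplus code:

```python
from datetime import datetime, timedelta

# The single valid time from VALID_BEG/VALID_END above
valid = datetime.strptime("2019082912", "%Y%m%d%H")

# One init time per lead in LEAD_SEQ
inits = {lead: (valid - timedelta(hours=lead)).strftime("%Y%m%d%H")
         for lead in (0, 6, 12, 18)}
for lead, init in inits.items():
    print(f"lead f{lead:03d} -> init {init}")
```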


# Logging levels: DEBUG, INFO, WARN, ERROR (most verbose is DEBUG)
LOG_LEVEL = DEBUG

## MET Configuration files for point_stat

# Message types, if all message types are to be returned, leave this empty,
# otherwise indicate the message types of interest.
POINT_STAT_MESSAGE_TYPE = ADPUPA
POINT_STAT_STATION_ID =

# Verification Masking regions
# Indicate which grid and polygon masking region, if applicable
POINT_STAT_GRID = FULL

# List of full path to poly masking files.  NOTE: Only short lists of poly
# files work (those that fit on one line), a long list will result in an
# environment variable that is too long, resulting in an error.  For long
# lists of poly masking files (i.e. all the mask files in the NCEP_mask
# directory), define these in the MET point_stat configuration file.
POINT_STAT_POLY =

# For both pb2nc and point_stat, the obs_window dictionary:
OBS_WINDOW_BEGIN = -5400
OBS_WINDOW_END = 5400
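The obs_window settings keep only observations that fall within 5400 seconds (90 minutes) of each valid time. A minimal sketch of that filter (an illustration, not the MET implementation):

```python
from datetime import datetime, timedelta

def in_obs_window(obs_time, valid_time, beg=-5400, end=5400):
    """True when obs_time lies in [valid + beg, valid + end] seconds."""
    return (valid_time + timedelta(seconds=beg)
            <= obs_time
            <= valid_time + timedelta(seconds=end))

valid = datetime(2019, 8, 29, 12)
print(in_obs_window(datetime(2019, 8, 29, 11, 0), valid))  # 60 min early: kept
print(in_obs_window(datetime(2019, 8, 29, 14, 0), valid))  # 120 min late: dropped
```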

# Model/fcst and obs name, e.g. GFS, NAM, GDAS, etc.
MODEL = HAFS
OBTYPE = drop

# Variables and levels as specified in the field dictionary of the MET
# point_stat configuration file. Specify as FCST_VARn_NAME, FCST_VARn_LEVELS,
# (optional) FCST_VARn_OPTION

BOTH_VAR1_NAME = TMP
BOTH_VAR1_LEVELS = P925-950, P850-800, P700-650
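BOTH_VAR1_LEVELS applies the same name and levels to both the forecast and observation fields; each P entry selects a pressure layer (in hPa) for vertical matching. A hypothetical helper (not a MET function) showing how such a layer string can be interpreted:

```python
def in_pressure_layer(pres_hpa, layer):
    """Return True when pres_hpa falls inside a 'Plo-hi' layer string."""
    lo, hi = sorted(float(v) for v in layer.lstrip("P").split("-"))
    return lo <= pres_hpa <= hi

print(in_pressure_layer(930.0, "P925-950"))  # True: inside the layer
print(in_pressure_layer(700.0, "P850-800"))  # False: outside the layer
```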

POINT_STAT_CONFIG_FILE ={PARM_BASE}/met_config/PointStatConfig_wrapped

POINT_STAT_CLIMO_MEAN_TIME_INTERP_METHOD = NEAREST

POINT_STAT_INTERP_TYPE_METHOD = BILIN
POINT_STAT_INTERP_TYPE_WIDTH = 2
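POINT_STAT_INTERP_TYPE_METHOD = BILIN with a width of 2 asks Point-Stat to interpolate the forecast bilinearly from the 2x2 grid points surrounding each dropsonde location. A standalone sketch of the arithmetic (the function and values are illustrative, not MET code):

```python
def bilinear(f00, f10, f01, f11, dx, dy):
    """Bilinear interpolation inside one grid cell.

    f00..f11 are the four surrounding grid values; dx, dy are the
    fractional offsets (0-1) of the obs point within the cell.
    """
    return (f00 * (1 - dx) * (1 - dy) + f10 * dx * (1 - dy)
            + f01 * (1 - dx) * dy + f11 * dx * dy)

# An obs point at the exact center of the cell averages the four corners
print(bilinear(290.0, 292.0, 291.0, 293.0, 0.5, 0.5))  # 291.5
```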

POINT_STAT_OUTPUT_FLAG_SL1L2 = STAT
POINT_STAT_OUTPUT_FLAG_VL1L2 = STAT
POINT_STAT_OUTPUT_FLAG_FHO = BOTH
POINT_STAT_OUTPUT_FLAG_CTC = BOTH
POINT_STAT_OUTPUT_FLAG_CTS = STAT
POINT_STAT_OUTPUT_FLAG_CNT = BOTH
POINT_STAT_OUTPUT_FLAG_ECLV = BOTH
POINT_STAT_OUTPUT_FLAG_MPR = BOTH

# Regrid to specified grid.  Indicate NONE if no regridding, or the grid id
# (e.g. G212)
POINT_STAT_REGRID_TO_GRID = NONE

LOG_POINT_STAT_VERBOSITY=5

[dir]
TAR_INPUT_DIR = {INPUT_BASE}/model_applications/tc_and_extra_tc/dropsonde/obs
FCST_POINT_STAT_INPUT_DIR = {INPUT_BASE}/model_applications/tc_and_extra_tc/dropsonde
OBS_POINT_STAT_INPUT_DIR = {OUTPUT_BASE}/model_applications/tc_and_extra_tc/dropsonde/ascii2nc
ASCII2NC_OUTPUT_DIR = {OUTPUT_BASE}/model_applications/tc_and_extra_tc/dropsonde/ascii2nc
POINT_STAT_OUTPUT_DIR = {OUTPUT_BASE}/{OBTYPE}

[filename_templates]

USER_SCRIPT_INPUT_TEMPLATE = {USER_SCRIPT_INPUT_DIR} {valid?fmt=%Y%m%d} {USER_SCRIPT_OUTPUT_DIR}
ASCII2NC_OUTPUT_TEMPLATE = drop{valid?fmt=%Y%m%d}.nc
FCST_POINT_STAT_INPUT_TEMPLATE = hafs.{valid?fmt=%Y%m%d%H}/dorian05l.{init?fmt=%Y%m%d%H}.hafsprs.synoptic.TMP600-900.0p03.f{lead?fmt=%3H}.grb2
OBS_POINT_STAT_INPUT_TEMPLATE = {ASCII2NC_OUTPUT_TEMPLATE}
ASCII2NC_INPUT_TEMPLATE = "{PARM_BASE}/use_cases/model_applications/tc_and_extra_tc/UserScript_ASCII2NC_PointStat_fcstHAFS_obsFRD_NetCDF/hrd_frd_sonde_for_ascii2nc.py {USER_SCRIPT_OUTPUT_DIR}/{valid?fmt=%Y%m%d}"
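FCST_POINT_STAT_INPUT_TEMPLATE mixes valid, init, and lead tags. A rough illustration (not METplus code) of how the template expands for the 6-hour lead:

```python
from datetime import datetime, timedelta

valid = datetime(2019, 8, 29, 12)
lead_hours = 6
init = valid - timedelta(hours=lead_hours)  # 2019-08-29 06Z

# Substitute the template tags: {valid?fmt=%Y%m%d%H}, {init?fmt=%Y%m%d%H},
# and {lead?fmt=%3H} (zero-padded 3-digit hours)
name = (f"hafs.{valid:%Y%m%d%H}/dorian05l.{init:%Y%m%d%H}"
        f".hafsprs.synoptic.TMP600-900.0p03.f{lead_hours:03d}.grb2")
print(name)
```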

Notes for USER_SCRIPT* METplus conf items for this use case:

  • ${USER_SCRIPT_RUNTIME_FREQ} - Corresponds to USER_SCRIPT_RUNTIME_FREQ in the METplus configuration file.

  • ${USER_SCRIPT_INPUT_DIR} - Corresponds to USER_SCRIPT_INPUT_DIR in the METplus configuration file.

  • ${USER_SCRIPT_OUTPUT_DIR} - Corresponds to USER_SCRIPT_OUTPUT_DIR in the METplus configuration file.

  • ${USER_SCRIPT_COMMAND} - The script to run (hrd_frd_sonde_find_tar.py) followed by its arguments, which correspond to USER_SCRIPT_INPUT_TEMPLATE.

  • ${USER_SCRIPT_INPUT_TEMPLATE} - Input template to hrd_frd_sonde_find_tar.py: USER_SCRIPT_INPUT_DIR, valid date (%Y%m%d), and USER_SCRIPT_OUTPUT_DIR.

MET Configuration

METplus sets environment variables based on user settings in the METplus configuration file. See How METplus controls MET config file settings for more details.

YOU SHOULD NOT SET ANY OF THESE ENVIRONMENT VARIABLES YOURSELF! THEY WILL BE OVERWRITTEN BY METPLUS WHEN IT CALLS THE MET TOOLS!

If there is a setting in the MET configuration file that is currently not supported by METplus you’d like to control, please refer to: Overriding Unsupported MET config file settings

Ascii2NcConfig_wrapped

Note

See the ASCII2NC MET Configuration section of the User’s Guide for more information on the environment variables used in the file below:

////////////////////////////////////////////////////////////////////////////////
//
// Default ascii2nc configuration file
//
////////////////////////////////////////////////////////////////////////////////

//
// The parameters listed below are used to summarize the ASCII data read in
//

//
// Time periods for the summarization
// obs_var (string array) is added and works like grib_code (int array)
// when the obs name is given instead of grib_code
//
${METPLUS_TIME_SUMMARY_DICT}

//
// Mapping of input little_r report types to output message types
//
message_type_map = [
   { key = "FM-12 SYNOP";  val = "ADPSFC"; },
   { key = "FM-13 SHIP";   val = "SFCSHP"; },
   { key = "FM-15 METAR";  val = "ADPSFC"; },
   { key = "FM-18 BUOY";   val = "SFCSHP"; },
   { key = "FM-281 QSCAT"; val = "ASCATW"; },
   { key = "FM-32 PILOT";  val = "ADPUPA"; },
   { key = "FM-35 TEMP";   val = "ADPUPA"; },
   { key = "FM-88 SATOB";  val = "SATWND"; },
   { key = "FM-97 ACARS";  val = "AIRCFT"; }
];

//
// Indicate a version number for the contents of this configuration file.
// The value should generally not be modified.
//
//version = "V10.0";

${METPLUS_MET_CONFIG_OVERRIDES}

PointStatConfig_wrapped

Note

See the PointStat MET Configuration section of the User’s Guide for more information on the environment variables used in the file below:

////////////////////////////////////////////////////////////////////////////////
//
// Point-Stat configuration file.
//
// For additional information, see the MET_BASE/config/README file.
//
////////////////////////////////////////////////////////////////////////////////

//
// Output model name to be written
//
// model =
${METPLUS_MODEL}

//
// Output description to be written
// May be set separately in each "obs.field" entry
//
// desc =
${METPLUS_DESC}

////////////////////////////////////////////////////////////////////////////////

//
// Verification grid
//
// regrid = {
${METPLUS_REGRID_DICT}

////////////////////////////////////////////////////////////////////////////////

//
// May be set separately in each "field" entry
//
censor_thresh = [];
censor_val    = [];
cat_thresh    = [ NA ];
cnt_thresh    = [ NA ];
cnt_logic     = UNION;
wind_thresh   = [ NA ];
wind_logic    = UNION;
eclv_points   = 0.05;
rank_corr_flag = FALSE;

//
// Forecast and observation fields to be verified
//
fcst = {
  ${METPLUS_FCST_FIELD}
}

obs = {
  ${METPLUS_OBS_FIELD}
}
////////////////////////////////////////////////////////////////////////////////

//
// Point observation filtering options
// May be set separately in each "obs.field" entry
//
// message_type =
${METPLUS_MESSAGE_TYPE}
sid_exc        = [];
//obs_quality =
${METPLUS_OBS_QUALITY}
duplicate_flag = NONE;
obs_summary    = NONE;
obs_perc_value = 50;

//
// Mapping of message type group name to comma-separated list of values.
//
message_type_group_map = [
   { key = "SURFACE"; val = "ADPSFC,SFCSHP,MSONET";               },
   { key = "ANYAIR";  val = "AIRCAR,AIRCFT";                      },
   { key = "ANYSFC";  val = "ADPSFC,SFCSHP,ADPUPA,PROFLR,MSONET"; },
   { key = "ONLYSF";  val = "ADPSFC,SFCSHP";                      },
   { key = "LANDSF";  val = "ADPSFC,MSONET";                      },
   { key = "WATERSF"; val = "SFCSHP";                             }
];

////////////////////////////////////////////////////////////////////////////////

//
// Climatology data
//
//climo_mean = {
${METPLUS_CLIMO_MEAN_DICT}


//climo_stdev = {
${METPLUS_CLIMO_STDEV_DICT}

//
// May be set separately in each "obs.field" entry
//
//climo_cdf = {
${METPLUS_CLIMO_CDF_DICT}

////////////////////////////////////////////////////////////////////////////////

//
// Point observation time window
//
// obs_window = {
${METPLUS_OBS_WINDOW_DICT}

////////////////////////////////////////////////////////////////////////////////

//
// Verification masking regions
//
mask = {
   ${METPLUS_MASK_GRID}
   ${METPLUS_MASK_POLY}
   ${METPLUS_MASK_SID}
   llpnt = [];
}

////////////////////////////////////////////////////////////////////////////////

//
// Confidence interval settings
//
ci_alpha  = [ 0.05 ];

boot = {
   interval = PCTILE;
   rep_prop = 1.0;
   n_rep    = 0;
   rng      = "mt19937";
   seed     = "";
}

////////////////////////////////////////////////////////////////////////////////

//
// Interpolation methods
//
//interp = {
${METPLUS_INTERP_DICT}

////////////////////////////////////////////////////////////////////////////////

//
// HiRA verification method
//
hira = {
   flag       = FALSE;
   width      = [ 2, 3, 4, 5 ];
   vld_thresh = 1.0;
   cov_thresh = [ ==0.25 ];
   shape      = SQUARE;
}

////////////////////////////////////////////////////////////////////////////////

//
// Statistical output types
//
//output_flag = {
${METPLUS_OUTPUT_FLAG_DICT}

////////////////////////////////////////////////////////////////////////////////

tmp_dir        = "/tmp";
// output_prefix =
${METPLUS_OUTPUT_PREFIX}
//version        = "V10.0.0";

////////////////////////////////////////////////////////////////////////////////

${METPLUS_MET_CONFIG_OVERRIDES}

Python Embedding

This use case uses two Python embedding scripts: one to untar the data (hrd_frd_sonde_find_tar.py) and the other to prepare it for ascii2nc (hrd_frd_sonde_for_ascii2nc.py).

parm/use_cases/model_applications/tc_and_extra_tc/UserScript_ASCII2NC_PointStat_fcstHAFS_obsFRD_NetCDF/hrd_frd_sonde_find_tar.py

#! /usr/bin/env python3
#####################################################################
#  This script untars the FRD-formatted dropsonde tar files from
#  https://www.aoml.noaa.gov/hrd/data_sub/dropsonde.html
#  The untarred files are extracted into a directory
#  under USER_SCRIPT_OUTPUT_DIR. Arguments to the script are the
#  directory where the tar files exist, the user-specified
#  date in YYYYMMDD, and the output directory.
#  Author: biswas@ucar.edu
#####################################################################

import sys
import os
import glob
import tarfile

if len(sys.argv) == 4:
  path = sys.argv[1]
  date = sys.argv[2]
  outdir = sys.argv[3]

  if os.path.exists(path):
     print("Directory exists: "+ path)

     for name in glob.glob(path+'/'+str(date)+'*FRD.tar.gz'):
       print (name)

       drop_tar = tarfile.open(name)
       drop_tar.extractall(outdir + '/'+str(date))
       drop_files = os.listdir(outdir + '/'+str(date))
       print(drop_files)
       drop_tar.close()

  else: 
     print("Directory not present: " + path)

else:
  print("ERROR: Must specify an input data directory, a date (YYYYMMDD), and an output directory.")
  sys.exit(1)
  
####################################################################

parm/use_cases/model_applications/tc_and_extra_tc/UserScript_ASCII2NC_PointStat_fcstHAFS_obsFRD_NetCDF/hrd_frd_sonde_for_ascii2nc.py

########################################################################
#
# Description:
#   Prepare HRD FRD (full-resolution data) dropsonde files for further
#   processing by the ascii2nc tool in MET.
#   Source: https://www.aoml.noaa.gov/hrd/data_sub/dropsonde.html
#
# Date:
#   December 2020
#
########################################################################

import re
import os
import sys
import numpy as np
import itertools
import datetime as dt
from datetime import datetime, timedelta
import pandas as pd

# Check arguments
if len(sys.argv) == 2:
  input_dir = os.path.expandvars(sys.argv[1])
  print("Input Dir:\t" + repr(input_dir))
else:
  print("ERROR:", sys.argv[0],
        "-> Must specify exactly one input directory.")
  sys.exit(1)

# Empty object
my_data = pd.DataFrame()

for filename in sorted(os.listdir(input_dir)):
   input_file = os.path.join(input_dir, filename)

   # Open file
   with open(input_file, 'r') as file_handle:
      lines = file_handle.read().splitlines()
   readdata = False
   for idx, line in enumerate(lines):

       # Extract date, time and sonde info
       match_date = re.match(r'^ Date:(.*)', line)
       match_time = re.match(r'^ Time:(.*)', line)
       match_sonde = re.match(r'^ SID:(.*)', line)

       if match_date:
         date_items = match_date.group(1).split()[:1]
         lat = match_date.group(1).split()[:4]
       if match_time:
         time_items = match_time.group(1).split()[:1]
         lon = match_time.group(1).split()[:4]
       if match_sonde:
         sonde = match_sonde.group(1).split()[0]

         # Format the date and time
         date_formatted = \
           f"{date_items[0][:2]}{date_items[0][2:4]}{date_items[0][4:6]}_" +\
           f"{time_items[0][:2]}:{time_items[0][2:4]}:{time_items[0][4:6]}"
         valid_time = \
           dt.datetime.strptime(date_formatted, "%y%m%d_%H:%M:%S")
         print(f"Valid Time:\t{valid_time}")
       if line.startswith("IX"):
           readdata = True
           continue
       if not readdata:
           continue
       line    = line.strip()
       columns = line.split()
       dsec    = str(columns[1])     # time elapsed (s)
       pres    = float(columns[2])   # pressure (mb)
       temp    = float(columns[3])   # temperature (C)
       temp    = temp + 273.15        # convert deg C to K
       relh    = float(columns[4])   # relative humidity (%)
       geop    = int(columns[5])     # geopotential mass height (m)
       wind_dir = int(columns[6])    # wind direction (deg)
       wind_spd = float(columns[7])  # wind speed (m/s)
       wind_z   = float(columns[8])  # zonal wind (m/s)
       wind_m   = float(columns[9])  # meridional wind (m/s)
       wind_w   = float(columns[11]) # vertical velocity (m/s)
       zw       = int(columns[12])   # geopotential wind height (m)
       lat      = float(columns[17]) # lat (N)
       lon      = float(columns[18]) # lon (E)
       vld      = valid_time + dt.timedelta(seconds=float(dsec))

       # Skip line if dsec, lat, or lon are missing.
       # Or if pres and geop are missing.
       if float(dsec) == -999.0 or lat == -999.0 or lon == -999.0 or \
           (pres == -999.0 and geop == -999):
          continue

       # Store valid time in YYYYMMDD_HHMMSS format
       t_vld = vld.strftime('%Y%m%d_%H%M%S')

       # Flag values for the station elevation and qc
       elv = "-9999"
       qc  = "-9999"

       # Append observations for this line
       # Name variable using GRIB conventions:
       #   https://www.nco.ncep.noaa.gov/pmb/docs/on388/table2.html
       if temp != -999.0:
          my_data = my_data.append(pd.DataFrame(np.array(
            [["ADPUPA", str(sonde), t_vld, lat, lon, elv, \
              "TMP", pres, geop, qc, temp]])))

       if relh != -999.0:
          my_data = my_data.append(pd.DataFrame(np.array(
            [["ADPUPA", str(sonde), t_vld, lat, lon, elv, \
              "RH", pres, geop, qc, relh]])))

       if geop != -999.0 and pres != -999.0:
          my_data = my_data.append(pd.DataFrame(np.array(
            [["ADPUPA", str(sonde), t_vld, lat, lon, elv, \
              "HGT", pres, geop, qc, geop]])))

       if wind_dir != -999.0:
          my_data = my_data.append(pd.DataFrame(np.array(
            [["ADPUPA", str(sonde), t_vld, lat, lon, elv, \
              "WDIR", pres, zw, qc, wind_dir]])))

       if wind_spd != -999.0:
          my_data = my_data.append(pd.DataFrame(np.array(
            [["ADPUPA", str(sonde), t_vld, lat, lon, elv, \
              "WIND", pres, zw, qc, wind_spd]])))

       if wind_z != -999.0:
          my_data = my_data.append(pd.DataFrame(np.array(
            [["ADPUPA", str(sonde), t_vld, lat, lon, elv, \
              "UGRD", pres, zw, qc, wind_z]])))

       if wind_m != -999.0:
          my_data = my_data.append(pd.DataFrame(np.array(
            [["ADPUPA", str(sonde), t_vld, lat, lon, elv, \
              "VGRD", pres, zw, qc, wind_m]])))

       if wind_w != -999.0:
          my_data = my_data.append(pd.DataFrame(np.array(
            [["ADPUPA", str(sonde), t_vld, lat, lon, elv, \
              "DZDT", pres, zw, qc, wind_w]])))

# Prepare point_data object for ascii2nc
point_data = my_data.values.tolist()
print("Data Length:\t" + repr(len(point_data)))
print("Data Type:\t" + repr(type(point_data)))
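Each row that my_data accumulates follows the 11-column MET point observation format expected by ascii2nc's Python embedding: message type, station ID, valid time, latitude, longitude, elevation, variable name, pressure level, height, QC string, and observation value. A hypothetical row (all values made up):

```python
# One made-up observation row in the 11-column MET point format
row = ["ADPUPA",            # message type
       "19082912",          # station ID (sonde ID)
       "20190829_120130",   # valid time (YYYYMMDD_HHMMSS)
       26.5, -76.2,         # latitude, longitude (deg)
       "-9999",             # station elevation (missing)
       "TMP",               # variable name (GRIB convention)
       925.0,               # pressure level (hPa)
       750,                 # height (m)
       "-9999",             # QC flag (missing)
       299.4]               # observation value (K)
print(len(row))  # 11
```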

Running METplus

This use case can be run two ways:

  1. Passing in UserScript_ASCII2NC_PointStat_fcstHAFS_obsFRD_NetCDF.conf then a user-specific system configuration file:

    run_metplus.py -c /path/to/METplus/parm/use_cases/model_applications/tc_and_extra_tc/UserScript_ASCII2NC_PointStat_fcstHAFS_obsFRD_NetCDF.conf -c /path/to/user_system.conf
    
  2. Modifying the configurations in parm/metplus_config, then passing in UserScript_ASCII2NC_PointStat_fcstHAFS_obsFRD_NetCDF.conf:

    run_metplus.py -c /path/to/METplus/parm/use_cases/model_applications/tc_and_extra_tc/UserScript_ASCII2NC_PointStat_fcstHAFS_obsFRD_NetCDF.conf

The former method is recommended. Whether you add them to a user-specific configuration file or modify the metplus_config files, the following variables must be set correctly:

  • INPUT_BASE - Path to directory where sample data tarballs are unpacked (See Datasets section to obtain tarballs). This is not required to run METplus, but it is required to run the examples in parm/use_cases

  • OUTPUT_BASE - Path where METplus output will be written. This must be in a location where you have write permissions

  • MET_INSTALL_DIR - Path to location where MET is installed locally

Example User Configuration File:

[dir]
INPUT_BASE = /path/to/sample/input/data
OUTPUT_BASE = /path/to/output/dir
MET_INSTALL_DIR = /path/to/met-X.Y

NOTE: All of these items must be found under the [dir] section.

Expected Output

A successful run will output the following both to the screen and to the logfile:

INFO: METplus has successfully finished running.

Refer to the value set for OUTPUT_BASE to find where the output data was generated. Output for this use case will be found in drop (relative to OUTPUT_BASE) and will contain the following files:

  • point_stat_180000L_20190829_120000V.stat

  • point_stat_180000L_20190829_120000V_fho.txt

  • point_stat_180000L_20190829_120000V_eclv.txt

  • point_stat_180000L_20190829_120000V_ctc.txt

  • point_stat_180000L_20190829_120000V_cnt.txt

  • point_stat_180000L_20190829_120000V_mpr.txt
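The .stat and _*.txt outputs are whitespace-delimited ASCII with a single header row, so they are easy to inspect with pandas (an assumed convenience here, not something the use case requires):

```python
import pandas as pd

def load_stat(path):
    """Read a MET .stat/.txt file; the first row is the column header."""
    return pd.read_csv(path, sep=r"\s+")

# e.g. df = load_stat("point_stat_180000L_20190829_120000V_cnt.txt")
# df.columns then lists the MET header columns (VERSION, MODEL, ...)
```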
