PointStat: read in directory of ASCAT files over user-specified time

model_applications/marine_and_cryosphere/PointStat_fcstGFS_obsASCAT_satelliteWinds.conf

Scientific Objective

To evaluate wind characteristics (direction, speed, and u/v components) from ASCAT data over water bodies, Python embedding is utilized to pull user-selected variables and time frames from a runtime directory. Those point values are then compared to GFS data over two masked regions over ocean regions.

Datasets

Forecast: GFS forecast data for 10-m winds
Observations: ASCAT METOP-B data provided by OPC
Location: All of the input data required for this use case can be found in the met_test sample data tarball. Click here to the METplus releases page and download sample data for the appropriate release: https://github.com/dtcenter/METplus/releases
This tarball should be unpacked into the directory that you will set the value of INPUT_BASE. See Running METplus section for more information.

METplus Components

This use case calls Python Embedding during PointStat, which is the only tool used.

METplus Workflow

PointStat kicks off a Python script execution, which reads in the entire directory passed as an arguement. In the script, the directory’s files are included only if they are between the times that are also passed as an arguement. After these points are passed back to PointStat as the point observation dataset, they are compared to gridded forecast data in pre-created masking regions. MCTC and MCTS line types are output, using thresholds of relevant wind speeds. The use case processes the following run time:

Init: 2023-07-06 00Z 6hr lead

METplus Configuration

METplus first loads all of the configuration files found in parm/metplus_config, then it loads any configuration files passed to METplus via the command line parm/use_cases/model_applications/marine_and_cryosphere/PointStat_fcstGFS_obsASCAT_satelliteWinds.conf

[config]

# Documentation for this use case can be found at
# https://metplus.readthedocs.io/en/latest/generated/model_applications/marine_and_cryosphere/PointStat_fcstGFS_obsASCAT_satelliteWinds.html

# For additional information, please see the METplus Users Guide.
# https://metplus.readthedocs.io/en/latest/Users_Guide

###
# Processes to run
# https://metplus.readthedocs.io/en/latest/Users_Guide/systemconfiguration.html#process-list
###

PROCESS_LIST = PointStat


###
# Time Info
# LOOP_BY options are INIT, VALID, RETRO, and REALTIME
# If set to INIT or RETRO:
#   INIT_TIME_FMT, INIT_BEG, INIT_END, and INIT_INCREMENT must also be set
# If set to VALID or REALTIME:
#   VALID_TIME_FMT, VALID_BEG, VALID_END, and VALID_INCREMENT must also be set
# LEAD_SEQ is the list of forecast leads to process
# https://metplus.readthedocs.io/en/latest/Users_Guide/systemconfiguration.html#timing-control
###

LOOP_BY = INIT
INIT_TIME_FMT = %Y%m%d%H%M
INIT_BEG = 202307060000
INIT_END = 202307060000
INIT_INCREMENT = 1M

LEAD_SEQ = 6


###
# File I/O
# https://metplus.readthedocs.io/en/latest/Users_Guide/systemconfiguration.html#directory-and-filename-template-info
###

FCST_POINT_STAT_INPUT_DIR = {INPUT_BASE}/model_applications/marine_and_cryosphere/PointStat_fcstGFS_obsASCAT_satelliteWinds

FCST_POINT_STAT_INPUT_TEMPLATE = gfs.t00z.pgrb2.0p25.f006

CONFIG_DIR = {PARM_BASE}/use_cases/model_applications/marine_and_cryosphere/PointStat_fcstGFS_obsASCAT_satelliteWinds

OBS_POINT_STAT_INPUT_DIR = 
OBS_POINT_STAT_INPUT_TEMPLATE = PYTHON_NUMPY= {CONFIG_DIR}/read_ASCAT_data.py {INPUT_BASE}/model_applications/marine_and_cryosphere/PointStat_fcstGFS_obsASCAT_satelliteWinds/polar:{valid?fmt=%Y%m%d%H%M?shift=-1800}:{valid?fmt=%Y%m%d%H%M?shift=1800}:SATWND:speed


POINT_STAT_OUTPUT_DIR = {OUTPUT_BASE}/model_applications/marine_and_cryosphere/PointStat_fcstGFS_obsASCAT_satelliteWinds

POINT_STAT_CLIMO_MEAN_INPUT_DIR =
POINT_STAT_CLIMO_MEAN_INPUT_TEMPLATE =

POINT_STAT_CLIMO_STDEV_INPUT_DIR =
POINT_STAT_CLIMO_STDEV_INPUT_TEMPLATE =


###
# Field Info
# https://metplus.readthedocs.io/en/latest/Users_Guide/systemconfiguration.html#field-info
###

POINT_STAT_ONCE_PER_FIELD = False

FCST_VAR1_NAME = WIND
FCST_VAR1_LEVELS = Z10
FCST_VAR1_THRESH = >=13.9, >=17.2, >=24.5, >=32.7

OBS_VAR1_NAME = speed
OBS_VAR1_LEVELS = Z10
OBS_VAR1_THRESH = >=13.9, >=17.2, >=24.5, >=32.7


###
# PointStat Settings
# https://metplus.readthedocs.io/en/latest/Users_Guide/wrappers.html#pointstat
###

POINT_STAT_CLIMO_MEAN_TIME_INTERP_METHOD = NEAREST
POINT_STAT_INTERP_TYPE_METHOD = NEAREST
POINT_STAT_INTERP_TYPE_WIDTH = 1

POINT_STAT_OUTPUT_FLAG_MCTC = BOTH
POINT_STAT_OUTPUT_FLAG_MCTS = BOTH

OBS_POINT_STAT_WINDOW_BEGIN = -1800
OBS_POINT_STAT_WINDOW_END = 1800

POINT_STAT_OFFSETS = 0

MODEL = GFS

POINT_STAT_DESC = ASCAT
OBTYPE = 

POINT_STAT_REGRID_TO_GRID = NONE
POINT_STAT_REGRID_METHOD = BILIN
POINT_STAT_REGRID_WIDTH = 2

POINT_STAT_OUTPUT_PREFIX =

POINT_STAT_MASK_GRID = 
POINT_STAT_MASK_POLY = {INPUT_BASE}/model_applications/marine_and_cryosphere/PointStat_fcstGFS_obsASCAT_satelliteWinds/NPAC.nc, {INPUT_BASE}/model_applications/marine_and_cryosphere/PointStat_fcstGFS_obsASCAT_satelliteWinds/NATL.poly
POINT_STAT_MASK_SID =

POINT_STAT_MESSAGE_TYPE = SATWND

POINT_STAT_LAND_MASK_FLAG = TRUE

MET Configuration

METplus sets environment variables based on user settings in the METplus configuration file. See How METplus controls MET config file settings for more details.

YOU SHOULD NOT SET ANY OF THESE ENVIRONMENT VARIABLES YOURSELF! THEY WILL BE OVERWRITTEN BY METPLUS WHEN IT CALLS THE MET TOOLS!

If there is a setting in the MET configuration file that is currently not supported by METplus you’d like to control, please refer to: Overriding Unsupported MET config file settings

Note

See the GridStat MET Configuration section of the User’s Guide for more information on the environment variables used in the file below:

////////////////////////////////////////////////////////////////////////////////
//
// Point-Stat configuration file.
//
// For additional information, see the MET_BASE/config/README file.
//
////////////////////////////////////////////////////////////////////////////////

//
// Output model name to be written
//
// model =
${METPLUS_MODEL}

//
// Output description to be written
// May be set separately in each "obs.field" entry
//
// desc =
${METPLUS_DESC}

////////////////////////////////////////////////////////////////////////////////

//
// Verification grid
//
// regrid = {
${METPLUS_REGRID_DICT}

////////////////////////////////////////////////////////////////////////////////

//
// May be set separately in each "field" entry
//
censor_thresh = [];
censor_val    = [];
cat_thresh    = [ NA ];
cnt_thresh    = [ NA ];
cnt_logic     = UNION;
wind_thresh   = [ NA ];
wind_logic    = UNION;
eclv_points   = 0.05;
//hss_ec_value =
${METPLUS_HSS_EC_VALUE}
rank_corr_flag = FALSE;

//
// Forecast and observation fields to be verified
//
fcst = {
  ${METPLUS_FCST_FILE_TYPE}
  //field = [
  ${METPLUS_FCST_FIELD}
}

obs = {
  ${METPLUS_OBS_FILE_TYPE}
  //field = [
  ${METPLUS_OBS_FIELD}
}
////////////////////////////////////////////////////////////////////////////////

//
// Point observation filtering options
// May be set separately in each "obs.field" entry
//
// message_type =
${METPLUS_MESSAGE_TYPE}
sid_exc        = [];

//obs_quality_inc =
${METPLUS_OBS_QUALITY_INC}

//obs_quality_exc =
${METPLUS_OBS_QUALITY_EXC}

//duplicate_flag =
${METPLUS_DUPLICATE_FLAG}

//obs_summary =
${METPLUS_OBS_SUMMARY}

//obs_perc_value =
${METPLUS_OBS_PERC_VALUE}

//
// Mapping of message type group name to comma-separated list of values.
//
//message_type_group_map =
${METPLUS_MESSAGE_TYPE_GROUP_MAP}

////////////////////////////////////////////////////////////////////////////////

//
// Climatology data
//
//climo_mean = {
${METPLUS_CLIMO_MEAN_DICT}


//climo_stdev = {
${METPLUS_CLIMO_STDEV_DICT}

//
// May be set separately in each "obs.field" entry
//
//climo_cdf = {
${METPLUS_CLIMO_CDF_DICT}

////////////////////////////////////////////////////////////////////////////////

//
// Land/Sea mask
// For LANDSF message types, only use forecast grid points where land = TRUE.
// For WATERSF message types, only use forecast grid points where land = FALSE.
// land_mask.flag may be set separately in each "obs.field" entry.
//
//land_mask = {
${METPLUS_LAND_MASK_DICT}

//
// Topography
// For SURFACE message types, only use observations where the topo - station
// elevation difference meets the use_obs_thresh threshold.
// For the observations kept, when interpolating forecast data to the
// observation location, only use forecast grid points where the topo - station
// difference meets the interp_fcst_thresh threshold.
// topo_mask.flag may be set separately in each "obs.field" entry.
//
//topo_mask = {
${METPLUS_TOPO_MASK_DICT}

////////////////////////////////////////////////////////////////////////////////

//
// Point observation time window
//
// obs_window = {
${METPLUS_OBS_WINDOW_DICT}

////////////////////////////////////////////////////////////////////////////////

//
// Verification masking regions
//
//mask = {
${METPLUS_MASK_DICT}

////////////////////////////////////////////////////////////////////////////////

//
// Confidence interval settings
//
ci_alpha  = [ 0.05 ];

boot = {
   interval = PCTILE;
   rep_prop = 1.0;
   n_rep    = 0;
   rng      = "mt19937";
   seed     = "";
}

////////////////////////////////////////////////////////////////////////////////

//
// Interpolation methods
//
//interp = {
${METPLUS_INTERP_DICT}

////////////////////////////////////////////////////////////////////////////////

//
// HiRA verification method
//
//hira = {
${METPLUS_HIRA_DICT}

////////////////////////////////////////////////////////////////////////////////
// Threshold for SEEPS p1 (Probability of being dry)

//seeps_p1_thresh =
${METPLUS_SEEPS_P1_THRESH}

////////////////////////////////////////////////////////////////////////////////

//
// Statistical output types
//
//output_flag = {
${METPLUS_OUTPUT_FLAG_DICT}

////////////////////////////////////////////////////////////////////////////////

//ugrid_dataset =
${METPLUS_UGRID_DATASET}

//ugrid_max_distance_km =
${METPLUS_UGRID_MAX_DISTANCE_KM}

//ugrid_coordinates_file =
${METPLUS_UGRID_COORDINATES_FILE}

////////////////////////////////////////////////////////////////////////////////

tmp_dir = "${MET_TMP_DIR}";

// output_prefix =
${METPLUS_OUTPUT_PREFIX}
//version        = "V10.0.0";

////////////////////////////////////////////////////////////////////////////////

${METPLUS_MET_CONFIG_OVERRIDES}

Python Embedding

This use case calls the read_ASCAT_data.py script to read and pass to PointStat the user-requested variable. The script needs 5 inputs in the following order: a path to a directory that contains only ASCAT data of the “ascat_YYYYMMDDHHMMSS_*” string, a start time in YYYYMMDDHHMMSS, an end time in the same format, a message type to code the variables as (currently set for SATWND), and a variable name to read in. Currently the script puts the same station ID to each observation, but there is space in the code describing an alternate method that may be improved upon to allow different sattellites to have their own station IDs. The location of the code is parm/use_cases/model_applications/marine_and_cryosphere/PointStat_fcstGFS_obsASCAT_satelliteWinds/read_ASCAT_data.py

#This script is designed to read in multiple ASCAT input data files, and strip out any/all observation points that land within
#the user's designated beginning and end times. The challenge of this script will be to ensure that each unique lat/lon pair
#recieves a unique station ID, as well as not allowing two different observation values to exist for a unique station ID that
#occur at the same time (different times are OK). The final list of observation points will be passed back to MET for verification.
#This script will also remain as flexible as possible, by not hardcoding the variable type read in, the message type assigned to
#the dataset, or the start and end times.

from netCDF4 import Dataset
import sys
import numpy as np
from os import listdir
from os.path import isfile, join
import datetime as dt

#Users are responsible for passing the following arguements at runtime:
##input directory to files (directory can contain any files, but only those matching the template of ascat_YYYYMMDDHHMMSS_*, where * can be any additional identifier
##start time in YYYYMMDDHHMMSS
##end time in YYYYMMDDHHMMSS
##Message type to code values as (suggested types are WDSATR and SATWND)
##Varible field to read in
#this will check that the correct number of args are passed at runtime. If not, the program exits
if len(sys.argv[1].split(':')) != 5:
    print("Run Command:\n\n read_ascat_data.py /path/to/input/input_files:start_date:end_date:Message_type:Variable\n\nCommands notes:\n-start and end dates are formatted YYYYMMDDHHMMSS\n-Suggested Message_types are WDSATR or SATWND, refer to MET User's Guide for more info\n")
    sys.exit(1)
try:
    input_dir,st_date,en_date,msg_typ,var_typ = sys.argv[1].split(':')
    #files need to be only the ASCAT format. Any other file types will result in an error in the later loop that collects files between the user stated times
    onlyfiles = [f for f in listdir(input_dir) if isfile(join(input_dir, f))]
    #get times into correct format
    st_date = dt.datetime.strptime(st_date,"%Y%m%d%H%S")
    en_date = dt.datetime.strptime(en_date,"%Y%m%d%H%S")
except:
    print("input directory may not be set correctly, dates may not be in correct format. Please recheck")
    sys.exit(1)
#need to loop through all of the files and retain only those that are between the desired times
final_file_list = []
for i in onlyfiles:
    if st_date <= dt.datetime.strptime(i.split('_')[1],"%Y%m%d%H%S00") and en_date >= dt.datetime.strptime(i.split('_')[1],"%Y%m%d%H%S00"):
        final_file_list.append(i)
#establish boolean to figure out first run so arrays are initalized only once
first = True
#now loop through the files and pull out the data
for i in final_file_list:
    f_in = Dataset(input_dir+'/'+i,'r')
    if first:
        obs = np.array(f_in[var_typ][:])
        latitude = np.array(f_in['latitude'][:])
        longitude = np.array(f_in['longitude'][:])
        time = np.array(f_in['time'][:])
        first = False
    else:
        obs = np.append(obs, f_in[var_typ][:])
        latitude = np.append(latitude, f_in['latitude'][:])
        longitude = np.append(longitude, f_in['longitude'][:])
        time = np.append(time, f_in['time'][:])
#convert times to MET times
new_time = []
for i in range(len(time)):
    new_time.append(dt.datetime(1970,1,1) + dt.timedelta(seconds=int(time[i])))
    new_time[i] = new_time[i].strftime("%Y%m%d_%H%M%S")

#ALTERNATE METHOD instead of assigning individual IDs to each latlon pair (which is the commented out section below), this method assigns a value of "1" to each point effectively making every point read in the same station. This is a time saver, as well as a potential path to code by file/satellite type rather than by point on earth.
sid = np.full(len(latitude),"1")

"""    
#INDIVIDUAL ID METHOD this checks the lat lon pairs as station IDs are assigned. We'll create a new numpy array made of strings that are the lat-lon combinations. If there are duplicate strings then there are duplicate lat-lon pairs, and they'll get the same station ID. Otherwise, the station ID will be Unique for each lat-lon pair
lat_str = np.array(latitude, dtype=str)
lon_str = np.array(longitude, dtype=str)
lat_lon_str = np.char.add(lat_str, lon_str)
uniq_lat_lon_str, counts = np.unique(lat_lon_str, return_counts=True)
#   sid = np.full(len(lat_lon_str),"1").tolist()
sid = np.full(len(lat_lon_str),"0")
external_count = 1
#this loop will give each station ID an assigned lat-lon pair
for i in range(len(uniq_lat_lon_str)):
    if(counts[i]) == 1:
        sid[i] = str(external_count)
        external_count += 1
    else:
        holder = np.where(uniq_lat_lon_str[i] == lat_lon_str)
        for j in range(len(holder[0])):
            sid[holder[0][j]] = str(external_count)
        external_count += 1
"""

#get arrays into lists, then create a list of lists
#this also requires creating the last arrays of typ, elv, var, lvl, hgt, and qc
typ = np.full(len(latitude), msg_typ).tolist()
elv = np.zeros(len(latitude),dtype=int).tolist()
var = np.full(len(latitude), var_typ).tolist()
lvl = np.full(len(latitude),1013.25).tolist()
hgt = np.full(len(latitude),10,dtype=int).tolist()
qc = np.full(len(latitude),'NA').tolist()
    
sid = sid.tolist()
vld = new_time
lat = latitude.tolist()
lon = longitude.tolist()
obs = obs.tolist()
l_tuple = list(zip(typ,sid,vld,lat,lon,elv,var,lvl,hgt,qc,obs))
point_data = [list(ele) for ele in l_tuple]

print("Data Length:\t" + repr(len(point_data)))
print("Data Type:\t" + repr(type(point_data)))

#This will print out all of the files that were read in at runtime. If not desired, comment out.
print(final_file_list)
    

Running METplus

Pass the use case configuration file to the run_metplus.py script along with any user-specific system configuration files if desired:

run_metplus.py /path/to/METplus/parm/use_cases/model_applications/marine_and_cryosphere/PointStat_fcstGFS_obsASCAT_satelliteWinds.conf /path/to/user_system.conf

See Running METplus for more information.

Expected Output

A successful run will output the following both to the screen and to the logfile:

INFO: METplus has successfully finished running.

Refer to the value set for OUTPUT_BASE to find where the output data was generated. Output for this use case will be found in model_applications/marine_and_cryosphere/PointStat_fcstGFS_obsASCAT_satelliteWinds (relative to OUTPUT_BASE) and will contain the following files:

  • point_stat_060000L_20230706_060000V_mctc.txt

  • point_stat_060000L_20230706_060000V_mcts.txt

  • point_stat_060000L_20230706_060000V.stat

Keywords

Note

  • PointStatToolUseCase

  • PythonEmbeddingFileUseCase

  • GRIB2FileUseCase

  • MarineAndCryosphereAppUseCase

Navigate to the METplus Quick Search for Use Cases page to discover other similar use cases.

sphinx_gallery_thumbnail_path = ‘_static/marine_and_cryosphere-PointStat_fcstGFS_obsASCAT_satelliteWinds.png’

Total running time of the script: (0 minutes 0.000 seconds)

Gallery generated by Sphinx-Gallery