GridStat: Python Embedding for sea surface salinity using level 3, 8 day mean obs

model_applications/marine_and_cryosphere/GridStat_fcstRTOFS_obsSMAP_climWOA_sss.conf

Scientific Objective

This use case uses Python embedding to compute several statistics from global sea surface salinity data, a task that was previously performed in a closed system. Producing the same output through METplus standardizes the process and makes the results reproducible.

Datasets

Forecast: RTOFS sss file via Python Embedding script/file
Observations: SMAP sss file via Python Embedding script/file
Sea Ice Masking: OSTIA ice cover file via Python Embedding script/file
Climatology: WOA sss file via Python Embedding script/file
Location: All of the input data required for this use case can be found in the met_test sample data tarball. Go to the METplus releases page and download the sample data for the appropriate release: https://github.com/dtcenter/METplus/releases
This tarball should be unpacked into the directory that you will set as the value of INPUT_BASE. See the Running METplus section for more information.
Data Source: JPL’s PODAAC and NCEP’s FTPPRD data servers

External Dependencies

You will need to use a version of Python 3.6+ that has the following packages installed:

  • scikit-learn

  • pyresample

If the version of Python used to compile MET did not have these libraries at the time of compilation, you will need to add these packages or create a new Python environment with these packages.

If this is the case, you will need to set the MET_PYTHON_EXE environment variable to the path of the version of Python you want to use. If you want this version of Python to apply only to this use case, set it in the [user_env_vars] section of a METplus configuration file:

[user_env_vars]
MET_PYTHON_EXE = /path/to/python/with/required/packages/bin/python
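
Before pointing MET_PYTHON_EXE at an interpreter, it can help to confirm that the interpreter can import what the embedding script needs. The following is a minimal sketch, meant to be run with the candidate interpreter. The helper name missing_modules is hypothetical; the module list is taken from the import statements in read_rtofs_smap_woa.py (scikit-learn is imported as sklearn).

```python
import importlib.util

# Import names used by read_rtofs_smap_woa.py
# (scikit-learn is imported under the name "sklearn").
REQUIRED = ["numpy", "xarray", "pandas", "pyresample", "sklearn"]


def missing_modules(names):
    """Return the module names that this interpreter cannot locate."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules(REQUIRED)
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All required packages are importable.")
```

If anything is reported missing, install it into that environment or choose a different interpreter before setting MET_PYTHON_EXE.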

METplus Components

This use case utilizes the METplus GridStat wrapper to generate a command to run the MET tool GridStat with Python Embedding.

METplus Workflow

GridStat is the only tool called in this example. This use case passes in the forecast, observation, and climatology gridded data, all of which are pulled from the input files via Python Embedding. All of the desired statistics reside in the CNT line type, so that is the only output requested. The use case processes the following run time:

Valid: 2021-05-02 0Z

METplus Configuration

METplus first loads all of the configuration files found in parm/metplus_config, then it loads any configuration files passed to METplus via the command line with the -c option, i.e. -c parm/use_cases/model_applications/marine_and_cryosphere/GridStat_fcstRTOFS_obsSMAP_climWOA_sss.conf
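
The practical consequence of this loading order is that a value set in a later file replaces the same variable from an earlier one. The sketch below illustrates only that "last file wins" behavior using Python's standard configparser; it is not how METplus itself is implemented, and the contents of the defaults string are made up for the illustration.

```python
import configparser


def load_in_order(config_texts):
    """Read config sources in order; later sources override earlier ones."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.optionxform = str  # keep variable names case-sensitive
    for text in config_texts:
        parser.read_string(text)
    return parser


# Hypothetical default values, standing in for parm/metplus_config.
defaults = """
[config]
LOOP_BY = INIT
PROCESS_LIST = Example
"""

# Values taken from this use case's configuration file.
use_case = """
[config]
LOOP_BY = VALID
PROCESS_LIST = GridStat
"""

merged = load_in_order([defaults, use_case])
print(merged["config"]["LOOP_BY"])
print(merged["config"]["PROCESS_LIST"])
```

The same reasoning applies to a user-specific configuration file passed with a second -c option: because it is read last, its values take precedence.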

# GridStat METplus Configuration

# section heading for [config] variables - all items below this line and
# before the next section heading correspond to the [config] section
[config]

# List of applications to run - only GridStat for this case
PROCESS_LIST = GridStat

# time looping - options are INIT, VALID, RETRO, and REALTIME
# If set to INIT or RETRO:
#   INIT_TIME_FMT, INIT_BEG, INIT_END, and INIT_INCREMENT must also be set
# If set to VALID or REALTIME:
#   VALID_TIME_FMT, VALID_BEG, VALID_END, and VALID_INCREMENT must also be set
LOOP_BY = VALID

# Format of VALID_BEG and VALID_END using % items
# %Y = 4 digit year, %m = 2 digit month, %d = 2 digit day, etc.
# see www.strftime.org for more information
# %Y%m%d%H expands to YYYYMMDDHH
VALID_TIME_FMT = %Y%m%d

# Start time for METplus run - must match VALID_TIME_FMT
VALID_BEG=20210502

# End time for METplus run - must match VALID_TIME_FMT
VALID_END=20210502

# Increment between METplus runs (in seconds if no units are specified)
#  Must be >= 60 seconds
VALID_INCREMENT = 1M

# List of forecast leads to process for each run time (init or valid)
# In hours if units are not specified
# If unset, defaults to 0 (don't loop through forecast leads)
LEAD_SEQ = 24


# Order of loops to process data - Options are times, processes
# Not relevant if only one item is in the PROCESS_LIST
# times = run all wrappers in the PROCESS_LIST for a single run time, then
#   increment the run time and run all wrappers again until all times have
#   been evaluated.
# processes = run the first wrapper in the PROCESS_LIST for all times
#   specified, then repeat for the next item in the PROCESS_LIST until all
#   wrappers have been run
LOOP_ORDER = times

# Verbosity of MET output - overrides LOG_VERBOSITY for GridStat only
LOG_GRID_STAT_VERBOSITY = 2

# Location of MET config file to pass to GridStat
GRID_STAT_CONFIG_FILE = {PARM_BASE}/met_config/GridStatConfig_wrapped

# grid to remap data. Value is set as the 'to_grid' variable in the 'regrid' dictionary
# See MET User's Guide for more information
GRID_STAT_REGRID_TO_GRID = NONE

#GRID_STAT_INTERP_FIELD =
#GRID_STAT_INTERP_VLD_THRESH =
#GRID_STAT_INTERP_SHAPE =
#GRID_STAT_INTERP_TYPE_METHOD =
#GRID_STAT_INTERP_TYPE_WIDTH =

#GRID_STAT_NC_PAIRS_VAR_NAME =

#GRID_STAT_CLIMO_MEAN_TIME_INTERP_METHOD =
#GRID_STAT_CLIMO_STDEV_TIME_INTERP_METHOD =

#GRID_STAT_GRID_WEIGHT_FLAG = AREA

# Name to identify model (forecast) data in output
MODEL = RTOFS

# Name to identify observation data in output
OBTYPE = SMAP

# set the desc value in the GridStat MET config file
GRID_STAT_DESC = NA

# List of variables to compare in GridStat - FCST_VAR1 variables correspond
#  to OBS_VAR1 variables
# Note [FCST/OBS/BOTH]_GRID_STAT_VAR<n>_NAME can be used instead if different evaluations
# are needed for different tools

# Name of forecast variable 1
FCST_VAR1_NAME = {CONFIG_DIR}/read_rtofs_smap_woa.py {INPUT_BASE}/model_applications/marine_and_cryosphere/GridStat_fcstRTOFS_obsSMAP_climWOA_sss/{init?fmt=%Y%m%d}_rtofs_glo_2ds_f024_prog.nc {INPUT_BASE}/model_applications/marine_and_cryosphere/GridStat_fcstRTOFS_obsSMAP_climWOA_sss/SMAP-L3-GLOB_{valid?fmt=%Y%m%d?shift=86400}.nc {INPUT_BASE}/model_applications/marine_and_cryosphere/GridStat_fcstRTOFS_obsSMAP_climWOA_sss/OSTIA-UKMO-L4-GLOB-v2.0_{valid?fmt=%Y%m%d}.nc {INPUT_BASE}/model_applications/marine_and_cryosphere/GridStat_fcstRTOFS_obsSMAP_climWOA_sss {valid?fmt=%Y%m%d} fcst

# List of levels to evaluate for forecast variable 1
# A03 = 3 hour accumulation in GRIB file
FCST_VAR1_LEVELS = 

# List of thresholds to evaluate for each name/level combination for
#  forecast variable 1
FCST_VAR1_THRESH =

#FCST_GRID_STAT_FILE_TYPE =

# Name of observation variable 1
OBS_VAR1_NAME = {CONFIG_DIR}/read_rtofs_smap_woa.py {INPUT_BASE}/model_applications/marine_and_cryosphere/GridStat_fcstRTOFS_obsSMAP_climWOA_sss/{init?fmt=%Y%m%d}_rtofs_glo_2ds_f024_prog.nc {INPUT_BASE}/model_applications/marine_and_cryosphere/GridStat_fcstRTOFS_obsSMAP_climWOA_sss/SMAP-L3-GLOB_{valid?fmt=%Y%m%d?shift=86400}.nc {INPUT_BASE}/model_applications/marine_and_cryosphere/GridStat_fcstRTOFS_obsSMAP_climWOA_sss/OSTIA-UKMO-L4-GLOB-v2.0_{valid?fmt=%Y%m%d}.nc {INPUT_BASE}/model_applications/marine_and_cryosphere/GridStat_fcstRTOFS_obsSMAP_climWOA_sss {valid?fmt=%Y%m%d} obs


# List of levels to evaluate for observation variable 1
# (*,*) is NetCDF notation - must include quotes around these values!
# must be the same length as FCST_VAR1_LEVELS
OBS_VAR1_LEVELS = 

# List of thresholds to evaluate for each name/level combination for
#  observation variable 1
OBS_VAR1_THRESH = 

#GRID_STAT_MET_CONFIG_OVERRIDES = cat_thresh = [>=0.15];
#BOTH_VAR1_THRESH = >=0.15

#OBS_GRID_STAT_FILE_TYPE =


# Name of climatology variable 1
GRID_STAT_CLIMO_MEAN_FIELD = {name="{CONFIG_DIR}/read_rtofs_smap_woa.py {INPUT_BASE}/model_applications/marine_and_cryosphere/GridStat_fcstRTOFS_obsSMAP_climWOA_sss/{init?fmt=%Y%m%d}_rtofs_glo_2ds_f024_prog.nc {INPUT_BASE}/model_applications/marine_and_cryosphere/GridStat_fcstRTOFS_obsSMAP_climWOA_sss/SMAP-L3-GLOB_{valid?fmt=%Y%m%d?shift=86400}.nc {INPUT_BASE}/model_applications/marine_and_cryosphere/GridStat_fcstRTOFS_obsSMAP_climWOA_sss/OSTIA-UKMO-L4-GLOB-v2.0_{valid?fmt=%Y%m%d}.nc {INPUT_BASE}/model_applications/marine_and_cryosphere/GridStat_fcstRTOFS_obsSMAP_climWOA_sss {valid?fmt=%Y%m%d} climo"; level="(*,*)";}


# Time relative to valid time (in seconds) to allow files to be considered
#  valid. Set both BEGIN and END to 0 to require the exact time in the filename
#  Not used in this example.
FCST_GRID_STAT_FILE_WINDOW_BEGIN = 0
FCST_GRID_STAT_FILE_WINDOW_END = 0
OBS_GRID_STAT_FILE_WINDOW_BEGIN = 0
OBS_GRID_STAT_FILE_WINDOW_END = 0

# MET GridStat neighborhood values
# See the MET User's Guide GridStat section for more information

# width value passed to nbrhd dictionary in the MET config file
GRID_STAT_NEIGHBORHOOD_WIDTH = 1

# shape value passed to nbrhd dictionary in the MET config file
GRID_STAT_NEIGHBORHOOD_SHAPE = SQUARE

# cov thresh list passed to nbrhd dictionary in the MET config file
GRID_STAT_NEIGHBORHOOD_COV_THRESH = >=0.5

# Set to true to run GridStat separately for each field specified
# Set to false to create one run of GridStat per run time that
#   includes all fields specified.
GRID_STAT_ONCE_PER_FIELD = False

# Set to true if forecast data is probabilistic
FCST_IS_PROB = false

# Only used if FCST_IS_PROB is true - sets probabilistic threshold
FCST_GRID_STAT_PROB_THRESH = ==0.1

# Set to true if observation data is probabilistic
#  Only used if configuring forecast data as the 'OBS' input
OBS_IS_PROB = false

# Only used if OBS_IS_PROB is true - sets probabilistic threshold
OBS_GRID_STAT_PROB_THRESH = ==0.1

GRID_STAT_OUTPUT_PREFIX = SSS

#GRID_STAT_CLIMO_MEAN_FILE_NAME =
#GRID_STAT_CLIMO_MEAN_FIELD =
#GRID_STAT_CLIMO_MEAN_REGRID_METHOD =
#GRID_STAT_CLIMO_MEAN_REGRID_WIDTH =
#GRID_STAT_CLIMO_MEAN_REGRID_VLD_THRESH =
#GRID_STAT_CLIMO_MEAN_REGRID_SHAPE =
#GRID_STAT_CLIMO_MEAN_TIME_INTERP_METHOD =
#GRID_STAT_CLIMO_MEAN_MATCH_MONTH =
#GRID_STAT_CLIMO_MEAN_DAY_INTERVAL =
#GRID_STAT_CLIMO_MEAN_HOUR_INTERVAL =

#GRID_STAT_CLIMO_STDEV_FILE_NAME =
#GRID_STAT_CLIMO_STDEV_FIELD =
#GRID_STAT_CLIMO_STDEV_REGRID_METHOD =
#GRID_STAT_CLIMO_STDEV_REGRID_WIDTH =
#GRID_STAT_CLIMO_STDEV_REGRID_VLD_THRESH =
#GRID_STAT_CLIMO_STDEV_REGRID_SHAPE =
#GRID_STAT_CLIMO_STDEV_TIME_INTERP_METHOD =
#GRID_STAT_CLIMO_STDEV_MATCH_MONTH =
#GRID_STAT_CLIMO_STDEV_DAY_INTERVAL =
#GRID_STAT_CLIMO_STDEV_HOUR_INTERVAL =


#GRID_STAT_CLIMO_CDF_BINS = 1
#GRID_STAT_CLIMO_CDF_CENTER_BINS = False
#GRID_STAT_CLIMO_CDF_WRITE_BINS = True

#GRID_STAT_OUTPUT_FLAG_FHO = NONE
#GRID_STAT_OUTPUT_FLAG_CTC = NONE
#GRID_STAT_OUTPUT_FLAG_CTS = NONE
#GRID_STAT_OUTPUT_FLAG_MCTC = NONE
#GRID_STAT_OUTPUT_FLAG_MCTS = NONE
GRID_STAT_OUTPUT_FLAG_CNT = BOTH
#GRID_STAT_OUTPUT_FLAG_SL1L2 = NONE
#GRID_STAT_OUTPUT_FLAG_SAL1L2 = NONE
#GRID_STAT_OUTPUT_FLAG_VL1L2 = NONE
#GRID_STAT_OUTPUT_FLAG_VAL1L2 = NONE
#GRID_STAT_OUTPUT_FLAG_VCNT = NONE
#GRID_STAT_OUTPUT_FLAG_PCT = NONE
#GRID_STAT_OUTPUT_FLAG_PSTD = NONE
#GRID_STAT_OUTPUT_FLAG_PJC = NONE
#GRID_STAT_OUTPUT_FLAG_PRC = NONE
#GRID_STAT_OUTPUT_FLAG_ECLV = BOTH
#GRID_STAT_OUTPUT_FLAG_NBRCTC = NONE
#GRID_STAT_OUTPUT_FLAG_NBRCTS = NONE
#GRID_STAT_OUTPUT_FLAG_NBRCNT = NONE
#GRID_STAT_OUTPUT_FLAG_GRAD = BOTH
#GRID_STAT_OUTPUT_FLAG_DMAP = NONE

#GRID_STAT_NC_PAIRS_FLAG_LATLON = FALSE
#GRID_STAT_NC_PAIRS_FLAG_RAW = FALSE
#GRID_STAT_NC_PAIRS_FLAG_DIFF = FALSE
#GRID_STAT_NC_PAIRS_FLAG_CLIMO = FALSE
#GRID_STAT_NC_PAIRS_FLAG_CLIMO_CDP = FALSE
#GRID_STAT_NC_PAIRS_FLAG_WEIGHT = FALSE
#GRID_STAT_NC_PAIRS_FLAG_NBRHD = FALSE
#GRID_STAT_NC_PAIRS_FLAG_FOURIER = FALSE
#GRID_STAT_NC_PAIRS_FLAG_GRADIENT = FALSE
#GRID_STAT_NC_PAIRS_FLAG_DISTANCE_MAP = FALSE
#GRID_STAT_NC_PAIRS_FLAG_APPLY_MASK = FALSE


# End of [config] section and start of [dir] section
[dir]
#use case configuration file directory
CONFIG_DIR = {PARM_BASE}/use_cases/model_applications/marine_and_cryosphere/GridStat_fcstRTOFS_obsSMAP_climWOA_sss
# directory containing forecast input to GridStat
FCST_GRID_STAT_INPUT_DIR = 

# directory containing observation input to GridStat
OBS_GRID_STAT_INPUT_DIR = 

# directory containing climatology mean input to GridStat
# Not used in this example
GRID_STAT_CLIMO_MEAN_INPUT_DIR =

# directory containing climatology standard deviation input to GridStat
# Not used in this example
GRID_STAT_CLIMO_STDEV_INPUT_DIR =

# directory to write output from GridStat
GRID_STAT_OUTPUT_DIR = {OUTPUT_BASE}

# End of [dir] section and start of [filename_templates] section
[filename_templates]

# Template to look for forecast input to GridStat relative to FCST_GRID_STAT_INPUT_DIR
FCST_GRID_STAT_INPUT_TEMPLATE = PYTHON_NUMPY

# Template to look for observation input to GridStat relative to OBS_GRID_STAT_INPUT_DIR
OBS_GRID_STAT_INPUT_TEMPLATE = PYTHON_NUMPY

# Optional subdirectories relative to GRID_STAT_OUTPUT_DIR to write output from GridStat
GRID_STAT_OUTPUT_TEMPLATE = {valid?fmt=%Y%m%d}

# Template to look for climatology input to GridStat relative to GRID_STAT_CLIMO_MEAN_INPUT_DIR
# Set to PYTHON_NUMPY because the climatology mean field is read via Python Embedding
GRID_STAT_CLIMO_MEAN_INPUT_TEMPLATE = PYTHON_NUMPY

# Template to look for climatology input to GridStat relative to GRID_STAT_CLIMO_STDEV_INPUT_DIR
# Not used in this example
GRID_STAT_CLIMO_STDEV_INPUT_TEMPLATE =

# Used to specify one or more verification mask files for GridStat
# Not used for this example
GRID_STAT_VERIFICATION_MASK_TEMPLATE =
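
The long FCST_VAR1_NAME, OBS_VAR1_NAME, and GRID_STAT_CLIMO_MEAN_FIELD values above rely on filename template tags such as {init?fmt=%Y%m%d} and {valid?fmt=%Y%m%d?shift=86400}. The sketch below shows how those tags would resolve for this run, assuming that init time equals valid time minus the forecast lead and that shift is an offset in seconds. The helper fmt_time is hypothetical and only mimics the tag behavior; it is not METplus code.

```python
from datetime import datetime, timedelta

valid = datetime(2021, 5, 2)   # VALID_BEG with VALID_TIME_FMT = %Y%m%d
lead = timedelta(hours=24)     # LEAD_SEQ = 24
init = valid - lead            # assumed relation: init = valid - lead


def fmt_time(time, fmt="%Y%m%d", shift_seconds=0):
    """Mimic a {time?fmt=...?shift=...} tag: shift, then format."""
    return (time + timedelta(seconds=shift_seconds)).strftime(fmt)


fcst_file = f"{fmt_time(init)}_rtofs_glo_2ds_f024_prog.nc"
obs_file = f"SMAP-L3-GLOB_{fmt_time(valid, shift_seconds=86400)}.nc"
ice_file = f"OSTIA-UKMO-L4-GLOB-v2.0_{fmt_time(valid)}.nc"

for name in (fcst_file, obs_file, ice_file):
    print(name)
```

Under these assumptions the forecast file is dated one day before the valid time and the SMAP file one day after it, while the OSTIA ice file uses the valid date itself.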

MET Configuration

METplus sets environment variables based on user settings in the METplus configuration file. See How METplus controls MET config file settings for more details.

YOU SHOULD NOT SET ANY OF THESE ENVIRONMENT VARIABLES YOURSELF! THEY WILL BE OVERWRITTEN BY METPLUS WHEN IT CALLS THE MET TOOLS!

If there is a setting in the MET configuration file that is not currently supported by METplus and that you’d like to control, please refer to: Overriding Unsupported MET config file settings

Note

See the GridStat MET Configuration section of the User’s Guide for more information on the environment variables used in the file below:

////////////////////////////////////////////////////////////////////////////////
//
// Grid-Stat configuration file.
//
// For additional information, see the MET_BASE/config/README file.
//
////////////////////////////////////////////////////////////////////////////////

//
// Output model name to be written
//
// model =
${METPLUS_MODEL}

//
// Output description to be written
// May be set separately in each "obs.field" entry
//
// desc =
${METPLUS_DESC}

//
// Output observation type to be written
//
// obtype =
${METPLUS_OBTYPE}

////////////////////////////////////////////////////////////////////////////////

//
// Verification grid
//
// regrid = {
${METPLUS_REGRID_DICT}

////////////////////////////////////////////////////////////////////////////////

//censor_thresh =
${METPLUS_CENSOR_THRESH}
//censor_val =
${METPLUS_CENSOR_VAL}
cat_thresh  	 = [];
cnt_thresh  	 = [ NA ];
cnt_logic   	 = UNION;
wind_thresh 	 = [ NA ];
wind_logic  	 = UNION;
eclv_points      = 0.05;
//nc_pairs_var_name =
${METPLUS_NC_PAIRS_VAR_NAME}
nc_pairs_var_suffix = "";
//hss_ec_value =
${METPLUS_HSS_EC_VALUE}

rank_corr_flag   = FALSE;

//
// Forecast and observation fields to be verified
//
fcst = {
  ${METPLUS_FCST_FILE_TYPE}
  ${METPLUS_FCST_FIELD}
}
obs = {
  ${METPLUS_OBS_FILE_TYPE}
  ${METPLUS_OBS_FIELD}
}

////////////////////////////////////////////////////////////////////////////////

//
// Climatology mean data
//
//climo_mean = {
${METPLUS_CLIMO_MEAN_DICT}


//climo_stdev = {
${METPLUS_CLIMO_STDEV_DICT}

//
// May be set separately in each "obs.field" entry
//
//climo_cdf = {
${METPLUS_CLIMO_CDF_DICT}

////////////////////////////////////////////////////////////////////////////////

//
// Verification masking regions
//
// mask = {
${METPLUS_MASK_DICT}

////////////////////////////////////////////////////////////////////////////////

//
// Confidence interval settings
//
ci_alpha  = [ 0.05 ];

boot = {
   interval = PCTILE;
   rep_prop = 1.0;
   n_rep    = 0;
   rng      = "mt19937";
   seed     = "";
}

////////////////////////////////////////////////////////////////////////////////

//
// Data smoothing methods
//
//interp = {
${METPLUS_INTERP_DICT}

////////////////////////////////////////////////////////////////////////////////

//
// Neighborhood methods
//
nbrhd = {
   field      = BOTH;
   // shape =
   ${METPLUS_NBRHD_SHAPE}
   // width =
   ${METPLUS_NBRHD_WIDTH}
   // cov_thresh =
   ${METPLUS_NBRHD_COV_THRESH}
   vld_thresh = 1.0;
}

////////////////////////////////////////////////////////////////////////////////

//
// Fourier decomposition
// May be set separately in each "obs.field" entry
//
//fourier = {
${METPLUS_FOURIER_DICT}

////////////////////////////////////////////////////////////////////////////////

//
// Gradient statistics
// May be set separately in each "obs.field" entry
//
gradient = {
   dx = [ 1 ];
   dy = [ 1 ];
}

////////////////////////////////////////////////////////////////////////////////

//
// Distance Map statistics
// May be set separately in each "obs.field" entry
//
//distance_map = {
${METPLUS_DISTANCE_MAP_DICT}

////////////////////////////////////////////////////////////////////////////////

//
// Statistical output types
//
//output_flag = {
${METPLUS_OUTPUT_FLAG_DICT}

//
// NetCDF matched pairs output file
// May be set separately in each "obs.field" entry
//
// nc_pairs_flag = {
${METPLUS_NC_PAIRS_FLAG_DICT}

////////////////////////////////////////////////////////////////////////////////

//grid_weight_flag =
${METPLUS_GRID_WEIGHT_FLAG}

tmp_dir = "${MET_TMP_DIR}";

// output_prefix =
${METPLUS_OUTPUT_PREFIX}

////////////////////////////////////////////////////////////////////////////////

${METPLUS_MET_CONFIG_OVERRIDES}
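
To make the role of the ${METPLUS_...} placeholders in the file above concrete, the sketch below substitutes two of them the way an environment-variable reference would be expanded. This is an illustration only: the substitution is actually performed when MET reads the configuration file, and the exact strings that METplus exports (shown here in the env dictionary) are assumptions based on the MODEL and OBTYPE settings of this use case.

```python
from string import Template

# A small excerpt of the wrapped MET configuration file.
config_fragment = Template(
    "// model =\n"
    "${METPLUS_MODEL}\n"
    "\n"
    "// obtype =\n"
    "${METPLUS_OBTYPE}\n"
)

# Assumed values; METplus builds the real ones from MODEL and OBTYPE.
env = {
    "METPLUS_MODEL": 'model = "RTOFS";',
    "METPLUS_OBTYPE": 'obtype = "SMAP";',
}

rendered = config_fragment.safe_substitute(env)
print(rendered)
```

The commented-out lines such as "// model =" act as labels for the placeholder that follows, which is why the file remains readable even before substitution.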

Python Embedding

This use case uses one Python script to read the forecast, observation, climatology, and ice mask data:

parm/use_cases/model_applications/marine_and_cryosphere/GridStat_fcstRTOFS_obsSMAP_climWOA_sss/read_rtofs_smap_woa.py

#!/bin/env python
"""
Code adapted from
Todd Spindler
NOAA/NWS/NCEP/EMC
Designed to read in RTOFS,SMAP,WOA and OSTIA data
and based on user input, read sss data 
and pass back in memory the forecast, observation, or climatology
data field
"""

import numpy as np
import xarray as xr
import pandas as pd
import pyresample as pyr
from pandas.tseries.offsets import DateOffset
from datetime import datetime, timedelta
from sklearn.metrics import mean_squared_error
import io
from glob import glob
import warnings
import os, sys


# six arguments are required in addition to the script name (sys.argv[1] through sys.argv[6])
if len(sys.argv) < 7:
    print("Must specify the following elements: fcst_file obs_file ice_file climo_dir valid_date file_flag")
    sys.exit(1)

rtofsfile = os.path.expandvars(sys.argv[1]) 
sssfile = os.path.expandvars(sys.argv[2]) 
icefile = os.path.expandvars(sys.argv[3]) 
climoDir = os.path.expandvars(sys.argv[4]) 
vDate=datetime.strptime(sys.argv[5],'%Y%m%d')
file_flag = sys.argv[6] 

print('Starting Satellite SMAP V&V at',datetime.now(),'for',vDate, ' file_flag:',file_flag)

pd.date_range(vDate,vDate)
platform='SMAP'
param='sss'


#####################################################################
# READ SMAP data ##################################################
#####################################################################

if not os.path.exists(sssfile):
        print('missing SMAP file for',vDate)

sss_data=xr.open_dataset(sssfile,decode_times=True)
sss_data['time']=sss_data.time-pd.Timedelta('12H')  # shift 12Z offset time to 00Z
sss_data2=sss_data['sss'].astype('single')
print('Retrieved SMAP data from NESDIS for',sss_data2.time.values)
#sss_data2=sss_data2.rename({'longitude':'lon','latitude':'lat'})


# all coords need to be single precision
sss_data2['lon']=sss_data2.lon.astype('single')
sss_data2['lat']=sss_data2.lat.astype('single')
sss_data2.attrs['platform']=platform
sss_data2.attrs['units']='PSU'

#####################################################################
# READ RTOFS data (model output in Tri-polar coordinates) ###########
#####################################################################

print('reading rtofs sss')
if not os.path.exists(rtofsfile):
    print('missing rtofs file',rtofsfile)
    sys.exit(1)

indata=xr.open_dataset(rtofsfile,decode_times=True)


indata=indata.mean(dim='MT')
indata = indata[param][:-1,]
indata.coords['time']=vDate
#indata.coords['fcst']=fcst

outdata=indata.copy()

outdata=outdata.rename({'Longitude':'lon','Latitude':'lat',})
# all coords need to be single precision
outdata['lon']=outdata.lon.astype('single')
outdata['lat']=outdata.lat.astype('single')
outdata.attrs['platform']='rtofs '+platform

#####################################################################
# READ CLIMO WOA data - May require 2 files depending on the date ###
#####################################################################

if not os.path.exists(climoDir):
        print('missing climo directory for',vDate)

vDate=pd.Timestamp(vDate)

climofile="woa13_decav_s{:02n}_04v2.nc".format(vDate.month)
climo_data=xr.open_dataset(climoDir+'/'+climofile,decode_times=False)
climo_data=climo_data['s_an'].squeeze()[0,]

if vDate.day==15:  # even for Feb, just because
    climofile="woa13_decav_s{:02n}_04v2.nc".format(vDate.month)
    climo_data=xr.open_dataset(climoDir+'/'+climofile,decode_times=False)
    climo_data=climo_data['s_an'].squeeze()[0,]  # surface only
else:
    if vDate.day < 15:
        start=vDate - DateOffset(months=1,day=15)
        stop=pd.Timestamp(vDate.year,vDate.month,15)
    else:
        start=pd.Timestamp(vDate.year,vDate.month,15)
        stop=vDate + DateOffset(months=1,day=15)
    left=(vDate-start)/(stop-start)
        
    climofile1="woa13_decav_s{:02n}_04v2.nc".format(start.month)
    climofile2="woa13_decav_s{:02n}_04v2.nc".format(stop.month)
    climo_data1=xr.open_dataset(climoDir+'/'+climofile1,decode_times=False)
    climo_data2=xr.open_dataset(climoDir+'/'+climofile2,decode_times=False)
    climo_data1=climo_data1['s_an'].squeeze()[0,]  # surface only
    climo_data2=climo_data2['s_an'].squeeze()[0,]  # surface only

    print('climofile1 :', climofile1)
    print('climofile2 :', climofile2)
    climo_data=climo_data1+((climo_data2-climo_data1)*left)
    climofile='weighted average of '+climofile1+' and '+climofile2

# all coords need to be single precision
climo_data['lon']=climo_data.lon.astype('single')
climo_data['lat']=climo_data.lat.astype('single')
climo_data.attrs['platform']='woa'
climo_data.attrs['filename']=climofile

#####################################################################
# READ ICE data for masking #########################################
#####################################################################

if not os.path.exists(icefile):
        print('missing OSTIA ice file for',vDate)

ice_data=xr.open_dataset(icefile,decode_times=True)
ice_data=ice_data.rename({'sea_ice_fraction':'ice'})

# all coords need to be single precision
ice_data2=ice_data.ice.astype('single')
ice_data2['lon']=ice_data2.lon.astype('single')
ice_data2['lat']=ice_data2.lat.astype('single')


def regrid(model,obs):
    """
    regrid data to obs -- this assumes DataArrays
    """
    model2=model.copy()
    model2_lon=model2.lon.values
    model2_lat=model2.lat.values
    model2_data=model2.to_masked_array()
    if model2_lon.ndim==1:
        model2_lon,model2_lat=np.meshgrid(model2_lon,model2_lat)

    obs2=obs.copy()
    obs2_lon=obs2.lon.astype('single').values
    obs2_lat=obs2.lat.astype('single').values
    obs2_data=obs2.astype('single').to_masked_array()
    if obs2.lon.ndim==1:
        obs2_lon,obs2_lat=np.meshgrid(obs2.lon.values,obs2.lat.values)

    model2_lon1=pyr.utils.wrap_longitudes(model2_lon)
    model2_lat1=model2_lat.copy()
    obs2_lon1=pyr.utils.wrap_longitudes(obs2_lon)
    obs2_lat1=obs2_lat.copy()

    # pyresample gaussian-weighted kd-tree interp
    # define the grids
    orig_def = pyr.geometry.GridDefinition(lons=model2_lon1,lats=model2_lat1)
    targ_def = pyr.geometry.GridDefinition(lons=obs2_lon1,lats=obs2_lat1)
    radius=50000
    sigmas=25000
    model2_data2=pyr.kd_tree.resample_gauss(orig_def,model2_data,targ_def,
                                            radius_of_influence=radius,
                                            sigmas=sigmas,
                                            fill_value=None)
    model=xr.DataArray(model2_data2,coords=[obs.lat.values,obs.lon.values],dims=['lat','lon'])

    return model

def expand_grid(data):
    """
    concatenate global data for edge wraps
    """

    data2=data.copy()
    data2['lon']=data2.lon+360
    data3=xr.concat((data,data2),dim='lon')
    return data3

sss_data2=sss_data2.squeeze()

print('regridding climo to obs')
climo_data=climo_data.squeeze()
climo_data=regrid(climo_data,sss_data2)

print('regridding ice to obs')
ice_data2=regrid(ice_data2,sss_data2)

print('regridding model to obs')
model2=regrid(outdata,sss_data2)

# combine obs ice mask with ncep
obs2=sss_data2.to_masked_array()
ice2=ice_data2.to_masked_array()
climo2=climo_data.to_masked_array()
model2=model2.to_masked_array()

#reconcile with obs
obs2.mask=np.ma.mask_or(obs2.mask,ice2>0.0)
obs2.mask=np.ma.mask_or(obs2.mask,climo2.mask)
obs2.mask=np.ma.mask_or(obs2.mask,model2.mask)
climo2.mask=obs2.mask
model2.mask=obs2.mask

obs2=xr.DataArray(obs2,coords=[sss_data2.lat.values,sss_data2.lon.values], dims=['lat','lon'])
model2=xr.DataArray(model2,coords=[sss_data2.lat.values,sss_data2.lon.values], dims=['lat','lon'])
climo2=xr.DataArray(climo2,coords=[sss_data2.lat.values,sss_data2.lon.values], dims=['lat','lon'])

model2=expand_grid(model2)
climo2=expand_grid(climo2)
obs2=expand_grid(obs2)

#Create the MET grids based on the file_flag
if file_flag == 'fcst':
    met_data = model2[:,:]
    #trim the lat/lon grids so they match the data fields
    lat_met = model2.lat
    lon_met = model2.lon
    print(" RTOFS Data shape: "+repr(met_data.shape))
    v_str = vDate.strftime("%Y%m%d")
    v_str = v_str + '_000000'
    lat_ll = float(lat_met.min())
    lon_ll = float(lon_met.min())
    n_lat = lat_met.shape[0]
    n_lon = lon_met.shape[0]
    delta_lat = (float(lat_met.max()) - float(lat_met.min()))/float(n_lat)
    delta_lon = (float(lon_met.max()) - float(lon_met.min()))/float(n_lon)
    print(f"variables:"
            f"lat_ll: {lat_ll} lon_ll: {lon_ll} n_lat: {n_lat} n_lon: {n_lon} delta_lat: {delta_lat} delta_lon: {delta_lon}")
    met_data.attrs = {
            'valid': v_str,
            'init': v_str,
            'lead': "00",
            'accum': "00",
            'name': 'sss',
            'standard_name': 'sea_surface_salinity',
            'long_name': 'sea_surface_salinity',
            'level': "SURFACE",
            'units': "psu",

            'grid': {
                'type': "LatLon",
                'name': "RTOFS Grid",
                'lat_ll': lat_ll,
                'lon_ll': lon_ll,
                'delta_lat': delta_lat,
                'delta_lon': delta_lon,
                'Nlat': n_lat,
                'Nlon': n_lon,
                }
            }
    attrs = met_data.attrs

if file_flag == 'obs':
    met_data = obs2[:,:]
    #trim the lat/lon grids so they match the data fields
    lat_met = obs2.lat
    lon_met = obs2.lon
    v_str = vDate.strftime("%Y%m%d")
    v_str = v_str + '_000000'
    lat_ll = float(lat_met.min())
    lon_ll = float(lon_met.min())
    n_lat = lat_met.shape[0]
    n_lon = lon_met.shape[0]
    delta_lat = (float(lat_met.max()) - float(lat_met.min()))/float(n_lat)
    delta_lon = (float(lon_met.max()) - float(lon_met.min()))/float(n_lon)
    print(f"variables:"
            f"lat_ll: {lat_ll} lon_ll: {lon_ll} n_lat: {n_lat} n_lon: {n_lon} delta_lat: {delta_lat} delta_lon: {delta_lon}")
    met_data.attrs = {
            'valid': v_str,
            'init': v_str,
            'lead': "00",
            'accum': "00",
            'name': 'sss',
            'standard_name': 'analyzed sea surface salinity',
            'long_name': 'sea_surface_salinity',
            'level': "SURFACE",
            'units': "psu",

            'grid': {
                'type': "LatLon",
                'name': "Lat Lon",
                'lat_ll': lat_ll,
                'lon_ll': lon_ll,
                'delta_lat': delta_lat,
                'delta_lon': delta_lon,
                'Nlat': n_lat,
                'Nlon': n_lon,
                }
            }
    attrs = met_data.attrs

if file_flag == 'climo':
    met_data = climo2[:,:]
    #modify the lat and lon grids since they need to match the data dimensions, and code cuts the last row/column of data
    lat_met = climo2.lat
    lon_met = climo2.lon
    v_str = vDate.strftime("%Y%m%d")
    v_str = v_str + '_000000'
    lat_ll = float(lat_met.min())
    lon_ll = float(lon_met.min())
    n_lat = lat_met.shape[0]
    n_lon = lon_met.shape[0]
    delta_lat = (float(lat_met.max()) - float(lat_met.min()))/float(n_lat)
    delta_lon = (float(lon_met.max()) - float(lon_met.min()))/float(n_lon)
    print(f"variables:"
            f"lat_ll: {lat_ll} lon_ll: {lon_ll} n_lat: {n_lat} n_lon: {n_lon} delta_lat: {delta_lat} delta_lon: {delta_lon}")
    met_data.attrs = {
            'valid': v_str,
            'init': v_str,
            'lead': "00",
            'accum': "00",
            'name': 'sea_water_salinity',
            'standard_name': 'sea_water_salinity',
            'long_name': 'sea_water_salinity',
            'level': "SURFACE",
            'units': "psu",

            'grid': {
                'type': "LatLon",
                'name': "crs Grid",
                'lat_ll': lat_ll,
                'lon_ll': lon_ll,
                'delta_lat': delta_lat,
                'delta_lon': delta_lon,
                'Nlat': n_lat,
                'Nlon': n_lon,
                }
            }
    attrs = met_data.attrs
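
The climatology section of the script above blends two monthly WOA files when the valid day is not the 15th, weighting them by how far the valid date lies between the two surrounding mid-month dates. The sketch below reproduces just that weighting logic with the standard library so the arithmetic can be checked in isolation. The function name bracketing_months is hypothetical, and the sketch leaves out the file reading and regridding done by the script.

```python
from datetime import date


def bracketing_months(valid):
    """Return (start, stop, weight) for the mid-month dates around `valid`.

    `weight` is the fraction of the way from `start` to `stop`, matching
    the `left` variable in the script.
    """
    if valid.day == 15:
        mid = date(valid.year, valid.month, 15)
        return mid, mid, 0.0
    if valid.day < 15:
        stop = date(valid.year, valid.month, 15)
        # previous month, wrapping January back to December
        if valid.month == 1:
            start = date(valid.year - 1, 12, 15)
        else:
            start = date(valid.year, valid.month - 1, 15)
    else:
        start = date(valid.year, valid.month, 15)
        # next month, wrapping December forward to January
        if valid.month == 12:
            stop = date(valid.year + 1, 1, 15)
        else:
            stop = date(valid.year, valid.month + 1, 15)
    weight = (valid - start) / (stop - start)
    return start, stop, weight


start, stop, weight = bracketing_months(date(2021, 5, 2))
print(start, stop, weight)
```

For the valid date of this use case, 2021-05-02, the bracketing dates are 2021-04-15 and 2021-05-15, so the April and May files are blended as climo1 + (climo2 - climo1) * weight with weight equal to 17/30.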

Running METplus

This use case can be run two ways:

  1. Passing in GridStat_fcstRTOFS_obsSMAP_climWOA_sss.conf then a user-specific system configuration file:

    run_metplus.py -c /path/to/METplus/parm/use_cases/model_applications/marine_and_cryosphere/GridStat_fcstRTOFS_obsSMAP_climWOA_sss.conf -c /path/to/user_system.conf
    
  2. Modifying the configurations in parm/metplus_config, then passing in GridStat_fcstRTOFS_obsSMAP_climWOA_sss.conf:

    run_metplus.py -c /path/to/METplus/parm/use_cases/model_applications/marine_and_cryosphere/GridStat_fcstRTOFS_obsSMAP_climWOA_sss.conf
    

The former method is recommended. Whether you add them to a user-specific configuration file or modify the metplus_config files, the following variables must be set correctly:

  • INPUT_BASE - Path to directory where sample data tarballs are unpacked (See Datasets section to obtain tarballs). This is not required to run METplus, but it is required to run the examples in parm/use_cases

  • OUTPUT_BASE - Path where METplus output will be written. This must be in a location where you have write permissions

  • MET_INSTALL_DIR - Path to location where MET is installed locally

Example User Configuration File:

[dir]
INPUT_BASE = /path/to/sample/input/data
OUTPUT_BASE = /path/to/output/dir
MET_INSTALL_DIR = /path/to/met-X.Y

NOTE: All of these items must be found under the [dir] section.

Expected Output

A successful run will output the following both to the screen and to the logfile:

INFO: METplus has successfully finished running.

Refer to the value set for OUTPUT_BASE to find where the output data was generated. Output for this use case will be found in 20210502 (relative to OUTPUT_BASE) and will contain the following files:

  • grid_stat_SSS_000000L_20210502_000000V.stat

  • grid_stat_SSS_000000L_20210502_000000V_cnt.txt

  • grid_stat_SSS_000000L_20210502_000000V_pairs.nc
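
To inspect the statistics, the _cnt.txt file can be read as a whitespace-delimited table with a header row. The sketch below is one simple way to do that, assuming that no field in the file contains embedded spaces. The function name read_met_text is hypothetical, and any column names you look up afterwards should be checked against the header of your own output file.

```python
def read_met_text(path):
    """Read a whitespace-delimited text file with a header row.

    Returns one dictionary per data row, keyed by the header names.
    All values are returned as strings.
    """
    with open(path) as handle:
        rows = [line.split() for line in handle if line.strip()]
    if not rows:
        return []
    header, records = rows[0], rows[1:]
    return [dict(zip(header, record)) for record in records]
```

A call such as read_met_text("/path/to/output/dir/20210502/grid_stat_SSS_000000L_20210502_000000V_cnt.txt") would then return the CNT lines as dictionaries; the path shown is a placeholder that depends on your OUTPUT_BASE.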

Keywords

Note

  • GridStatToolUseCase

  • PythonEmbeddingFileUseCase

  • MarineAndCryosphereAppUseCase

Navigate to the METplus Quick Search for Use Cases page to discover other similar use cases.
