Note
Go to the end to download the full example code
EnsembleStat: Using Python Embedding for Aerosol Optical Depth
model_applications/air_quality_and_comp/EnsembleStat_fcstICAP_obsMODIS_aod.conf
Scientific Objective
To provide useful statistical information on the relationship between observation data for aersol optical depth (AOD) to an ensemble forecast. These values can be used to help correct ensemble member deviations from observed values.
Datasets
METplus Components
This use case utilizes the METplus EnsembleStat wrapper to read in files using Python Embedding
METplus Workflow
EnsembleStat is the only tool called in this example. It processes a single run time with seven ensemble members. Three of the members do not have data for the AOD field, so EnsembleStat will only process four of the members for statistics.
METplus Configuration
METplus first loads all of the configuration files found in parm/metplus_config, then it loads any configuration files passed to METplus via the command line with the -c option, i.e. -c parm/use_cases/model_applications/air_quality_and_comp/EnsembleStat_fcstICAP_obsMODIS_aod.conf
[config]
# Documentation for this use case can be found at
# https://metplus.readthedocs.io/en/latest/generated/model_applications/air_quality_and_comp/EnsembleStat_fcstICAP_obsMODIS_aod.html
# For additional information, please see the METplus Users Guide.
# https://metplus.readthedocs.io/en/latest/Users_Guide
###
# Processes to run
# https://metplus.readthedocs.io/en/latest/Users_Guide/systemconfiguration.html#process-list
###
PROCESS_LIST = EnsembleStat
###
# Time Info
# LOOP_BY options are INIT, VALID, RETRO, and REALTIME
# If set to INIT or RETRO:
# INIT_TIME_FMT, INIT_BEG, INIT_END, and INIT_INCREMENT must also be set
# If set to VALID or REALTIME:
# VALID_TIME_FMT, VALID_BEG, VALID_END, and VALID_INCREMENT must also be set
# LEAD_SEQ is the list of forecast leads to process
# https://metplus.readthedocs.io/en/latest/Users_Guide/systemconfiguration.html#timing-control
###
LOOP_BY = INIT
INIT_TIME_FMT = %Y%m%d%H%M
INIT_BEG=201608150000
INIT_END=201608150000
INIT_INCREMENT=06H
LEAD_SEQ = 12H
###
# File I/O
# https://metplus.readthedocs.io/en/latest/Users_Guide/systemconfiguration.html#directory-and-filename-template-info
###
FCST_ENSEMBLE_STAT_INPUT_DATATYPE = PYTHON_NUMPY
FCST_ENSEMBLE_STAT_INPUT_DIR =
FCST_ENSEMBLE_STAT_INPUT_TEMPLATE = 0, 1, 2, 3, 4, 5, 6
OBS_ENSEMBLE_STAT_INPUT_GRID_DATATYPE = PYTHON_NUMPY
OBS_ENSEMBLE_STAT_GRID_INPUT_DIR = {INPUT_BASE}/model_applications/air_quality_and_comp/aod
OBS_ENSEMBLE_STAT_GRID_INPUT_TEMPLATE = PYTHON_NUMPY
ENSEMBLE_STAT_OUTPUT_DIR = {OUTPUT_BASE}
###
# Field Info
# https://metplus.readthedocs.io/en/latest/Users_Guide/systemconfiguration.html#field-info
###
CONFIG_DIR = {PARM_BASE}/use_cases/model_applications/air_quality_and_comp/EnsembleStat_fcstICAP_obsMODIS_aod
FCST_VAR1_NAME = {CONFIG_DIR}/forecast_embedded.py {OBS_ENSEMBLE_STAT_GRID_INPUT_DIR}/icap_{init?fmt=%Y%m%d%H}_aod.nc:total_aod:{valid?fmt=%Y%m%d%H%M}:MET_PYTHON_INPUT_ARG
OBS_VAR1_NAME = {CONFIG_DIR}/analysis_embedded.py {OBS_ENSEMBLE_STAT_GRID_INPUT_DIR}/AGGR_HOURLY_{valid?fmt=%Y%m%d}T{valid?fmt=%H%M}_1deg_global_archive.nc:aod_nrl_total:Mean
###
# EnsembleStat Settings
# https://metplus.readthedocs.io/en/latest/Users_Guide/wrappers.html#ensemblestat
###
MODEL = ICAP
OBTYPE = NRL_AOD
ENSEMBLE_STAT_OBS_WINDOW_BEGIN = -5400
ENSEMBLE_STAT_OBS_WINDOW_END = 5400
ENSEMBLE_STAT_N_MEMBERS = 7
ENSEMBLE_STAT_ENS_THRESH = 0.1
ENSEMBLE_STAT_REGRID_TO_GRID = NONE
ENSEMBLE_STAT_OUTPUT_PREFIX =
MET Configuration
METplus sets environment variables based on user settings in the METplus configuration file. See How METplus controls MET config file settings for more details.
YOU SHOULD NOT SET ANY OF THESE ENVIRONMENT VARIABLES YOURSELF! THEY WILL BE OVERWRITTEN BY METPLUS WHEN IT CALLS THE MET TOOLS!
If there is a setting in the MET configuration file that is currently not supported by METplus you’d like to control, please refer to: Overriding Unsupported MET config file settings
Note
See the EnsembleStat MET Configuration section of the User’s Guide for more information on the environment variables used in the file below:
////////////////////////////////////////////////////////////////////////////////
//
// Ensemble-Stat configuration file.
//
// For additional information, see the MET_BASE/config/README file.
//
////////////////////////////////////////////////////////////////////////////////
//
// Output model name to be written
//
${METPLUS_MODEL}
//
// Output description to be written
// May be set separately in each "obs.field" entry
//
${METPLUS_DESC}
//
// Output observation type to be written
//
${METPLUS_OBTYPE}
////////////////////////////////////////////////////////////////////////////////
//
// Verification grid
//
${METPLUS_REGRID_DICT}
////////////////////////////////////////////////////////////////////////////////
//
// May be set separately in each "field" entry
//
${METPLUS_CENSOR_THRESH}
${METPLUS_CENSOR_VAL}
cat_thresh = [];
nc_var_str = "";
//ens_member_ids =
${METPLUS_ENS_MEMBER_IDS}
//control_id =
${METPLUS_CONTROL_ID}
////////////////////////////////////////////////////////////////////////////////
//prob_cat_thresh =
${METPLUS_PROB_CAT_THRESH}
//prob_pct_thresh =
${METPLUS_PROB_PCT_THRESH}
//eclv_points =
${METPLUS_ECLV_POINTS}
////////////////////////////////////////////////////////////////////////////////
//
// Forecast and observation fields to be verified
//
fcst = {
${METPLUS_FCST_FILE_TYPE}
${METPLUS_ENS_THRESH}
${METPLUS_VLD_THRESH}
${METPLUS_FCST_FIELD}
}
obs = {
${METPLUS_OBS_FILE_TYPE}
${METPLUS_OBS_FIELD}
}
////////////////////////////////////////////////////////////////////////////////
//
// Point observation filtering options
// May be set separately in each "obs.field" entry
//
${METPLUS_MESSAGE_TYPE}
sid_exc = [];
//obs_thresh = [ NA ];
${METPLUS_OBS_THRESH}
//obs_quality_inc =
${METPLUS_OBS_QUALITY_INC}
//obs_quality_exc =
${METPLUS_OBS_QUALITY_EXC}
${METPLUS_DUPLICATE_FLAG}
obs_summary = NONE;
obs_perc_value = 50;
${METPLUS_SKIP_CONST}
//
// Observation error options
// Set dist_type to NONE to use the observation error table instead
// May be set separately in each "obs.field" entry
//
obs_error = {
${METPLUS_OBS_ERROR_FLAG}
dist_type = NONE;
dist_parm = [];
inst_bias_scale = 1.0;
inst_bias_offset = 0.0;
min = NA; // Valid range of data
max = NA;
}
//
// Mapping of message type group name to comma-separated list of values.
//
message_type_group_map = [
{ key = "SURFACE"; val = "ADPSFC,SFCSHP,MSONET"; },
{ key = "ANYAIR"; val = "AIRCAR,AIRCFT"; },
{ key = "ANYSFC"; val = "ADPSFC,SFCSHP,ADPUPA,PROFLR,MSONET"; },
{ key = "ONLYSF"; val = "ADPSFC,SFCSHP"; }
];
//
// Ensemble bin sizes
// May be set separately in each "obs.field" entry
//
${METPLUS_ENS_SSVAR_BIN_SIZE}
${METPLUS_ENS_PHIST_BIN_SIZE}
////////////////////////////////////////////////////////////////////////////////
//
// Climatology data
//
//climo_mean = {
${METPLUS_CLIMO_MEAN_DICT}
//climo_stdev = {
${METPLUS_CLIMO_STDEV_DICT}
//
// May be set separately in each "obs.field" entry
//
${METPLUS_CLIMO_CDF_DICT}
////////////////////////////////////////////////////////////////////////////////
//
// Point observation time window
//
${METPLUS_OBS_WINDOW_DICT}
////////////////////////////////////////////////////////////////////////////////
//
// Verification masking regions
//
mask = {
${METPLUS_MASK_GRID}
${METPLUS_MASK_POLY}
sid = [];
llpnt = [];
}
////////////////////////////////////////////////////////////////////////////////
//
// Confidence interval settings
//
${METPLUS_CI_ALPHA}
////////////////////////////////////////////////////////////////////////////////
//
// Interpolation methods
//
${METPLUS_INTERP_DICT}
////////////////////////////////////////////////////////////////////////////////
//
// Statistical output types
//
${METPLUS_OUTPUT_FLAG_DICT}
////////////////////////////////////////////////////////////////////////////////
//
// Gridded verification output types
// May be set separately in each "obs.field" entry
//
${METPLUS_NC_ORANK_FLAG_DICT}
////////////////////////////////////////////////////////////////////////////////
//
// Random number generator
//
rng = {
type = "mt19937";
seed = "1";
}
////////////////////////////////////////////////////////////////////////////////
//grid_weight_flag =
${METPLUS_GRID_WEIGHT_FLAG}
${METPLUS_OUTPUT_PREFIX}
//version = "V9.0";
////////////////////////////////////////////////////////////////////////////////
tmp_dir = "${MET_TMP_DIR}";
${METPLUS_MET_CONFIG_OVERRIDES}
Python Embedding
This use case uses two Python embedding scripts to read input data
parm/use_cases/model_applications/air_quality_and_comp/EnsembleStat_fcstICAP_obsMODIS_aod/forecast_embedded.py
import sys
import re
import numpy as np
import datetime as dt
from netCDF4 import Dataset, chartostring
#grab input from user
#should be (1)input file using full path (2) variable name (3) valid time for the forecast in %Y%m%d%H%M format and (4) ensemble member number, all separated by ':' characters
#program can only accept that 1 input, while still maintaining user flexability to change multiple
#variables, including valid time, ens member, etc.
input_file, var_name, val_time, ens_mem = sys.argv[1].split(':')
ens_mem = int(ens_mem)
val_time = dt.datetime.strptime(val_time,"%Y%m%d%H%M")
try:
#set pointers to file and group name in file
f = Dataset(input_file, 'r')
v = f[var_name]
#grab intialization time from file name and hold
#also compute the lead time
i_time_ind = input_file.split("_").index("aod.nc")-1
i_time = input_file.split("_")[i_time_ind]
i_time_obj = dt.datetime.strptime(i_time,"%Y%m%d%H")
lead, rem = divmod((val_time - i_time_obj).total_seconds(), 3600)
print("Ensemble Member evaluation for: "+f.members.split(',')[ens_mem])
#checks if the the valid time for the forecast from user is present in file.
#Exits if the time is not present with a message
if not val_time.timestamp() in f['time'][:]:
print("valid time of "+str(val_time)+" is not present. Check file initialization time, passed valid time.")
f.close()
sys.exit(1)
#grab index in the time array for the valid time provided by user (val_time)
val_time_ind = np.where(f['time'][:] == val_time.timestamp())[0][0]
#grab data from file
lat = np.float64(f.variables['lat'][:])
lon = np.float64(f.variables['lon'][:])
var = np.float64(v[val_time_ind:val_time_ind+1,ens_mem:ens_mem+1,::-1,:])
var[var < -800] = -9999
#squeeze out all 1d arrays, add fill value
met_data = np.squeeze(var).copy()
except NameError:
print("Can't find input file")
sys.exit(1)
##########
#create a metadata dictionary
attrs = {
'valid': str(val_time.strftime("%Y%m%d"))+'_'+str(val_time.strftime("%H%M%S")),
'init': i_time[:-2]+'_'+i_time[-2:]+'0000',
'name': var_name,
'long_name': 'UNKNOWN',
'lead': str(int(lead)),
'accum': '00',
'level': 'UNKNOWN',
'units': 'UNKNOWN',
'grid': {
'name': 'Global 1 degree',
'type': 'LatLon',
'lat_ll': -89.5,
'lon_ll': -179.5,
'delta_lat': 1.0,
'delta_lon': 1.0,
'Nlon': f.dimensions['lon'].size,
'Nlat': f.dimensions['lat'].size,
}
}
#print some output to show script ran successfully
print("Input file: " + repr(input_file))
print("Variable name: " + repr(var_name))
print("valid time: " + repr(val_time.strftime("%Y%m%d%H%M")))
print("Attributes:\t" + repr(attrs))
f.close()
parm/use_cases/model_applications/air_quality_and_comp/EnsembleStat_fcstICAP_obsMODIS_aod/analysis_embedded.py
import sys
import re
import numpy as np
import datetime as dt
from netCDF4 import Dataset, chartostring
#grab input from user
#should be (1)input file using full path (2) group name for the variable and (3) variable name
input_file, group_name, var_name = sys.argv[1].split(':')
try:
#set pointers to file and group name in file
f = Dataset(input_file, 'r')
g = f.groups[group_name]
#grab time from file name and hold
v_time_ind = input_file.split("_").index("HOURLY")+1
v_time = input_file.split("_")[v_time_ind]
#grab data from file
lat = np.float64(f.variables['latitude'][:])
lon = np.float64(f.variables['longitude'][:])
#the data is defined by (lon, lat), so it needs to be transposed
#in addition to being filled by fill value if data is missing
var_invert = np.float64(g.variables[var_name][:,::-1])
var_invert[var_invert < -800] = -9999
met_data = var_invert.T.copy()
except NameError:
print("Can't find input file")
sys.exit(1)
##########
#create a metadata dictionary
attrs = {
'valid': str(v_time.split('T')[0])+'_'+str(v_time.split('T')[1])+'00',
'init': str(v_time.split('T')[0])+'_'+str(v_time.split('T')[1])+'00',
'name': group_name+'_'+var_name,
'long_name': 'UNKNOWN',
'lead': '00',
'accum': '00',
'level': 'UNKNOWN',
'units': 'UNKNOWN',
'grid': {
'name': 'Global 1 degree',
'type': 'LatLon',
'lat_ll': -89.5,
'lon_ll': -179.5,
'delta_lat': 1.0,
'delta_lon': 1.0,
'Nlon': f.dimensions['longitude'].size,
'Nlat': f.dimensions['latitude'].size,
}
}
#print some output to show script ran successfully
print("Input file: " + repr(input_file))
print("Group name: " + repr(group_name))
print("Variable name: " + repr(var_name))
print("Attributes:\t" + repr(attrs))
f.close()
Running METplus
It is recommended to run this use case by:
Passing in EnsembleStat_python_embedding.conf then a user-specific system configuration file:
run_metplus.py -c /path/to/METplus/parm/use_cases/model_applications/air_quality_and_comp/EnsembleStat_fcstICAP_obsMODIS_aod.conf -c /path/to/user_system.conf
The following METplus configuration variables must be set correctly to run this example.:
INPUT_BASE - Path to directory where sample data tarballs are unpacked (See Datasets section to obtain tarballs).
OUTPUT_BASE - Path where METplus output will be written. This must be in a location where you have write permissions
MET_INSTALL_DIR - Path to location where MET is installed locally
Example User Configuration File:
[dir]
INPUT_BASE = /path/to/sample/input/data
OUTPUT_BASE = /path/to/output/dir
MET_INSTALL_DIR = /path/to/met-X.Y
NOTE: All of these items must be found under the [dir] section.
Expected Output
A successful run will output the following both to the screen and to the logfile:
INFO: METplus has successfully finished running.
Refer to the value set for OUTPUT_BASE to find where the output data was generated. Output for this use case will be found in model_applications/air_quality/AOD (relative to OUTPUT_BASE) and will contain the following files:
ensemble_stat_aod_20160815_120000V_ecnt.txt
ensemble_stat_aod_20160815_120000V_ens.nc
ensemble_stat_aod_20160815_120000V_orank.nc
ensemble_stat_aod_20160815_120000V_phist.txt
ensemble_stat_aod_20160815_120000V_relp.txt
ensemble_stat_aod_20160815_120000V_rhist.txt
ensemble_stat_aod_20160815_120000V_ssvar.txt
ensemble_stat_aod_20160815_120000V.stat
Keywords
Note
EnsembleStatToolUseCase
PythonEmbeddingFileUseCase
AirQualityAndCompAppUseCase
PythonEmbeddingFileUseCase
Navigate to the METplus Quick Search for Use Cases page to discover other similar use cases.
sphinx_gallery_thumbnail_path = ‘_static/air_quality_and_comp-EnsembleStat_fcstICAP_obsMODIS_aod.png’
Total running time of the script: (0 minutes 0.000 seconds)