Grid-Stat and Series-Analysis: BMKG APIK Seasonal Forecast

model_applications/s2s/GridStat_SeriesAnalysis_fcstNMME_obsCPC_seasonal_forecast.conf

Scientific Objective

Seasonal forecasting, with a time horizon of one to many months (typically 6 to 9 months), poses new challenges to tools developed primarily for weather forecasts that cover a few days. Two aspects stand out: (1) a dramatically expanded time variable, and (2) verification that is backward oriented by design, using extensive hindcasts over past decades rather than the rapid verification possible in short-range weather forecasting. The scientific objective of the seasonal forecast use case therefore involves expanding the options for describing time as well as the strategic selection of hindcasts.

Time:

METplus commonly expresses time intervals in minutes, hours, and days. Month and year intervals were not originally supported because these units do not have a constant length. METplus was therefore modified to support these intervals by determining the offset relative to a given time.
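The idea of resolving a month interval relative to a given time can be illustrated with a short Python sketch. This is not the METplus implementation: the helper name add_months and the choice to clamp the day to the last day of the target month are assumptions made for illustration only.

```python
import calendar
from datetime import datetime


def add_months(start, months):
    """Return `start` shifted by a whole number of calendar months.

    Assumption for this sketch: if the day does not exist in the target
    month (e.g. 31 January + 1 month), clamp to that month's last day.
    """
    # Count months on a single axis so the year rolls over correctly.
    total = start.year * 12 + (start.month - 1) + months
    year, month_index = divmod(total, 12)
    month = month_index + 1
    last_day = calendar.monthrange(year, month)[1]
    return start.replace(year=year, month=month, day=min(start.day, last_day))


init = datetime(1982, 7, 1)
for lead in (1, 2, 3):
    valid = add_months(init, lead)
    step_days = (valid - add_months(init, lead - 1)).days
    # The same "1 month" step spans a different number of days each time.
    print(lead, valid.date(), step_days)
```

The offset is computed from the starting time rather than by adding a fixed number of seconds, which is why the length of each monthly step can differ.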

Input data:

The input data from seasonal forecasts are generally based on daily, weekly, decadal (10-day), monthly, or seasonal time-integrated intervals. The time variable is therefore often no longer a simple snapshot of the system but rather represents an average, a sum (precipitation), or a particular statistic (maximum wind, minimum temperature, wind variability) over the integration period. This requires some adjustment from the traditional approach to forecast verification, where the forecast time (“valid-time”) is simply a snapshot out of a continuous run.

Hindcasts:

The objective of seasonal forecasts is no longer the exact location and intensity of one particular weather event, such as a storm, a frontal passage, or high wind conditions. Rather, seasonal forecasting focuses on the statistical properties over a period of time, be it a 10-day interval, a month, or even a three-month season. The verification of a new, forward-looking seasonal forecast requires assessing the forecast system's ability to appropriately forecast that long-range behavior of the weather (here, only atmospheric verification is considered, but the same concept would apply to ocean or any other long-range forecast systems). Because weather properties commonly change significantly over the course of the season, samples to verify the prognostic system cannot be taken from the days, weeks, or months immediately before the forecast. Hindcasting in the seasonal context requires a complete set of forecasts based on the same season but during past years. A current July 1, 2019 forecast therefore requires July 1 forecasts for as many past years as possible, provided that the forecast system is the same as the one used for the current forecast cycle. Operational centers offer hindcasts, sometimes also called “re-forecasts”, produced with the current, most up-to-date forecast system. MET and METplus therefore need to be able to extract the appropriate collection of past forecasts. This includes identifying the same Julian day-of-year init dates from forecast cycles of past years, and then identifying the different lead times of interest, generally ranging from one to 6 or more months.
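The selection of matching past initializations can be sketched as follows. The function name select_hindcast_inits and the archive list are hypothetical and are not METplus data structures. Matching on calendar month and day, rather than on the numeric day of year, is an assumption of this sketch, since the two differ after February in leap years.

```python
from datetime import date


def select_hindcast_inits(available_inits, target_init):
    """Pick past initializations sharing the target's calendar month and day.

    `available_inits` stands in for whatever init dates a hindcast
    archive offers (hypothetical input).
    """
    return sorted(
        d for d in available_inits
        if (d.month, d.day) == (target_init.month, target_init.day)
        and d.year < target_init.year
    )


# Hypothetical archive: first-of-month initializations, 1982 through 2010.
archive = [date(y, m, 1) for y in range(1982, 2011) for m in range(1, 13)]
hindcasts = select_hindcast_inits(archive, date(2019, 7, 1))
print(len(hindcasts), hindcasts[0], hindcasts[-1])
```

Each selected initialization would then be paired with the lead times of interest to obtain the hindcast samples for verification.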

Verification:

The verification steps can then utilize the existing collection of verification tools. In comparison to weather forecasts, the only difference is that the data, as stated above, are not snapshots but time-integrated values (averages, sums, statistics) that represent a whole period of time. The verification then focuses on comparisons of these derived quantities of the forecast simulations. In practice, a further step might be added prior to, or as a key step during, verification: the formation of anomalies of the forecasts relative to long-term expected averages. A rainfall forecast can therefore be verified in both an absolute and an anomaly context. Some analyses might focus on the exceedance of an extreme rainfall threshold of, for example, 500 mm per month. At the same time, the same forecast might be verified for the 3-month rainfall average in comparison with the long-term expected mean. The verification might then assess how well the system can foresee the occurrence of below-average rainfall over the season, and possibly some selected thresholds there (e.g., the ability to forecast mean seasonal rainfall below the 10th percentile of seasonal rainfall). Finally, flexibility in formulating forecast verification strategies is important because forecast skill might vary by location, the timing within the seasonal cycle, or the state of the evolving coupled system (the rapid onset of a strong El Niño will lead to significantly different forecast skill compared to a neutral state in the Pacific). Memory from past months, for example when considering accumulated soil moisture, might also influence forecast skill. Seasonal forecast verification therefore requires an understanding of the climate system; MET and METplus then need to offer the flexibility to tailor verification strategies and potentially to craft conditional approaches.
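The anomaly and percentile ideas above can be made concrete with a small sketch. The rainfall totals are made-up numbers for illustration only, the function names are hypothetical, and the nearest-rank percentile is just one of several common conventions. MET computes its statistics differently and on gridded data.

```python
import math


def anomalies(values, long_term_mean):
    """Departure of each value from the long-term mean."""
    return [v - long_term_mean for v in values]


def nearest_rank_percentile(values, pct):
    """Percentile by the nearest-rank method (an assumed convention)."""
    ordered = sorted(values)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


# Made-up seasonal rainfall totals in mm, one per year (illustration only).
seasonal_totals = [310, 280, 450, 390, 150, 520, 330, 410, 290, 370]

mean_total = sum(seasonal_totals) / len(seasonal_totals)
p10 = nearest_rank_percentile(seasonal_totals, 10)

# Years at or below the 10th percentile count as "dry" events in this sketch.
dry_years = [v for v in seasonal_totals if v <= p10]

print(mean_total, p10, dry_years)
print(anomalies(seasonal_totals[:3], mean_total))
```

The same forecast values can thus feed an absolute threshold test, an anomaly comparison, or a percentile-based event definition, depending on the verification question.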

Overall, seasonal forecasts do not require a new verification approach. They do, however, place demands on the flexibility to deal with a significantly expanded range of the time variable, as well as on the logistical infrastructure to select appropriate hindcast samples from long hindcast or re-forecast archives. Scientifically, the challenges lie mostly in the appropriate formulation of verification questions that address specific forecast objectives. Compared to weather forecasting, seasonal forecasts need to draw their skill from slowly changing components in the coupled Earth system while acknowledging the high-frequency noise of weather superposed on these ‘climatologically’ evolving background conditions. In many regions of the world, the noise might dominate that background climate, and forecast skill is low. It is therefore the task of seasonal forecast verification to identify where there is actually skill for particular properties of the forecasts over a wide range of lead times. The skill might depend on location, on the timing within the seasonal cycle, or even on the evolving state of the coupled system.

Datasets

All datasets are traditionally in netCDF format. Grids are either regular Gaussian latitude/longitude grids or Lambert-conformal WRF grids.

The forecast datasets contain weekly, monthly, or seasonally integrated data. Here, the time format of the use case is monthly. Since the verification is done on the hindcasts rather than the forecast (verifying the forecast itself would require another 6 months of waiting), the key identifiers here are the month of initialization and the lead time of the forecast of interest.

The hindcast data, the ‘observational’ data that is to be compared to the forecast, is a collection of datasets in a format equivalent to that of the forecast. The hindcast ensemble is identified through the year in the filename (as well as in the time variable inside the netCDF file).

Forecast Datasets:

NMME

  • variable of interest: pr (precipitation: cumulative monthly sum)

  • format of precipitation variable: time,lat,lon (here dimensions: 29,181,361), with the time variable representing 29 samples of the same Julian init time of hindcasts over the past 29 years
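Because the forecast files are identified by initialization month and valid month, the file pattern for a given lead can be derived with simple month arithmetic. The sketch below mirrors the FCST_GRID_STAT_INPUT_TEMPLATE setting shown in the METplus Configuration section. The function name is hypothetical, and it assumes that %b renders as an English three-letter month abbreviation and %m as a zero-padded month number.

```python
# English abbreviations are hard-coded to avoid locale-dependent output.
MONTH_ABBR = ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
              "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"]


def forecast_file_pattern(init_month, lead_months):
    """Build the glob pattern for one init month (1-12) and lead in months."""
    # Wrap around the end of the year, e.g. July + 6 months -> January.
    valid_month = (init_month - 1 + lead_months) % 12 + 1
    return "nmme_pr_hcst_{}IC_{:02d}_*.nc".format(
        MONTH_ABBR[init_month - 1], valid_month
    )


print(forecast_file_pattern(7, 1))
print(forecast_file_pattern(7, 6))
```

The year does not appear in the pattern; in this use case the hindcast year is selected through the time dimension inside the file.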

Hindcast Datasets:

Observational Dataset:

  • CPC precipitation reference data (same format and grid)

METplus Components

This use case loops over initialization years and processes forecast lead months with GridStat. It also processes the output of GridStat using two calls to SeriesAnalysis.

External Dependencies

You will need to use a version of Python 3.6+ that has the following packages installed:

  • netCDF4

METplus Workflow

The following tools are used for each run time: GridStat

This example loops by initialization time. Each initialization time is July of each year from 1982 to 2010. For each init time it will run once, processing forecast leads 1 month through 6 months (as set by LEAD_SEQ in the configuration). The following times are processed:

Run times:

Init: 1982-07
Forecast leads: 1 month, 2 months, 3 months, 4 months, 5 months, 6 months

Init: 1983-07
Forecast leads: 1 month, 2 months, 3 months, 4 months, 5 months, 6 months

Init: 1984-07
Forecast leads: 1 month, 2 months, 3 months, 4 months, 5 months, 6 months

Init: 1985-07
Forecast leads: 1 month, 2 months, 3 months, 4 months, 5 months, 6 months

...

Init: 2009-07
Forecast leads: 1 month, 2 months, 3 months, 4 months, 5 months, 6 months

Init: 2010-07
Forecast leads: 1 month, 2 months, 3 months, 4 months, 5 months, 6 months
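The full set of run times can be enumerated with a few lines of Python. This is only a sketch of the looping logic, not how METplus implements it; the function name run_times is hypothetical, and the arguments are taken from the INIT_BEG, INIT_END, INIT_INCREMENT, and LEAD_SEQ settings in the METplus Configuration section.

```python
from datetime import date


def run_times(first_year, last_year, init_month, lead_months):
    """Yield (init, lead, valid) for yearly inits on the 1st of `init_month`."""
    for year in range(first_year, last_year + 1):
        init = date(year, init_month, 1)
        for lead in lead_months:
            # The day is always 1, so month arithmetic needs no day clamping.
            years_ahead, month_index = divmod(init_month - 1 + lead, 12)
            yield init, lead, date(year + years_ahead, month_index + 1, 1)


# Values follow INIT_BEG, INIT_END and LEAD_SEQ from the configuration.
runs = list(run_times(1982, 2010, 7, [1, 2, 3, 4, 5, 6]))
print(len(runs))
print(runs[0])
print(runs[-1])
```

Listing the run times this way is a quick check that the expected number of GridStat calls matches the number of output files produced.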

METplus Configuration

[config]

# Documentation for this use case can be found at
# https://metplus.readthedocs.io/en/latest/generated/model_applications/s2s/GridStat_SeriesAnalysis_fcstNMME_obsCPC_seasonal_forecast.html

# For additional information, please see the METplus Users Guide.
# https://metplus.readthedocs.io/en/latest/Users_Guide

###
# Processes to run
# https://metplus.readthedocs.io/en/latest/Users_Guide/systemconfiguration.html#process-list
###

PROCESS_LIST = GridStat, SeriesAnalysis(climo), SeriesAnalysis(full_stats)


###
# Time Info
# LOOP_BY options are INIT, VALID, RETRO, and REALTIME
# If set to INIT or RETRO:
#   INIT_TIME_FMT, INIT_BEG, INIT_END, and INIT_INCREMENT must also be set
# If set to VALID or REALTIME:
#   VALID_TIME_FMT, VALID_BEG, VALID_END, and VALID_INCREMENT must also be set
# LEAD_SEQ is the list of forecast leads to process
# https://metplus.readthedocs.io/en/latest/Users_Guide/systemconfiguration.html#timing-control
###

LOOP_BY = INIT
INIT_TIME_FMT = %Y%m
INIT_BEG = 198207
INIT_END = 201007
INIT_INCREMENT = 1Y

LEAD_SEQ = 1m, 2m, 3m, 4m, 5m, 6m

SERIES_ANALYSIS_RUNTIME_FREQ = RUN_ONCE_PER_LEAD


###
# File I/O
# https://metplus.readthedocs.io/en/latest/Users_Guide/systemconfiguration.html#directory-and-filename-template-info
###

FCST_GRID_STAT_INPUT_DIR = {INPUT_BASE}/model_applications/s2s/NMME/hindcast/monthly
FCST_GRID_STAT_INPUT_TEMPLATE = nmme_pr_hcst_{init?fmt=%b}IC_{valid?fmt=%m}_*.nc

OBS_GRID_STAT_INPUT_DIR = {INPUT_BASE}/model_applications/s2s/NMME/obs
OBS_GRID_STAT_INPUT_TEMPLATE = obs_cpc_pp.1x1.nc

GRID_STAT_OUTPUT_DIR = {OUTPUT_BASE}/model_applications/s2s/GridStat_SeriesAnalysis_fcstNMME_obsCPC_seasonal_forecast/GridStat

BOTH_SERIES_ANALYSIS_INPUT_DIR = {GRID_STAT_OUTPUT_DIR}
BOTH_SERIES_ANALYSIS_INPUT_TEMPLATE = grid_stat_{MODEL}-hindcast_precip_vs_{OBTYPE}_IC{init?fmt=%Y%b}_V{valid?fmt=%Y%m}01_*pairs.nc

SERIES_ANALYSIS_OUTPUT_DIR = {OUTPUT_BASE}/model_applications/s2s/GridStat_SeriesAnalysis_fcstNMME_obsCPC_seasonal_forecast/SeriesAnalysis
SERIES_ANALYSIS_OUTPUT_TEMPLATE = series_analysis_{MODEL}_{OBTYPE}_stats_F{lead?fmt=%2m}_{instance?fmt=%s}.nc

[full_stats]

SERIES_ANALYSIS_CLIMO_MEAN_INPUT_DIR = {SERIES_ANALYSIS_OUTPUT_DIR}
SERIES_ANALYSIS_CLIMO_MEAN_INPUT_TEMPLATE = series_analysis_{MODEL}_{OBTYPE}_stats_F{lead?fmt=%2m}_climo.nc

[config]


###
# Field Info
# https://metplus.readthedocs.io/en/latest/Users_Guide/systemconfiguration.html#field-info
###

MODEL = NMME
OBTYPE = CPC

FCST_GRID_STAT_VAR1_NAME = pr
FCST_GRID_STAT_VAR1_LEVELS = "({valid?fmt=%Y%m01_000000},*,*)"
FCST_GRID_STAT_VAR1_THRESH = >0, >50, >100, >150, >200, >250, >300, >400, >500

OBS_GRID_STAT_VAR1_NAME = precip
OBS_GRID_STAT_VAR1_LEVELS = "({valid?fmt=%Y%m01_000000},*,*)"
OBS_GRID_STAT_VAR1_THRESH = >0, >50, >100, >150, >200, >250, >300, >400, >500

FCST_SERIES_ANALYSIS_VAR1_NAME = FCST_precip_FULL
FCST_SERIES_ANALYSIS_VAR1_LEVELS = "(*,*)"

OBS_SERIES_ANALYSIS_VAR1_NAME = OBS_precip_FULL
OBS_SERIES_ANALYSIS_VAR1_LEVELS = "(*,*)"


###
# GridStat Settings
# https://metplus.readthedocs.io/en/latest/Users_Guide/wrappers.html#gridstat
###

GRID_STAT_OUTPUT_FLAG_CTC = STAT
GRID_STAT_OUTPUT_FLAG_CNT = STAT
GRID_STAT_OUTPUT_FLAG_SL1L2 = STAT

GRID_STAT_NC_PAIRS_FLAG_APPLY_MASK = FALSE

GRID_STAT_NC_PAIRS_VAR_NAME = precip

GRID_STAT_OUTPUT_PREFIX = {MODEL}-hindcast_{CURRENT_OBS_NAME}_vs_{OBTYPE}_IC{init?fmt=%Y%b}_V{valid?fmt=%Y%m%d}

###
# SeriesAnalysis Settings
# https://metplus.readthedocs.io/en/latest/Users_Guide/wrappers.html#seriesanalysis
###

SERIES_ANALYSIS_DESC = hindcast

SERIES_ANALYSIS_CAT_THRESH = >=50, >=100, >=150, >=200, >=250, >=300, >=400, >=500

SERIES_ANALYSIS_VLD_THRESH = 0.50

SERIES_ANALYSIS_BLOCK_SIZE = 360*181

SERIES_ANALYSIS_IS_PAIRED = False

SERIES_ANALYSIS_GENERATE_PLOTS = no
SERIES_ANALYSIS_GENERATE_ANIMATIONS = no

SERIES_ANALYSIS_RUN_ONCE_PER_STORM_ID = False


SERIES_ANALYSIS_STAT_LIST = OBAR

[full_stats]

SERIES_ANALYSIS_STAT_LIST = TOTAL, FBAR, OBAR, ME, MAE, RMSE, ANOM_CORR, PR_CORR
SERIES_ANALYSIS_CTS_LIST = BASER, CSI, GSS
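The categorical statistics requested in SERIES_ANALYSIS_CTS_LIST are derived from a 2x2 contingency table built for each threshold. The sketch below uses the textbook definitions of base rate, critical success index, and Gilbert skill score. The function name and the counts are made up for illustration, and MET's own implementation may differ in details such as missing-data handling.

```python
def categorical_scores(hits, false_alarms, misses, correct_negatives):
    """Compute BASER, CSI and GSS from 2x2 contingency table counts.

    No guards are included for degenerate tables (e.g. no events at all),
    which would cause a division by zero.
    """
    total = hits + false_alarms + misses + correct_negatives
    # Base rate: fraction of cases in which the event was observed.
    base_rate = (hits + misses) / total
    # Critical success index: hits relative to all forecast or observed events.
    csi = hits / (hits + false_alarms + misses)
    # Gilbert skill score: CSI corrected for hits expected by chance.
    hits_random = (hits + false_alarms) * (hits + misses) / total
    gss = (hits - hits_random) / (hits + false_alarms + misses - hits_random)
    return {"BASER": base_rate, "CSI": csi, "GSS": gss}


# Made-up counts for a single threshold (illustration only).
scores = categorical_scores(hits=10, false_alarms=5, misses=5,
                            correct_negatives=80)
print(scores)
```

In SeriesAnalysis these scores are computed at every grid point over the series of hindcast years, once per entry in SERIES_ANALYSIS_CAT_THRESH.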

MET Configuration

METplus sets environment variables based on user settings in the METplus configuration file. See How METplus controls MET config file settings for more details.

YOU SHOULD NOT SET ANY OF THESE ENVIRONMENT VARIABLES YOURSELF! THEY WILL BE OVERWRITTEN BY METPLUS WHEN IT CALLS THE MET TOOLS!

If there is a setting in the MET configuration file that you’d like to control but that is not currently supported by METplus, please refer to: Overriding Unsupported MET config file settings

GridStatConfig_wrapped

Note

See the GridStat MET Configuration section of the User’s Guide for more information on the environment variables used in the file below:

////////////////////////////////////////////////////////////////////////////////
//
// Grid-Stat configuration file.
//
// For additional information, see the MET_BASE/config/README file.
//
////////////////////////////////////////////////////////////////////////////////

//
// Output model name to be written
//
// model =
${METPLUS_MODEL}

//
// Output description to be written
// May be set separately in each "obs.field" entry
//
// desc =
${METPLUS_DESC}

//
// Output observation type to be written
//
// obtype =
${METPLUS_OBTYPE}

////////////////////////////////////////////////////////////////////////////////

//
// Verification grid
//
// regrid = {
${METPLUS_REGRID_DICT}

////////////////////////////////////////////////////////////////////////////////

//censor_thresh =
${METPLUS_CENSOR_THRESH}
//censor_val =
${METPLUS_CENSOR_VAL}
cat_thresh  	 = [];
cnt_thresh  	 = [ NA ];
cnt_logic   	 = UNION;
wind_thresh 	 = [ NA ];
wind_logic  	 = UNION;
eclv_points      = 0.05;
//nc_pairs_var_name =
${METPLUS_NC_PAIRS_VAR_NAME}
nc_pairs_var_suffix = "";
//hss_ec_value =
${METPLUS_HSS_EC_VALUE}

rank_corr_flag   = FALSE;

//
// Forecast and observation fields to be verified
//
fcst = {
  ${METPLUS_FCST_FILE_TYPE}
  ${METPLUS_FCST_FIELD}
}
obs = {
  ${METPLUS_OBS_FILE_TYPE}
  ${METPLUS_OBS_FIELD}
}

////////////////////////////////////////////////////////////////////////////////

//
// Climatology mean data
//
//climo_mean = {
${METPLUS_CLIMO_MEAN_DICT}


//climo_stdev = {
${METPLUS_CLIMO_STDEV_DICT}

//
// May be set separately in each "obs.field" entry
//
//climo_cdf = {
${METPLUS_CLIMO_CDF_DICT}

////////////////////////////////////////////////////////////////////////////////

//
// Verification masking regions
//
// mask = {
${METPLUS_MASK_DICT}

////////////////////////////////////////////////////////////////////////////////

//
// Confidence interval settings
//
ci_alpha  = [ 0.05 ];

boot = {
   interval = PCTILE;
   rep_prop = 1.0;
   n_rep    = 0;
   rng      = "mt19937";
   seed     = "";
}

////////////////////////////////////////////////////////////////////////////////

//
// Data smoothing methods
//
//interp = {
${METPLUS_INTERP_DICT}

////////////////////////////////////////////////////////////////////////////////

//
// Neighborhood methods
//
nbrhd = {
   field      = BOTH;
   // shape =
   ${METPLUS_NBRHD_SHAPE}
   // width =
   ${METPLUS_NBRHD_WIDTH}
   // cov_thresh =
   ${METPLUS_NBRHD_COV_THRESH}
   vld_thresh = 1.0;
}

////////////////////////////////////////////////////////////////////////////////

//
// Fourier decomposition
// May be set separately in each "obs.field" entry
//
//fourier = {
${METPLUS_FOURIER_DICT}

////////////////////////////////////////////////////////////////////////////////

//
// Gradient statistics
// May be set separately in each "obs.field" entry
//
gradient = {
   dx = [ 1 ];
   dy = [ 1 ];
}

////////////////////////////////////////////////////////////////////////////////

//
// Distance Map statistics
// May be set separately in each "obs.field" entry
//
//distance_map = {
${METPLUS_DISTANCE_MAP_DICT}

////////////////////////////////////////////////////////////////////////////////

//
// Statistical output types
//
//output_flag = {
${METPLUS_OUTPUT_FLAG_DICT}

//
// NetCDF matched pairs output file
// May be set separately in each "obs.field" entry
//
// nc_pairs_flag = {
${METPLUS_NC_PAIRS_FLAG_DICT}

////////////////////////////////////////////////////////////////////////////////
// Threshold for SEEPS p1 (Probability of being dry)

//seeps_p1_thresh =
${METPLUS_SEEPS_P1_THRESH}

////////////////////////////////////////////////////////////////////////////////

//grid_weight_flag =
${METPLUS_GRID_WEIGHT_FLAG}

tmp_dir = "${MET_TMP_DIR}";

// output_prefix =
${METPLUS_OUTPUT_PREFIX}

////////////////////////////////////////////////////////////////////////////////

${METPLUS_MET_CONFIG_OVERRIDES}

SeriesAnalysisConfig_wrapped

Note

See the SeriesAnalysis MET Configuration section of the User’s Guide for more information on the environment variables used in the file below:

////////////////////////////////////////////////////////////////////////////////
//
// Series-Analysis configuration file.
//
// For additional information, see the MET_BASE/config/README file.
//
////////////////////////////////////////////////////////////////////////////////

//
// Output model name to be written
//
//model =
${METPLUS_MODEL}

//
// Output description to be written
//
//desc =
${METPLUS_DESC}

//
// Output observation type to be written
//
//obtype =
${METPLUS_OBTYPE}

////////////////////////////////////////////////////////////////////////////////

//
// Verification grid
// May be set separately in each "field" entry
//
//regrid = {
${METPLUS_REGRID_DICT}

////////////////////////////////////////////////////////////////////////////////

censor_thresh = [];
censor_val    = [];
//cat_thresh =
${METPLUS_CAT_THRESH}
cnt_thresh    = [ NA ];
cnt_logic     = UNION;

//
// Forecast and observation fields to be verified
//
fcst = {
   ${METPLUS_FCST_FILE_TYPE}
   ${METPLUS_FCST_CAT_THRESH}
   ${METPLUS_FCST_FIELD}
}
obs = {
   ${METPLUS_OBS_FILE_TYPE}
   ${METPLUS_OBS_CAT_THRESH}
   ${METPLUS_OBS_FIELD}
}

////////////////////////////////////////////////////////////////////////////////

//
// Climatology data
//
//climo_mean = {
${METPLUS_CLIMO_MEAN_DICT}


//climo_stdev = {
${METPLUS_CLIMO_STDEV_DICT}

//climo_cdf = {
${METPLUS_CLIMO_CDF_DICT}

////////////////////////////////////////////////////////////////////////////////

//
// Confidence interval settings
//
ci_alpha  = [ 0.05 ];

boot = {
   interval = PCTILE;
   rep_prop = 1.0;
   n_rep    = 0;
   rng      = "mt19937";
   seed     = "";
}

////////////////////////////////////////////////////////////////////////////////

//
// Verification masking regions
//
//mask = {
${METPLUS_MASK_DICT}

//
// Number of grid points to be processed concurrently.  Set smaller to use
// less memory but increase the number of passes through the data.
//
//block_size =
${METPLUS_BLOCK_SIZE}

//
// Ratio of valid matched pairs to compute statistics for a grid point
//
//vld_thresh =
${METPLUS_VLD_THRESH}

////////////////////////////////////////////////////////////////////////////////

//
// Statistical output types
//
//output_stats = {
${METPLUS_OUTPUT_STATS_DICT}

////////////////////////////////////////////////////////////////////////////////

//hss_ec_value =
${METPLUS_HSS_EC_VALUE}
rank_corr_flag = FALSE;

tmp_dir = "${MET_TMP_DIR}";

//version        = "V10.0";

////////////////////////////////////////////////////////////////////////////////

${METPLUS_MET_CONFIG_OVERRIDES}

Running METplus

This use case can be run two ways:

  1. Passing in GridStat_SeriesAnalysis_fcstNMME_obsCPC_seasonal_forecast.conf then a user-specific system configuration file:

    run_metplus.py -c /path/to/METplus/parm/use_cases/model_applications/s2s/GridStat_SeriesAnalysis_fcstNMME_obsCPC_seasonal_forecast.conf -c /path/to/user_system.conf
    
  2. Modifying the configurations in parm/metplus_config, then passing in GridStat_SeriesAnalysis_fcstNMME_obsCPC_seasonal_forecast.conf:

    run_metplus.py -c /path/to/METplus/parm/use_cases/model_applications/s2s/GridStat_SeriesAnalysis_fcstNMME_obsCPC_seasonal_forecast.conf
    

The former method is recommended. Whether you add them to a user-specific configuration file or modify the metplus_config files, the following variables must be set correctly:

  • INPUT_BASE - Path to the directory where sample data tarballs are unpacked (see the Datasets section to obtain tarballs). This is not required to run METplus, but it is required to run the examples in parm/use_cases.

  • OUTPUT_BASE - Path where METplus output will be written. This must be in a location where you have write permissions.

  • MET_INSTALL_DIR - Path to the location where MET is installed locally.

Example User Configuration File:

[dir]
INPUT_BASE = /path/to/sample/input/data
OUTPUT_BASE = /path/to/output/dir
MET_INSTALL_DIR = /path/to/met-X.Y

NOTE: All of these items must be found under the [dir] section.
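Before launching a run, it can be handy to confirm that the three variables are present under [dir]. The following standalone check is a sketch using Python's standard configparser module. It is not part of METplus, which performs its own configuration parsing, and the function name is hypothetical.

```python
import configparser

REQUIRED = ("INPUT_BASE", "OUTPUT_BASE", "MET_INSTALL_DIR")


def missing_dir_settings(conf_text):
    """Return the required [dir] variables that are absent or empty."""
    # interpolation=None keeps '%' characters in values from being interpreted.
    parser = configparser.ConfigParser(interpolation=None)
    parser.optionxform = str  # keep variable names case-sensitive
    parser.read_string(conf_text)
    if not parser.has_section("dir"):
        return list(REQUIRED)
    return [name for name in REQUIRED
            if not parser.get("dir", name, fallback="").strip()]


# Example text with MET_INSTALL_DIR deliberately left out.
example = """
[dir]
INPUT_BASE = /path/to/sample/input/data
OUTPUT_BASE = /path/to/output/dir
"""
print(missing_dir_settings(example))
```

To check a real file, its contents can be read with open(...).read() and passed to the function.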

Expected Output

A successful run will output the following both to the screen and to the logfile:

INFO: METplus has successfully finished running.

Refer to the value set for OUTPUT_BASE to find where the output data was generated. Output for this use case will be found in model_applications/s2s/GridStat_SeriesAnalysis_fcstNMME_obsCPC_seasonal_forecast/GridStat (relative to OUTPUT_BASE).

For each month and year there will be two files written:

* grid_stat_NMME-hindcast_precip_vs_CPC_IC{%Y%b}01_2301360000L_20081001_000000V.stat
* grid_stat_NMME-hindcast_precip_vs_CPC_IC{%Y%b}01_2301360000L_20081001_000000V_pairs.nc

Output from SeriesAnalysis will be found in model_applications/s2s/GridStat_SeriesAnalysis_fcstNMME_obsCPC_seasonal_forecast/SeriesAnalysis (relative to OUTPUT_BASE).

For each forecast lead there will be two files written, named according to SERIES_ANALYSIS_OUTPUT_TEMPLATE:

* series_analysis_NMME_CPC_stats_F{lead?fmt=%2m}_climo.nc
* series_analysis_NMME_CPC_stats_F{lead?fmt=%2m}_full_stats.nc

Keywords

Note

  • GridStatToolUseCase

  • SeriesAnalysisUseCase

  • NetCDFFileUseCase

  • LoopByMonthFeatureUseCase

  • NCAROrgUseCase

  • RuntimeFreqUseCase

  • ClimatologyUseCase

Navigate to the METplus Quick Search for Use Cases page to discover other similar use cases.
