UserScript: Python Script to compute cable transport

model_applications/marine_and_cryosphere/UserScript_fcstRTOFS_obsAOML_calcTransport.conf

Scientific Objective

The Florida Current flows northward along the eastern Florida coast and feeds to the Gulf Stream. More info can be obtained from: https://www.aoml.noaa.gov/phod/floridacurrent/index.php

This use case utilizes a Python script to calculate transport (units Sv) variations of the Florida current using a submarine cable and snapshot estimates made by shipboard instruments. The code compares the transport using RTOFS data and compare it with the AOML cable transport data and computes BIAS, RMSE, CORRELATION, and Scatter Index. The operational code utilizes 21 days of data and computes 7 day statistics. For the use case 3 days of data are utilized. The valid date is passed though an argument. The valid date is the last processed day i.e. the code grabs 3 previous days of data.

Datasets

Forecast: RTOFS u(3zuio) amd ,v(3zvio) files via Python Embedding script/file
Observations: AOML Florida Current data via Python Embedding script/file
Location: All of the input data required for this use case can be found in the met_test sample data tarball. Click here to the METplus releases page and download sample data for the appropriate release: https://github.com/dtcenter/METplus/releases
This tarball should be unpacked into the directory that you will set the value of INPUT_BASE. See Running METplus section for more information.
Data Source: NOMADS RTOFS Global + Daily mean transport (https://www.aoml.noaa.gov/phod/floridacurrent/data_access.php)+ Eightmilecable (static, provided with the use case)

External Dependencies

You will need to use a version of Python 3.6+ that has the following packages installed:

  • scikit-learn

  • pyproj

If the version of Python used to compile MET did not have these libraries at the time of compilation, you will need to add these packages or create a new Python environment with these packages.

If this is the case, you will need to set the MET_PYTHON_EXE environment variable to the path of the version of Python you want to use. If you want this version of Python to only apply to this use case, set it in the [user_env_vars] section of a METplus configuration file.:

[user_env_vars]
MET_PYTHON_EXE = /path/to/python/with/required/packages/bin/python

METplus Components

This use case utilizes the METplus UserScript wrapper to generate a command to run with Python Embedding for the specified valid time.

METplus Workflow

This use case uses UserScript. All the gridded data being pulled from the files via Python Embedding. All of the desired statistics are in the log file. It processes the following run time:

Valid: 2021-10-28

The code grabs the 20211028, 20211027, and 20211026 24 hour RTOFS files.

METplus Configuration

METplus first loads all of the configuration files found in parm/metplus_config, then it loads any configuration files passed to METplus via the command line i.e. -c parm/use_cases/model_applications/marine_and_cryosphere/UserScript_fcstRTOFS_obsAOML_calcTransport.conf

[config]

# Documentation for this use case can be found at
# https://metplus.readthedocs.io/en/latest/generated/model_applications/marine_and_cryosphere/UserScript_fcstRTOFS_obsAOML_calcTransport.html

# For additional information, please see the METplus Users Guide.
# https://metplus.readthedocs.io/en/latest/Users_Guide

###
# Processes to run
# https://metplus.readthedocs.io/en/latest/Users_Guide/systemconfiguration.html#process-list
###

PROCESS_LIST = UserScript

###
# Time Info
# LOOP_BY options are INIT, VALID, RETRO, and REALTIME
# If set to INIT or RETRO:
#   INIT_TIME_FMT, INIT_BEG, INIT_END, and INIT_INCREMENT must also be set
# If set to VALID or REALTIME:
#   VALID_TIME_FMT, VALID_BEG, VALID_END, and VALID_INCREMENT must also be set
# LEAD_SEQ is the list of forecast leads to process
# https://metplus.readthedocs.io/en/latest/Users_Guide/systemconfiguration.html#timing-control
###

LOOP_BY = VALID
VALID_TIME_FMT = %Y%m%d
VALID_BEG = 20211028
VALID_INCREMENT = 24H

LEAD_SEQ =

USER_SCRIPT_RUNTIME_FREQ = RUN_ONCE


###
# UserScript Settings
# https://metplus.readthedocs.io/en/latest/Users_Guide/wrappers.html#userscript
###

USER_SCRIPT_INPUT_TEMPLATE = {VALID_BEG}

USER_SCRIPT_OUTPUT_DIR = {OUTPUT_BASE}/model_applications/marine_and_cryosphere/calc_transport

USER_SCRIPT_COMMAND = {PARM_BASE}/use_cases/model_applications/marine_and_cryosphere/UserScript_fcstRTOFS_obsAOML_calcTransport/read_aomlcable_rtofs_transport.py {USER_SCRIPT_INPUT_TEMPLATE}


[user_env_vars]

# Calc Transport specific variables

CALC_TRANSPORT_RTOFS_DIRNAME = {INPUT_BASE}/model_applications/marine_and_cryosphere/UserScript_fcstRTOFS_obsAOML_calcTransport/RTOFS

CALC_TRANSPORT_CABLE_FILENAME = {INPUT_BASE}/model_applications/marine_and_cryosphere/UserScript_fcstRTOFS_obsAOML_calcTransport/FC_cable_transport_2021.dat

CALC_TRANSPORT_EIGHTMILE_FILENAME = {INPUT_BASE}/model_applications/marine_and_cryosphere/UserScript_fcstRTOFS_obsAOML_calcTransport/eightmilecable.dat

CALC_TRANSPORT_LEAD_TIME = 24 

# Calculate stats for number of days. The operational website uses 21 days 
# of data and then calculates 7 day stats. For the use case both of them are 3 days each. 
# The code calculates the number of subdirectories 
# under RTOFS directory, however, CALC_TRANSPORT_STATS_DAY is the number of days the statistics 
# will be calculated. 
CALC_TRANSPORT_STATS_DAY = 3

CALC_TRANSPORT_LOG_FILE = calc_transport.log

OUTPUT_DIR = {USER_SCRIPT_OUTPUT_DIR}

MET Configuration

None. All of the processing is completed in the UserScript

User Script

This use case uses one Python script to read forecast and observation data as well as processing the desired statistics.

parm/use_cases/model_applications/marine_and_cryosphere/UserScript_fcstRTOFS_obsAOML_calcTransport/read_aomlcable_rtofs_transport.py

#! /usr/bin/env python3
"""
Florida Cable Transport Class-4 Validation System
Adapted from Todd Spindler's code
"""

from netCDF4 import Dataset
import numpy as np
from pyproj import Geod
import math
from sklearn.metrics import mean_squared_error
from datetime import datetime, timedelta
import pandas as pd
import sys, os
import logging

vDate=datetime.strptime(sys.argv[1],'%Y%m%d')
rtofsdir = os.environ.get('CALC_TRANSPORT_RTOFS_DIRNAME') 
cablefile = os.environ.get('CALC_TRANSPORT_CABLE_FILENAME') 
eightmilefile = os.environ.get('CALC_TRANSPORT_EIGHTMILE_FILENAME') 

print('Starting Cable V&V at',datetime.now(),'for',vDate)



if not os.path.exists(cablefile):
        print('missing AOML Cable transport file for',vDate)

#-----------------------------------------------
# read cable transport data from AOML            
#-----------------------------------------------
    
# read the AOML dataset
names=['year','month','day','transport']
cable=pd.read_csv(cablefile,comment='%',names=names,delimiter=' ',
    skipinitialspace=True,header=None,usecols=list(range(4)))
cable['date']=pd.to_datetime(cable[['year','month','day']])
cable.index=cable.date
cable['error']=2.0
del cable['year'], cable['month'], cable['day'], cable['date']
print(cable)

#-----------------------------------------------
# full cross-section transport calculation
#-----------------------------------------------
def calc_transport(dates,fcst):
    """
    Calculate the transport of water across the Florida Straits
    This extracts the section and integrates the flow through it.
    """
    transport=[]
    fcst_str='f{:03d}'.format(fcst)
    cable_loc=np.loadtxt(eightmilefile,dtype='int',usecols=(0,1))
    eightmile_lat = 26.5167
    eightmile_lon = -78.7833%360
    wpb_lat = 26.7153425
    wpb_lon = -80.0533746%360
    cable_angle = math.atan((eightmile_lat-wpb_lat)/(eightmile_lon-wpb_lon))
    g=Geod(ellps='WGS84')

    for date in dates:
        print('DATE :', date, ' DATES :',dates)
        print('processing',date.strftime('%Y%m%d'),'fcst',fcst)
        rundate=date-timedelta(fcst/24.)  # calc rundate from fcst and date
        ufile=rtofsdir+'/'+rundate.strftime('%Y%m%d')+'/rtofs_glo_3dz_'+fcst_str+'_daily_3zuio.nc'
        vfile=rtofsdir+'/'+rundate.strftime('%Y%m%d')+'/rtofs_glo_3dz_'+fcst_str+'_daily_3zvio.nc'

        print(ufile)
        print(vfile)

        udata=Dataset(ufile)
        vdata=Dataset(vfile)

        lon=udata['Longitude'][:]
        lat=udata['Latitude'][:]
        depth=udata['Depth'][:]

        usection=np.zeros((depth.shape[0],cable_loc.shape[0]))
        vsection=np.zeros((depth.shape[0],cable_loc.shape[0]))

        udata=udata['u'][:].squeeze()
        vdata=vdata['v'][:].squeeze()

        for ncol,(row,col) in enumerate(cable_loc):
            usection[:,ncol]=udata[:,row,col].filled(fill_value=0.0)
            vsection[:,ncol]=vdata[:,row,col].filled(fill_value=0.0)

        lon=lon[cable_loc[:,0],cable_loc[:,1]]
        lat=lat[cable_loc[:,0],cable_loc[:,1]]

        # compute the distances along the track
        _,_,dist=g.inv(lon[0:-1],lat[0:-1],lon[1:],lat[1:])
        depth=np.diff(depth)
        usection=usection[:-1,:-1]
        vsection=vsection[:-1,:-1]

        dist,depth=np.meshgrid(dist,depth)
        u,v=rotate(usection,vsection,cable_angle)
        trans1=(v*dist*depth).sum()/1e6
        #print(date.strftime('%Y-%m-%d'),' transport:',transport,'Sv')
        transport.append(trans1)

    return transport

#-----------------------------------------------
# retrieve model data
#-----------------------------------------------
def get_model(dates,fcsts):

    transport={'dates':dates}


    for fcst in fcsts:
            transport[fcst]=calc_transport(dates,fcst)

    model=pd.DataFrame(transport)
    model.index=model.dates
    del model['dates']
    #del model['validDates']

    print(model)
    return model
#-----------------------------------------------
# coordinate rotation
#-----------------------------------------------
def rotate(u,v,phi):
    # phi is in radians
    u2 =  u*math.cos(phi) + v*math.sin(phi)
    v2 = -u*math.sin(phi) + v*math.cos(phi)
    return u2,v2

#-----------------------------------------------
if __name__ == "__main__":

    want_date=vDate
    DateSet=True

    fcst = int(os.environ.get('CALC_TRANSPORT_LEAD_TIME'))
    no_of_fcst_stat_days = int(os.environ.get('CALC_TRANSPORT_STATS_DAY'))

    fcsts=list(range(fcst,fcst+1,24))

    start_date=want_date
    stop_date=want_date
    cable=cable[:stop_date]

    # Count the number in the subdirs RTOFS dir
    path, dirs, files = next(os.walk(rtofsdir))
    dir_count = len(dirs)
    dir_count

    """
    Setup logging
    """
    logfile = os.environ.get('CALC_TRANSPORT_LOG_FILE')


    for end_date in pd.date_range(start_date,stop_date):
        dates=pd.date_range(end=end_date,periods=dir_count)
        model=get_model(dates,fcsts)

    both=pd.merge(cable,model,left_index=True,right_index=True,how='inner')
    print("both :", both)
    both=both[both.index.max()-timedelta(no_of_fcst_stat_days):]
     
    diff=both[fcst] - both.transport
    bias=diff.mean()
    rmse=mean_squared_error(both.transport,both[fcst])**0.5
    if both[fcst].mean() != 0.0:
        scatter_index=100.0*(((diff**2).mean())**0.5 - bias**2)/both.transport.mean()
    else:
        scatter_index=np.nan

    corr=both[fcst].corr(both.transport)

#    print("BIAS :",bias, "RMSE :",rmse, "CORR :",corr, "SCATTER INDEX :",scatter_index)

    outdir = os.environ.get('OUTPUT_DIR')

    if not os.path.exists(outdir):
        print(f"Creating output directory: {outdir}")
        os.makedirs(outdir)

    expected_file = os.path.join(outdir,logfile)
    print(expected_file)

    with open(expected_file, 'w') as f:
       print("BIAS :",bias, "RMSE :",rmse, "CORR :",corr, "SCATTER INDEX :",scatter_index, file=f)

Running METplus

This use case can be run two ways:

  1. Passing in UserScript_fcstRTOFS_obsAOML_calcTransport.conf then a user-specific system configuration file:

    run_metplus.py /path/to/METplus/parm/use_cases/model_applications/marine_and_cryosphere/UserScript_fcstRTOFS_obsAOML_calcTransport.conf /path/to/user_system.conf
    
  2. Modifying the configurations in parm/metplus_config, then passing in UserScript_fcstRTOFS_obsAOML_calcTransport.conf:

    run_metplus.py /path/to/METplus/parm/use_cases/model_applications/marine_and_cryosphere/UserScript_fcstRTOFS_obsAOML_calcTransport.conf
    

The former method is recommended. Whether you add them to a user-specific configuration file or modify the metplus_config files, the following variables must be set correctly:

  • INPUT_BASE - Path to directory where sample data tarballs are unpacked (See Datasets section to obtain tarballs). This is not required to run METplus, but it is required to run the examples in parm/use_cases

  • OUTPUT_BASE - Path where METplus output will be written. This must be in a location where you have write permissions

  • MET_INSTALL_DIR - Path to location where MET is installed locally

Example User Configuration File:

[config]
INPUT_BASE = /path/to/sample/input/data
OUTPUT_BASE = /path/to/output/dir
MET_INSTALL_DIR = /path/to/met-X.Y

Expected Output

A successful run will output the following both to the screen and to the logfile:

INFO: METplus has successfully finished running.

Refer to the value set for OUTPUT_BASE to find where the output data was generated. Output for use case will be found in calc_transport (relative to OUTPUT_BASE) and will contain the following files:

  • calc_transport.log

Keywords

Note

  • UserScriptUseCase

  • PythonEmbeddingFileUseCase

  • MarineAndCryosphereAppUseCase

Navigate to the METplus Quick Search for Use Cases page to discover other similar use cases.

sphinx_gallery_thumbnail_path = ‘_static/marine_and_cryosphere-UserScript_fcstRTOFS_obsAOML_calcTransport.png’

Total running time of the script: (0 minutes 0.000 seconds)

Gallery generated by Sphinx-Gallery