4. Using METdbLoad
The METdbLoad module provides support for inserting MET output data (as .stat files) into a relational database (mysql, mariadb, or aurora).
4.1. Create Database
Before using the METdbLoad module, the database must exist and have the proper permissions (i.e. grant privileges to insert, delete, update, and index).
Data must be loaded into a database which has the prefix 'mv_', (e.g. mv_met_data). This database must be structured with the METviewer mv_mysql.sql schema:
cd [install]/metviewer
mysql -u[db_username] -p[db_password] -e'create database [db_name];'
mysql -u[db_username] -p[db_password] [db_name] < sql/mv_mysql.sql
Replace db_usenamer with the username
Replace db_passwd with the password
Replace db_name with the appropriate database name that begins with mv (e.g. mv_xyz )
Note
Use the mysql command line client (https://dev.mysql.com/doc/refman/en/mysql.html) to run the above instructions.
The mv_mysql.sql file is located in the METdataio/METdbLoad/sql/ directory.
Create MET linetype tables
mysql -udbuser -pdbpasswd dbtable_name < /path-to-METdataio/METdbLoad/sql/mv_mysql.sql
Replace dbuser with the username
Replace dbpasswd with the password
Replace dbtable_name with the appropriate database name that begins with mv (e.g. mv_xyz )
Replace path-to-METdataio with the path where the METdataio source code is saved
Note the use of the redirection symbol ‘<’ in the command
4.2. Create an XML Load Specification Document
The XML load specification document contains information about the MET data, database connection information, and the location of data to be loaded. This is used by the METdbLoad script met_db_load.py to load the MET data. The XML load specification is validated against a schema to check that the file is valid. This validation is necessary to prevent extremely large payloads or recursive payloads that can compromise the loading of data. The elements in the XML load specification file must adhere to the order specified by the XML schema and conform to size and number of element limitations. Click to expand the example XML load specifications and the XML validation schema below.
An example XML specification file that is valid
For defining data organized by dates using the folder_tmpl element:
<load_spec>
<connection>
<host>yourhost:1234</host>
<database>your_database_name</database>
<user>your_db_username</user>
<password>your_password</password>
</connection>
<date_list name="folder_dates">
<start>2022050100</start>
<end>2022051200</end>
<inc>86400</inc>
<format>yyyyMMddHH</format>
</date_list>
<date_list name="valid_dates">
<start>2022050100</start>
<end>2022051200</end>
<inc>0600</inc>
<format>yyyyMMddHH</format>
</date_list>
<verbose>false</verbose>
<insert_size>1</insert_size>
<stat_header_db_check>False</stat_header_db_check>
<mode_header_db_check>false</mode_header_db_check>
<drop_indexes>FALSE</drop_indexes>
<apply_indexes>true</apply_indexes>
<load_stat>True</load_stat>
<load_mode>false</load_mode>
<load_mpr>false</load_mpr>
<group>Regional Ensemble</group>
<folder_tmpl>
/var/autofs/mnt/hostmachine/projects/RRFS/prototype/met_out/{config}/{fcst_init}/{mem}/{valid_times}/metprd/{met_out}/
</folder_tmpl>
<load_val>
<field name="config">
<val>HREF_lag_offset</val>
<val>RTPS</val>
</field>
<field name="fcst_init">
<date_list name="folder_dates"/>
</field>
<field name="mem">
<val>mem01</val>
<val>mem02</val>
<val>mem03</val>
<val>mem04</val>
<val>mem05</val>
<val>mem06</val>
<val>mem07</val>
<val>mem08</val>
<val>mem09</val>
<val>mem10</val>
</field>
<field name="valid_times">
<date_list name="valid_dates"/>
</field>
<field name="met_out">
<val>grid_stat_cmn</val>
<val>point_stat_cmn</val>
</field>
</load_val>
<line_type>
<val>CNT</val>
<val>ORANK</val>
<val>SEEPS_MPR</val>
<val>ME</val>
</line_type>
<load_xml>true</load_xml>
<load_note>Load HREF and RTPS data for Spring 2022.</load_note>
</load_spec>
Another example XML specification file that is valid
For specifying a list of input data files using the load_files element:
<load_spec>
<connection>
<host>yourhost:1234</host>
<database>your_database_name</database>
<user>your_db_username</user>
<password>your_password</password>
</connection>
<verbose>true</verbose>
<insert_size>1</insert_size>
<stat_header_db_check>False</stat_header_db_check>
<mode_header_db_check>false</mode_header_db_check>
<drop_indexes>FALSE</drop_indexes>
<apply_indexes>true</apply_indexes>
<load_stat>True</load_stat>
<load_mode>false</load_mode>
<load_mpr>false</load_mpr>
<group>Regional Ensemble</group>
<load_files>
<file>/met_out/mode/mode_MASK_POLY_300000L_20120410_180000V_060000A_cts.txt</file>
<file>/met_out/rhist/ensemble_stat_RRFS_GEFS_GF.SPP.SPPT_RETOP_MRMS_20220507_120000V.stat</file>
</load_files>
<load_xml>true</load_xml>
<load_note>Load MODE and RHIST for load_files.</load_note>
</load_spec>
The XML schema that is used to validate the XML specification file
<xs:schema attributeFormDefault="unqualified" elementFormDefault="qualified"
xmlns:xs="http://www.w3.org/2001/XMLSchema">
<!-- Define special types used to impose size limitations on payload-->
<xs:simpleType name="inserttype">
<!-- only insert size of 1 -->
<xs:restriction base="xs:positiveInteger">
<xs:enumeration value="1"/>
</xs:restriction>
</xs:simpleType>
<xs:simpleType name="truefalsetype">
<!--Checking for true or false values only (allow for case-insensitive) -->
<xs:restriction base="xs:string">
<xs:pattern value="((t|T)(r|R)(u|U)(e|E))|((f|F)(a|A)(l|L)(s|S)(e|E))"/>
</xs:restriction>
</xs:simpleType>
<xs:simpleType name="usernametype">
<!--Limit the username to 32 characters (mysql maximum) -->
<xs:restriction base="xs:string">
<xs:pattern value="[\s\-_a-zA-z0-9]+"/>
<xs:minLength value="3"/>
<xs:maxLength value="32"/>
</xs:restriction>
</xs:simpleType>
<xs:simpleType name="hostnametype">
<!--Limit the hostname to 60 characters (mysql maximum) add 7 for port number -->
<xs:restriction base="xs:string">
<xs:pattern value="([.\-_a-zA-z0-9]+:\d+)"/>
<xs:minLength value="3"/>
<xs:maxLength value="67"/>
</xs:restriction>
</xs:simpleType>
<xs:simpleType name="dbnametype">
<!--Limit the database name to max 124 characters based on Stack Overflow -->
<xs:restriction base="xs:string">
<xs:pattern value="[\-_a-zA-z0-9]+"/>
<xs:minLength value="3"/>
<xs:maxLength value="124"/>
</xs:restriction>
</xs:simpleType>
<xs:simpleType name="pwdtype">
<!--Limit the password length -->
<xs:restriction base="xs:string">
<xs:minLength value="3"/>
<xs:maxLength value="30"/>
</xs:restriction>
</xs:simpleType>
<xs:simpleType name="foldertmpl">
<!--Checking for folder template name, limit to max 4096 characters -->
<xs:restriction base="xs:string">
<!-- Allow _, #, -, / in path -->
<!-- https://unix.stackexchange.com/questions/32795/what-is-the-maximum-allowed-filename-and-folder-size-with-ecryptfs -->
<xs:pattern value=".+"/>
<xs:minLength value="3"/>
<xs:maxLength value="4096"/>
</xs:restriction>
</xs:simpleType>
<xs:simpleType name="mgmtsystype">
<!--Checking for mariadb|aurora|mysql values in management_system-->
<xs:restriction base="xs:string">
<xs:enumeration value="mysql"/>
<xs:enumeration value="aurora"/>
<xs:enumeration value="mariadb"/>
</xs:restriction>
</xs:simpleType>
<xs:simpleType name="limitedstringtype">
<!--Limit the database group and description -->
<xs:restriction base="xs:string">
<xs:pattern value="[\s\-._,a-zA-z0-9]+"/>
<xs:minLength value="1"/>
<xs:maxLength value="300"/>
</xs:restriction>
</xs:simpleType>
<xs:simpleType name="linetype">
<xs:restriction base="xs:string">
<xs:minLength value="2"/>
<xs:maxLength value="25"/>
<xs:pattern value="[_a-zA-Z0-9]+"/>
</xs:restriction>
</xs:simpleType>
<!-- COMPLEX TYPES -->
<xs:element name="load_spec">
<xs:complexType>
<xs:sequence>
<xs:element name="connection">
<xs:complexType>
<xs:sequence>
<xs:element type="mgmtsystype" name="management_system" minOccurs="0"/>
<xs:element type="hostnametype" name="host"/>
<xs:element type="dbnametype" name="database"/>
<xs:element type="usernametype" name="user"/>
<xs:element type="pwdtype" name="password"/>
<xs:element type="truefalsetype" name="local_infile" minOccurs="0"/>
</xs:sequence>
</xs:complexType>
</xs:element>
<xs:element name="date_list" minOccurs="0" maxOccurs="5">
<xs:complexType>
<xs:sequence>
<xs:element type="xs:int" name="start"/>
<xs:element type="xs:int" name="end"/>
<xs:element type="xs:int" name="inc"/>
<xs:element type="xs:string" name="format"/>
</xs:sequence>
<xs:attribute type="xs:string" name="name"/>
</xs:complexType>
</xs:element>
<!-- Header and other flags -->
<xs:element type="truefalsetype" name="verbose"/>
<xs:element type="xs:positiveInteger" name="insert_size"/>
<xs:element type="truefalsetype" name="stat_header_db_check" minOccurs="0"/>
<xs:element type="truefalsetype" name="mode_header_db_check" minOccurs="0"/>
<xs:element type="truefalsetype" name="mtd_header_db_check" minOccurs="0"/>
<xs:element type="truefalsetype" name="drop_indexes" minOccurs="0"/>
<xs:element type="truefalsetype" name="apply_indexes" minOccurs="0"/>
<xs:element type="truefalsetype" name="load_stat" minOccurs="0"/>
<xs:element type="truefalsetype" name="load_mode" minOccurs="0"/>
<xs:element type="truefalsetype" name="load_mtd" minOccurs="0"/>
<xs:element type="truefalsetype" name="load_mpr" minOccurs="0"/>
<xs:element type="truefalsetype" name="load_orank" minOccurs="0"/>
<xs:element type="truefalsetype" name="force_dup_file" minOccurs="0"/>
<xs:element type="limitedstringtype" name="group" minOccurs="0"/>
<xs:element type="limitedstringtype" name="description" minOccurs="0"/>
<xs:element name="load_files" minOccurs="0">
<xs:complexType>
<xs:sequence>
<xs:element type="xs:string" name="file" minOccurs="1" maxOccurs="200"/>
</xs:sequence>
</xs:complexType>
</xs:element>
<xs:element type="foldertmpl" name="folder_tmpl" minOccurs="0"/>
<xs:element name="load_val" minOccurs="0">
<xs:complexType>
<xs:sequence>
<xs:element name="field" minOccurs="0" maxOccurs="20">
<xs:complexType>
<xs:sequence>
<xs:element name="date_list" minOccurs="0" maxOccurs="5">
<xs:complexType>
<xs:simpleContent>
<xs:extension base="xs:string">
<xs:attribute type="xs:string" name="name"/>
</xs:extension>
</xs:simpleContent>
</xs:complexType>
</xs:element>
<xs:element type="xs:string" name="val" minOccurs="0" maxOccurs="100"/>
</xs:sequence>
<xs:attribute type="xs:string" name="name"/>
</xs:complexType>
</xs:element>
</xs:sequence>
</xs:complexType>
</xs:element>
<xs:element name="line_type" minOccurs="0" maxOccurs="1">
<xs:complexType>
<xs:sequence>
<xs:element type="linetype" name="val" maxOccurs="800" minOccurs="0"/>
</xs:sequence>
</xs:complexType>
</xs:element>
<xs:element type="xs:string" name="load_xml" minOccurs="0"/>
<xs:element type="xs:string" name="load_note" minOccurs="0"/>
</xs:sequence>
</xs:complexType>
</xs:element>
</xs:schema>
4.2.1. Create the XML Specification File
The XML load specification file contains the database connection information, data file location, and other instructions for loading the data into the database.
Create the XML load specification file by copying the example specification file METdataio/METdbLoad/test/Examples/example_load_specification.xml file to the workspace location.
Do not save this XML specification file where it can be read by anyone who should not have access to this information, as this file contains the database password.
cp $METDATAIO_HOME/METdbLoad/test/Examples/example_load_specification.xml path-to-your-dir/load_specification.xml
$METDATAIO_HOME is the path to the location of the cloned or forked METdataio source code.
Replace the path-to-your-dir with the actual path to where this file will be saved.
Change directory to the location where the example_load_specification.xml file was copied.
cd path-to-your-dir/load_specification.xml
Replace path-to-your-dir with the full path where the XML specification file will be saved.
Make the necessary edits to the required elements and delete any optional, unused/irrelevant elements, based on the explanation below (click to expand). Remember to update the username and password that is applicable to the database.
Note
The order of the elements in the XML load specification file is crucial. DO NOT modify the order of the described elements (below).
The following is an explanation of the required and optional elements and any limitations
These are element names. The XML angle brackets (<>) as seen in the XML load specification file are omitted for simplicity. Indentation is used to indicate hierarchical relationships between elements.
load_spec
mandatory
top-level tag/element
container for other elements that define connection information, flags, data input, etc.
The following elements pertain to logging into the database.
connection
mandatory
tag for connection information
management_system
optional
indicates which database is in use
recognized/expected values are one of the following:
aurora
mysql
mariadb
delete this element if not using
host
mandatory
name of host/machine where database is installed
format is hostname:port number
minimum number of characters is 3
maximum number of characters is 67
allowable characters (combinations of any of these):
upper and lower alphabetical characters (English)
digits 0-9
., -, _ (period, dash, underscore)
database
mandatory
name of the database
maximum number of characters for database name is 124
- allowable characters (combination of any of these):
_,- (underscore, dash)
upper and lower case alphabetical characters (English)
digits 0-9
user
mandatory
user name
minimum number of characters is 3
maximum number of characters is 32
allowable characters (combination of any of these):
upper and lower case alphabetical characters (English)
digits 0-9
_,- (underscore, dash)
password
mandatory
the password to access the database
minimum number of characters is 3
maximum number of characters is 30
all characters are allowed
local_infile
optional
argument passed into 3rd party Python library pymysql
for establishing a connection to a MySQL server
indicate whether the input file is local
default is False
enables use of the LOAD DATA LOCAL command
Accepted value: - Boolean value: True or False
True if loading local data
False otherwise
delete this element if loading of local data is not needed
METdataio sets default to False if this element is absent
The following elements are used to define the format of multiple input data directories that are (optionally) organized by datetime.
date_list
optional
for describing data organized in datetime subdirectories
omit date_list entries if data resides in a singular directory
multiple date_list elements are allowed
maximum number of date_lists is 5
differentiate different date_list definitions by the name attribute (i.e. name=)
Example:
/var/autofs/mnt/hostmachine/projects/RRFS/prototype/met_out/{config}/{fcst_init}/{mem}/{valid_times}/metprd/{met_out}
the fcst_init and valid_times subdirectories are based on datetime
assign the fcst_init subdirectory to a descriptively named date_list attribute:
e.g. <date_list name=”folder_dates”>
this attribute name will be used in the load_val element within the folder_tmpl element to describe the {fcst_init} subdirectory template
assign the valid_times subdirectory to a descriptively named date_list attribute:
e.g. <date_list name=”valid_dates”>
this attribute name will be used in the load_val element within the folder_tmpl element to describe the {valid_times} subdirectory template
start
mandatory if date_list is being used
start datetime
end
mandatory if date_list is being used
end datetime
inc
mandatory if date_list is being used
increment/step size between start and end time
Example, if 6-hour increment:
set inc to 0600
<inc>0600</inc>
format
mandatory if date_list is being used
format of the datetime
- For example, to specify 4 digit year, 2 digit month, 2 digit day, and 2 digit hour:
<format>yyyyMMddHH</format>
The following elements define various flags
verbose
mandatory
indicates the desired volume of output from the load module
TRUE resulting in more information
FALSE resulting in less information
insert_size
mandatory
An integer indicating the number of MET output file rows inserted with each INSERT statement
This value is most often 1
stat_header_db_check
optional
indicate whether a database query check for stat header information should be performed
True or False (case insensitive)
WARNING enabling this feature (i.e. set to True) could significantly increase load time
mode_header_db_check
optional
indicate whether a database query check for the MODE header information should be performed
True or False (case insensitive)
WARNING enabling this feature (i.e. set to True) could significantly increase load time
mtd_header_db_check
optional
indicate whether a database query check for the MODE TD header information should be performed
True or False (case insensitive)
WARNING enabling this feature (i.e. set to True) could significantly increase load time
drop_indexes
optional
indicate whether to drop database indexes before loading new data
True or False (case insensitive)
apply_indexes
optional
indicate whether to apply database indexes
True or False (case insensitive)
load_stat
optional
indicate whether or not to load STAT data
True or False (case insensitive)
load_mode
optional
indicate whether or not to load MODE data
True or False (case insensitive)
load_mtd
optional
indicate whether or not to load MODE TD (MODE Time Domain) data
True or False (case insensitive)
load_mpr
optional
indicate whether or not to load MPR (matched pair) data
True or False (case insensitive)
load_orank
optional
indicate whether or not to load ORANK (observed rank) data
True or False (case insensitive)
force_dup_file
optional
indicate whether or not to force load paths/files that already exist
True or False (case insensitive)
The following elements indicate which group the database should be assigned and a description.
group
optional
the name of the database group (databases are grouped in METviewer: e.g. Testing)
if undefined, the database will be placed under the NO GROUP group
minimum number of characters is 1
maximum number of characters is 300
acceptable characters (English), any combination:
upper and/or lower case alphabetic characters
any digits 0-9
_, . , - (underscore, period, dash)
description
optional
description of the database
minimum number of characters is 1
maximum number of characters is 300
acceptable characters (English), any combination:
upper and/or lower case alphabetic characters
any digits 0-9
_, . , - (underscore, period, dash)
Determine whether to use folder_tmpl or load_files to define where the input data resides (descriptions below):
The following defines the location of the input data to be loaded into the database based on data organized by datetime (and/or any other criteria).
folder_tmpl
for data in subdirectories that are datetimes OR data that resides under one directory
only one folder template element is permitted (i.e. only one <folder_tmpl> … </folder_tmpl> )
NOTE the date_list element MUST BE DEFINED (see above in the date_list description) if any subdirectories are based on datetime
Specify the directory where the data is located in one of the following methods:
Using value templates for directories:
Example:
/var/autofs/mnt/hostmachine/projects/RRFS/prototype/met_out/{config}/{fcst_init}/{mem}/{valid_times}/metprd/{met_out}
data is organized into various directories based on datetime, and other criteria
use { } around “variable” names (in XML, these indicate attribute value templates)
config, fcst_init, mem, valid_times, and met_out are attribute value template values that must be defined under the load_val element (for more details, refer to the load_val description below)
Specify a single directory where all data reside:
Example:
/var/autofs/mnt/hostmachine/projects/RRFS/prototype/met_out/mem00/metprd/all_runs
all datafiles are located under this directory (indicate the full path)
load_val
optional if folder_tmpl specifies a single directory where all data resides
- mandatory if folder_tmpl has datetime subdirectories
field elements correspond to each attribute value template (i.e. variable names enclosed in {})
field
mandatory if folder_tmpl has subdirectories that are datetimes
each field element defines the attribute value template in the directory structure (i.e. the variable inside the {}) - val elements that can specify more subdirectories
optional
necessary when specifying subdirectories that are not datetimes (e.g. /path-to/model_A/air_quality/)
maximum number of val elements: 100
date_list elements for subdirectories that are datetimes
optional
mandatory when subdirectories are datetimes (e.g. /path-to/20240101/model/)
maximum number of date_list elements: 5
For this folder_tmpl example, which has datetime subdirectories:
/var/autofs/mnt/hostmachine/projects/RRFS/prototype/met_out/{config}/{fcst_init}/{mem}/{valid_times}/metprd/{met_out}
The following are the name attributes for the field element for the above example:
config
corresponds to the {config} template:
<field name=”config”>
val
for defining non-datetime subdirectories
maximum number of val elements is 100
- e.g.:
<field name=”config”>
<val>HREF_lag_offset</val>
<val>RTPS</val>
</field>
fcst_init
corresponds to the {fcst_init} template:
<field name=”fcst_init”>
date_list
the name attribute corresponds to one of the date_list attribute names
<field name=”fcst_init”>
<date_list name=”folder_dates”/>
</field>
The “folder_dates” name attribute was assigned in the date_list portion of the XML load specification file
<date_list name=”folder_dates”>
<start>2022050100</start>
<end>2022051200</end>
<inc>86400</inc>
<format>yyyyMMddHH</format>
</date_list>
mem
corresponds to the {mem} template:
<field name=”mem”>
val
for defining non-datetime subdirectories
maximum number of val elements is 100
e.g.:
<field name=”mem”>
<val>mem01</val>
<val>mem02</val>
<val>mem03</val>
<val>mem04</val>
<val>mem05</val>
<val>mem06</val>
<val>mem07</val>
<val>mem08</val>
<val>mem09</val>
<val>mem10</val>
</field>
valid_times
corresponds to the {valid_times} template:
<field name=”valid_times”>
date_list
the name attribute corresponds to one of the date_list attribute names
<field name=”valid_times”>
<date_list name=”valid_dates”>
</field>
the “valid_dates” name attribute value was assigned in the date_list portion of the XML load specification file
<date_list name=”valid_dates”>
<start>2022050100</start>
<end>2022051200</end>
<inc>0600</inc>
<format>yyyyMMddHH</format>
</date_list>
met_out
corresponds to the {met_out} template
val
for defining non-datetime subdirectories
maximum number of val elements is 100
e.g.:
<field name=”met_out”>
<val>grid_stat_cmn</val>
<val>point_stat_cmn</val>
</field>
For this folder_tmpl example, which has all the data under one directory (no datetime subdirectories):
/var/autofs/mnt/hostmachine/projects/RRFS/prototype/met_out/metprd/point_stat
The following defines the location of specific input data files to be loaded into the database.
load_files
for specifying the location of individual data files that are in different directories
file
mandatory for defining location of specific data files
maximum number of data files: 200
Example:
<load_files>
<file>/met_out/mode/mode_MASK_POLY_300000L_20120410_180000V_060000A_cts.txt</file>
<file>/met_out/rhist/ensemble_stat_RRFS_GEFS_GF.SPP.SPPT_RETOP_MRMS_20220507_120000V.stat</file>
</load_files>
The following describe the linetypes to load.
line_type
optional
which MET output linetypes to load
if omitted, then all linetypes will be loaded
maximum number of line_type elements: 1
val element defines which MET output linetypes to load
val
the MET output linetype
linetype name (refer to the MET User’s Guide for a complete list of linetypes)
Example:
<line_type>
<val>ECNT</val>
<val>VL1L2</val>
<val>SAL1L2</val>
</line_type>
The following option allows one to indicate whether to save the XML commands into the database.
load_xml
optional
Option to save the XML into the database
Only takes effect when the load_note element is present
Acceptable values: TRUE or FALSE (case-insensitive)
Default value: TRUE
Example:
<load_xml>true</load_xml>
The following allows one to create a note into the instance_info database table.
load_note
optional
Add a descriptive “note” into the database
Example:
<load_note>Load HREF and RTPS data from Spring 2022</load_note>
4.2.2. Load Data
Now the MET data can be loaded in the database using the met_db_load.py script in the path-to-METdataio-source/METdbLoad/ush directory. The path-to-METdataio-source is the directory where the METdataio source code is saved.
Note
Only data files with the .stat extension will be loaded
The usage statement for met_db_load.py:
INFO:root:--- *** --- Start METdbLoad --- *** ---
usage: python met_db_load.py [-h] [-index] xmlfile [tmpdir [tmpdir ...]]
positional arguments:
xmlfile Please provide required xml load_spec filename
tmpdir Optional - when different directory wanted for tmp file
optional arguments:
-h, --help show this help message and exit
-index Only process index, do not load data
cd /path-to-METdataio-source/METdataio/METdbLoad/ush
python met_db_load.py /path-to/load_specification.xml
If logging is desired, redirect output to a log file (via &> command):
python met_db_load.py /path-to/load_specification.xml &> log/your_logname.log &
Replace path-to-METdataio-source to the path to the location of the cloned or forked METdataio source code.
Replace the path-to with the location where the load_specification.xml XML load specification file was saved.
Refer to the section Create the XML Specification File and expand the drop-down instructions “The following is an explanation of the required and optional elements and any limitations” for details on what is expected in the XML load specification file.
4.3. Troubleshooting
Error: |
ERROR: Caught class com.mysql.jdbc.exceptions.MySQLIntegrityConstraintViolationException: Duplicate entry ‘CT07-NMM-LIN-R2-0-2005-07-15 12:00:00-2005-07-15 12:00:00-0-2005’ for key 2 |
Solution: |
This error is caused by trying to insert a stat_header record into the database when an identical one already exists. If identical stat_header information is present in more than one stat file, set the <stat_header_db_check> value to true. This setting will reduce performance, because the stat_header table is checked for duplicate stat_header each time a row is inserted. However, if a stat_header row already exists in the table with the insert information, then the existing record will be used instead of trying to insert a duplicate. |
Error: |
ERROR:root: (1049, “Unknown database ‘mv_test’”) in run_sql Error when connecting to database |
Solution: |
This error is caused when attempting to load data into a database that does not exist. You will need to create the database, set up the appropriate privileges as outlined above, and load the schema using the mv_mysql.sql file. |
Error: |
/full-path-to/xyz.xml is not valid and may contain a recursive payload or an excessively large payload |
Solution: |
This error is typically encountered when one of the following conditions exist as a result of failing the XML validation step:
|