Configuration files
There are two required configuration files for processing data: the global attributes file, which describes attributes that apply to the mooring, and the instrument configuration file, which describes attributes that apply to an instrument on a mooring. Contents of both files will be included as attributes in both the xarray Dataset and the netCDF files.
A note on time and time zones
Time is always in Coordinated Universal Time (UTC).
Transitioning from EPIC to CF Conventions
Historically, data have been released according to NOAA PMEL/EPIC conventions. Today, CF Conventions are used much more frequently, and stglib supports only CF Conventions. Specify the conventions with the Conventions keyword in either the global attributes file or the instrument configuration file.
Setting CF in global attributes
Conventions; CF-1.8
Setting CF in the instrument configuration file
Conventions: 'CF-1.8'
Specifying CF-1.8 or a later release of the standard will enable straight-to-CF processing.
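As a rough illustration of what "CF-1.8 or a later release" means, the sketch below parses a Conventions string and compares the version numerically. The helper names cf_version and uses_cf are hypothetical and are not part of stglib; how stglib itself inspects the attribute is not shown here.

```python
import re


def cf_version(conventions):
    """Return (major, minor) from a string such as 'CF-1.8', or None if absent."""
    match = re.search(r"CF-(\d+)\.(\d+)", conventions)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def uses_cf(conventions, minimum=(1, 8)):
    """True if the string names a CF release at or above `minimum` (hypothetical check)."""
    version = cf_version(conventions)
    return version is not None and version >= minimum


print(uses_cf("CF-1.8"))     # True
print(uses_cf("CF-1.10"))    # True, because (1, 10) >= (1, 8)
print(uses_cf("PMEL/EPIC"))  # False
```

Comparing the version as a tuple of integers avoids the pitfall of comparing strings, where 'CF-1.10' would sort before 'CF-1.8'.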
Global attributes configuration file
This file describes attributes that apply to the mooring. It uses an unusual format: one attribute per line, with the attribute name and its value separated by a semicolon, as shown in the example below.
SciPi; N. Ganju
PROJECT; USGS Coastal and Marine Geology Program
EXPERIMENT; Grand Bay
DESCRIPTION; Site GB1, Heron Bay
DATA_SUBTYPE; MOORED
DATA_ORIGIN; USGS WHCMSC Sed Trans Group
COORD_SYSTEM; GEOGRAPHIC
Conventions; PMEL/EPIC
MOORING; 1076
WATER_DEPTH; 1.55
WATER_DEPTH_NOTE; (meters), nominal depth
latitude; 30.37876
longitude; -88.38794
magnetic_variation; -1.88
Deployment_date; 2016-08-04 15:41
Recovery_date; 2016-10-19 20:10
DATA_CMNT;
platform_type; FG Lander
DRIFTER; 0
POS_CONST; 0
DEPTH_CONST; 0
WATER_MASS; Grand Bay, AL/MS
VAR_FILL; 1.e+35
institution; United States Geological Survey, Woods Hole Coastal and Marine Science Center
institution_url; https://woodshole.er.usgs.gov
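To make the format concrete, here is a minimal sketch of how a file in this layout could be read into a dictionary. The function parse_global_atts is hypothetical and is not stglib's own reader. It assumes the name ends at the first semicolon and keeps every value as a string, leaving any numeric conversion to the caller.

```python
def parse_global_atts(text):
    """Parse 'name; value' lines into a dict of strings (illustrative only)."""
    atts = {}
    for line in text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        # Split at the first semicolon only, so values may themselves contain ';'
        name, _, value = line.partition(";")
        atts[name.strip()] = value.strip()
    return atts


example = """\
MOORING; 1076
WATER_DEPTH; 1.55
WATER_DEPTH_NOTE; (meters), nominal depth
Deployment_date; 2016-08-04 15:41
DATA_CMNT;
"""

atts = parse_global_atts(example)
print(atts["WATER_DEPTH_NOTE"])  # (meters), nominal depth
print(repr(atts["DATA_CMNT"]))   # ''
```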
Instrument configuration file
This file is instrument-specific and is YAML formatted. A few examples are given below.
Note
Although YAML supports boolean values, netCDF does not support them as attributes. Because stglib saves the values specified in the instrument configuration file as netCDF attributes, you must enclose values potentially interpreted as boolean (such as true or false) in quotation marks in the YAML file.
Options common to most (all?) instrument config files:
Conventions
: version of the CF Conventions, presently 'CF-1.8'
basefile
: the input filename without extension
filename
: output filename, to which -raw.cdf, -a.nc, etc. will be appended
ClockError
: number, in seconds, negative is slow. Applies a simple offset for times. Useful if the instrument was deployed in an incorrect time zone.
ClockDrift
: number, in seconds, negative is slow. Linearly interpolates times for when the instrument clock has drifted.
initial_instrument_height
: elevation of instrument in meters
initial_instrument_height_note
P_1ac_note
: a note on the atmospheric pressure source used
zeroed_pressure
: a note detailing whether the pressure sensor was zeroed before deployment, and other pertinent details such as date and time of zeroing.
good_dates
: a list of dates to clip data by instead of the default Deployment_date and Recovery_date. Example: good_dates: ['2021-01-22 18:32', '2021-04-13 19:27'] # first burst looked suspect
Multiple date ranges can also be used. Example: good_dates: ['2021-01-22 18:32', '2021-02-28 23:59', '2021-04-01 00:00', '2021-04-13 19:27'] # the month of March was bad
good_ens
: a list of good indices (based on the raw file, zero-based) to clip the data by. Example: good_ens: [10, 500]. To specify multiple good ranges, add additional pairs of indices: good_ens: [10, 500, 560, 600] will clip the data to samples 10-500 and 560-600 in the final file.
vert_dim
: user-specified coordinate variable for the vertical dimension of data variables with a non-singular vertical dimension (default = ‘z’)
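The pairing of start and end values in good_dates can be sketched as follows. This is only an illustration: the function in_good_dates is hypothetical, and treating both endpoints as inclusive is an assumption rather than documented stglib behavior.

```python
from datetime import datetime


def in_good_dates(times, good_dates, fmt="%Y-%m-%d %H:%M"):
    """Keep the times that fall inside any (start, end) pair of good_dates."""
    if len(good_dates) % 2 != 0:
        raise ValueError("good_dates must contain start/end pairs")
    bounds = [datetime.strptime(d, fmt) for d in good_dates]
    # Consecutive entries form the ranges: (0, 1), (2, 3), ...
    ranges = list(zip(bounds[0::2], bounds[1::2]))
    # Assumption: endpoints are inclusive
    return [t for t in times if any(start <= t <= end for start, end in ranges)]


good_dates = ['2021-01-22 18:32', '2021-02-28 23:59',
              '2021-04-01 00:00', '2021-04-13 19:27']
times = [datetime(2021, 1, 1), datetime(2021, 2, 1),
         datetime(2021, 3, 15), datetime(2021, 4, 5)]

kept = in_good_dates(times, good_dates)
print(kept)  # the 2021-02-01 and 2021-04-05 samples remain
```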
Multiple instruments
Options applicable to many instrument types include:
<VAR>_bad_ens
: specify bad ensemble ranges (either index numbers or dates) that should be set to _FillValue. If you want multiple ranges, you can do this with additional values in the array. For example, Turb_bad_ens: ['2017-09-30 21:15', '2017-10-02 09:30', '2017-10-12 20:45', '2017-10-16 00:30']. This will set the range from late September to early October, and the range in mid-October, to _FillValue.
<VAR>_bad_ens_indiv
: specify individual ensembles (either index numbers or dates) that should be set to _FillValue. For example, Turb_bad_ens_indiv: ['2017-09-30 21:15', '2017-10-02 09:30', '2017-10-12 20:45', '2017-10-16 00:30']. This will set these four individual timestamps to _FillValue.
<VAR>_min
: fill values less than this minimum valid value. Values outside this range will become _FillValue. Substitute your variable for <VAR>, e.g. fDOMQSU_min.
<VAR>_max
: fill values more than this maximum valid value.
<VAR>_min_diff
: fill values where data decreases by more than this number of units in a single time step. This will typically be a negative number.
<VAR>_min_diff_pct
: fill values where data decreases by more than this percent in a single time step. This will typically be a negative number.
<VAR>_max_diff
: fill values where data increases by more than this number of units in a single time step.
<VAR>_max_diff_pct
: fill values where data increases by more than this percent in a single time step.
<VAR>_med_diff
: fill values where the difference between a 5-point (default) median filter and the original values is greater than this number.
<VAR>_med_diff_pct
: fill values where the percent difference between a 5-point (default) median filter and the original values is greater than this number.
<VAR>_max_blip
: fill short-lived maximum “blips”, values that increase by more than this number and then immediately decrease at the next time step.
<VAR>_max_blip_pct
: fill short-lived maximum “blips”, values that increase by more than this percent and then immediately decrease at the next time step.
<VAR>_trim_fliers
: fill flier values, which are data points surrounded by filled data. Set to the maximum size of flier clumps to remove.
<VAR>_warmup_samples
: fill this many samples at the beginning of each burst.
drop_vars
: a list of variables to be removed from the final file. For example, drop_vars: ['nLF_Cond_µS_per_cm', 'Wiper_Position_volt', 'Cable_Pwr_V'].
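To give a feel for how three of the filters above behave, here is a small sketch on a plain list of numbers. The function names are hypothetical, NaN stands in for _FillValue, and the details (differences taken against the unfiltered series, any decrease counting as the end of a blip) are assumptions rather than stglib's actual implementation.

```python
import math

FILL = math.nan  # stand-in for _FillValue


def fill_min_max(values, vmin=None, vmax=None):
    """Fill values below vmin or above vmax (the _min and _max options)."""
    out = []
    for v in values:
        too_low = vmin is not None and v < vmin
        too_high = vmax is not None and v > vmax
        out.append(FILL if too_low or too_high else v)
    return out


def fill_max_diff(values, max_diff):
    """Fill values that rise by more than max_diff in one time step."""
    out = list(values[:1])  # the first sample has no predecessor
    for prev, cur in zip(values, values[1:]):
        out.append(FILL if cur - prev > max_diff else cur)
    return out


def fill_max_blip(values, max_blip):
    """Fill one-sample spikes: a rise above max_blip followed by a decrease."""
    out = list(values)
    for i in range(1, len(values) - 1):
        rise = values[i] - values[i - 1]
        fall = values[i + 1] - values[i]
        if rise > max_blip and fall < 0:
            out[i] = FILL
    return out


turb = [5.0, 6.0, 250.0, 7.0, 8.0, -1.0]
print(fill_min_max(turb, vmin=0, vmax=200))  # fills 250.0 and -1.0
print(fill_max_diff(turb, 100))              # fills 250.0 only
print(fill_max_blip(turb, 100))              # fills 250.0 only
```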
Aquadopp
Aquadopp-specific options include:
trim_method
: can be 'water level', 'water level sl', 'bin range', None, or 'none'. Or just omit the option entirely if you don’t want to use it.
<VAR>_trim_single_bins
: trim data where only a single bin of data (after trimming via trim_method) remains. Set this value to true to enable.
<VAR>_maxabs_diff_2d
: trim values in a 2D DataArray when the absolute value of the increase is greater than a specified amount
AnalogInput1_<ATTR> or AnalogInput2_<ATTR>
: if <ATTR> is “standard_name”, “long_name”, “units”, “institution”, “comment”, “source”, or “references”, this will create the appropriate attribute for the given variable.
For Aquadopp waves:
puv
: set to true to compute PUV wave statistics. (EXPERIMENTAL)
basefile: 'AQ107703'
filename: '10771Baqd' # name of output file, -raw.cdf or .nc will be appended to this
LatLonDatum: 'NAD83'
ClockError: 0 # sec, negative is slow
orientation: 'UP' # use this to identify orientation of profiler
initial_instrument_height: 0.15 # meters - estimated!!!
initial_instrument_height_note: ''
zeroed_pressure: 'Yes' # was pressure zeroed before deployment
trim_method: 'water level sl' # Water Level SL trims bin if any part of bin or side lobe is out of water - works best when pressure is corrected for atmospheric
# trim_method: 'bin range'
# good_bins: [0,7] # with these two options, trim to the first 7 bins from the transducer
P_1ac_note: 'Corrected for variations in atmospheric pressure using Grand Bay NERR met station (GNDCRMET).'
Signature
Signature-specific options include (see Aquadopp for others):
outdir
: output directory (make sure it exists) to write individual cdf files before being compiled into a single cdf file per data type
orientation
: can be UP or DOWN; use this to identify orientation of profiler
basefile: 'AQ107703'
filename: '10771Baqd' # name of output file, -raw.cdf or .nc will be appended to this
LatLonDatum: 'NAD83'
ClockError: 0 # sec, negative is slow
orientation: 'UP' # use this to identify orientation of profiler
initial_instrument_height: 0.15 # meters - estimated!!!
initial_instrument_height_note: ''
zeroed_pressure: 'Yes' # was pressure zeroed before deployment
trim_method: 'water level sl' # Water Level SL trims bin if any part of bin or side lobe is out of water - works best when pressure is corrected for atmospheric
# trim_method: 'bin range'
# good_bins: [0,7] # with these two options, trim to the first 7 bins from the transducer
P_1ac_note: 'Corrected for variations in atmospheric pressure using Grand Bay NERR met station (GNDCRMET).'
RBR instruments
Options specific to RBR instruments exported from the Ruskin software include:
basefile
: the input filename without extension or data type. For example, if your exported text files are named 055170_20190219_1547_burst.txt, 055170_20190219_1547_data.txt, etc., basefile will be 055170_20190219_1547.
wp_min, wp_max
: min/max allowable wave period, in seconds
wh_min, wh_max
: min/max allowable wave height, in meters
wp_ratio
: maximum allowable ratio between peak period (wp_peak) and mean period (wp_4060).
<VAR>_min
: fill values less than this minimum valid value. Values outside this range will become _FillValue. Substitute your variable for <VAR>, e.g. P_1ac_min. Only works for P_1 and P_1ac. Useful for trimming by minimum pressure for instruments that go dry on some tidal cycles. Any data within the burst less than the threshold will result in the full burst being filled.
basefile: '055110_20161020_1503'
filename: '10793Adw' # name of output file, -raw.cdf or .nc will be appended to this
LatLonDatum: 'NAD83'
initial_instrument_height: 0.15 # meters - estimated!!!
wp_max: 4
wh_min: 0.02
wp_ratio: 2
P_1ac_note: 'Corrected for variations in atmospheric pressure using Grand Bay NERR met station (GNDCRMET).'
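The burst-wise behavior of P_1_min and P_1ac_min described in the options list differs from a sample-by-sample minimum filter, so a short sketch may help. The function fill_dry_bursts is hypothetical, bursts are represented as plain lists, and NaN stands in for _FillValue.

```python
import math


def fill_dry_bursts(bursts, p_min):
    """Fill an entire burst if any sample in it is below p_min."""
    filled = []
    for burst in bursts:
        if any(p < p_min for p in burst):
            # One low sample is enough to discard the whole burst
            filled.append([math.nan] * len(burst))
        else:
            filled.append(list(burst))
    return filled


bursts = [
    [1.20, 1.30, 1.25],  # fully submerged
    [0.40, 0.05, 0.30],  # one sample below the threshold
    [0.90, 1.00, 1.10],
]
result = fill_dry_bursts(bursts, p_min=0.1)
print(result)  # the second burst is filled entirely; the others are unchanged
```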
When an RBR instrument is used in CONTINUOUS mode as a profiling instrument (e.g., twisting the endcap to start/stop a profile), include the following line in your configuration file:
featureType: 'profile'
: this CF-compliant featureType instructs stglib to process these data as a profile dataset.
latitude: [36.959, 41.533, 27.764], longitude: [-122.056, -70.651, -82.638]
: these values can each be specified as a YAML list of latitudes and longitudes, each element in the lists corresponding to a profile.
split_profiles
: when set to True, split a multi-profile dataset into individual netCDF files for each profile
EXO
EXO-specific options include:
skiprows
: number of lines to skip in the CSV before the real data begins
Note that negative numeric values in the YAML config file must be written with care so that they are not interpreted as strings. If you want the minimum value to be, say, -0.2 units for a particular parameter, you must write this as -0.2 and not -.2 in the config file. The latter format will be interpreted as a string and will cause an error.
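One way to catch this mistake early is to scan the loaded configuration for threshold options whose value arrived as a string but still looks like a number. The sketch below assumes the config has already been loaded into a dictionary by a YAML parser. The helper find_numeric_strings and the suffix list are hypothetical and not part of stglib.

```python
# Suffixes of threshold-style options; an illustrative, non-exhaustive list
QAQC_SUFFIXES = ("_min", "_max", "_diff", "_pct", "_blip")


def find_numeric_strings(config):
    """Return keys of threshold options whose value is a number-like string."""
    suspect = []
    for key, value in config.items():
        if not key.endswith(QAQC_SUFFIXES) or not isinstance(value, str):
            continue
        try:
            float(value)  # Python itself accepts '-.2' as a number
        except ValueError:
            continue
        suspect.append(key)
    return suspect


# Simulated result of loading a config in which -.2 was read as a string
config = {"C_51_min_diff": "-.2", "Turb_max_diff": 100, "filename": "10762Aexo"}
print(find_numeric_strings(config))  # ['C_51_min_diff']
```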
basefile: 'GB0014_14D100014_080316_120000'
filename: '10762Aexo' # name of output file, -raw.cdf or .nc will be appended to this
# SN: '14D100014'
LatLonDatum: 'NAD83'
ClockError: 0 # sec, negative is slow
initial_instrument_height: 0.15 # meters - estimated!!!
initial_instrument_height_note: ''
zeroed_pressure: 'Yes' # was pressure zeroed before deployment
P_1ac_note: 'Corrected for variations in atmospheric pressure using Grand Bay NERR met station (GNDCRMET).'
skiprows: 25
#fDOMRFU_max_diff: 3
#fDOMQSU_max_diff: 30
C_51_min_diff: -0.3
SpC_48_min_diff: -2.5
S_41_min_diff: -2
Turb_max_diff: 100
# Example of how to trim by specifying the bad ensembles that should be removed.
# Here we will remove C_51 values in ensembles 500:600 and 905:910.
# You must specify these ranges as pairs, start and end
# This will delete 500-599 and 905-909
C_51_bad_ens: [500, 600, 905, 910]
# Here's an example of just removing a single value (51):
S_41_bad_ens: [51, 52]
# Or a single range (200-250). Note that Python's indexing means that this
# will actually remove values 200 through 249.
Turb_bad_ens: [200, 250]
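The comments in the example above describe the index pairs as Python-style ranges, where the end index is excluded. A minimal sketch of that behavior on a plain list follows. The function fill_bad_ens is hypothetical, and NaN stands in for _FillValue.

```python
import math


def fill_bad_ens(values, bad_ens):
    """Fill the half-open index ranges [start, stop) given as consecutive pairs."""
    if len(bad_ens) % 2 != 0:
        raise ValueError("bad_ens must contain start/stop pairs")
    out = list(values)
    for start, stop in zip(bad_ens[0::2], bad_ens[1::2]):
        # range() excludes `stop`, matching the 500-599 example above
        for i in range(start, min(stop, len(out))):
            out[i] = math.nan
    return out


data = [float(i) for i in range(10)]
filled = fill_bad_ens(data, [2, 4, 7, 8])
print(filled)  # indices 2, 3 and 7 are filled; index 4 and index 8 are kept
```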
WET Labs ECO NTU
NTU-specific options include:
All the _min, _max, _bad_ens, etc. options available to the EXO.
Turb_std_max
: fill turbidity based on a maximum standard deviation value.
spb
: samples per burst
user_ntucal_coeffs
: polynomial coefficients, e.g., [9.078E-07, 5.883E-02, -2.899E+00].
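For illustration, the example coefficients can be applied to a raw sensor value as shown below. The ordering of the coefficients (highest power first, as in NumPy's polyval) and the meaning of the input are assumptions; check the calibration sheet and stglib source before relying on them. The function apply_polynomial is hypothetical.

```python
def apply_polynomial(coeffs, x):
    """Evaluate a polynomial by Horner's method, highest power first (assumed order)."""
    result = 0.0
    for c in coeffs:
        result = result * x + c
    return result


coeffs = [9.078E-07, 5.883E-02, -2.899E+00]

# With the assumed ordering this computes 9.078e-07*x**2 + 5.883e-02*x - 2.899
ntu = apply_polynomial(coeffs, 100.0)
print(round(ntu, 6))  # 2.993078
```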
Vaisala WXT536
WXT-specific options include:
RTK_elevation_NAVD88
: RTK elevation of the sensor referenced to NAVD88 in meters.
dir_offset
: a direction offset in degrees from magnetic north to be applied if the sensor was not pointing toward magnetic north.
dir_offset_note
: a note about the direction offset being used.
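A direction offset is typically applied with wrap-around at 360 degrees. The sketch below shows one way to do that. The function apply_dir_offset is hypothetical, and adding the offset (rather than subtracting it) is an assumption about the sign convention.

```python
def apply_dir_offset(direction, dir_offset):
    """Add an offset to a direction in degrees and wrap into [0, 360)."""
    return (direction + dir_offset) % 360


print(apply_dir_offset(350, 15))  # 5
print(apply_dir_offset(10, -20))  # 350
```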
EofE ECHOLOGGER
All the _min, _max, _bad_ens, etc. options available to the EXO.
instrument_type
: types “ea” and “aa” are supported.
orientation
: orientation of the transducers; ‘DOWN’ and ‘UP’ are supported.
average_salinity
: average salinity value (PSU) for the water mass for the deployment site and time period.
average_salinity_note
: source of average salinity value.
Sequoia Scientific LISST
operating_mode
: set to burst if instrument was deployed in burst mode
Sontek IQ
All the _min, _max, _bad_ens, etc. options available to the EXO.
orientation
: can be UP or DOWN; use this to identify orientation of profiler
positive_direction
: direction (degrees) of positive flow indicated by the X arrow on top of instrument (optional, recommended)
flood_direction
: direction (degrees) of flood current in channel, may be opposite of positive flow direction depending on field setup (optional, recommended)
channel_cross_section_note
: note specifying starting bank (left or right) for RTK transect across the channel and when the transect measurements were collected (optional, recommended)
Onset Hobo
All the _min, _max, _bad_ens, etc. options available to the EXO.
instrument_type
: can be hwl (water level), hwlb (water level as barometer), hdo (dissolved oxygen), or hcnd (conductivity). Choose the value that matches the parameter measured by the Hobo logger.
skipfooter
: number of lines to skip in the CSV file at the end of the file
ncols
: number of columns of data to read, starting at the first
names
: option for user-specified column names (only recommended when the code will not read names using the automated/default method)
Lowell TCM Hobo
All the _min, _max, _bad_ens, etc. options available to the EXO.
skipfooter
: number of lines to skip in the CSV file at the end of the file
ncols
: number of columns of data to read, starting at the first
names
: option for user-specified column names (only recommended when the code will not read names using the automated/default method)
Vector
pressure_sensor_height and velocity_sample_volume_height
: specify the elevations of these two sensors.
puv
: set to true to compute PUV wave statistics. (EXPERIMENTAL)
orientation
: UP means probe head is pointing up (sample volume above probe head). DOWN means probe head is pointing down (sample volume below probe head).