Rapid automatic clean-up toolkit for large corrupted tidal datasets
- Vamsi Krishna Sridharan
Abstract
Tides are critical to coastal and oceanic processes. While tidal data
are readily available, they are often corrupted by various sources of
error. An automated, fast MATLAB toolbox is developed to clean up
corrupted tidal time series from estuarine and oceanic locations.
This toolbox will greatly speed up delivery of
quality-controlled tidal data. It will also reduce errors in quality
control, which typically involves several manual tasks. The toolbox
corrects poorly interpolated and noisy data, erroneous outliers, and
instrumentation bias such as spurious jumps, drifts, spikes, and
modulations in the true signal. Signal clean-up involves multiple
stages. First, thresholds are imposed on higher-order temporal
derivatives of the signal to remove gross interpolations and
noise-saturated signal chunks, followed by a moving-median threshold to remove
outliers. Then the surviving signal is filtered into tidal, subtidal and
long-period components, and the long-period component is subjected to a
maximal overlap discrete wavelet transform (MODWT), in which the transform
coefficients corresponding to multi-scale edge features are removed.
Subsequently, local information in the subtidal and tidal components is
compared with that of the whole signal to correct spurious amplitude
modulations and sudden biases. Finally, the corrected components are summed
to recover the uncorrupted signal, and large data gaps are filled with
short-term harmonic reconstruction. For estuarine locations, the
correlation in the spectrogram between two nearby stations is first
used, before these stages, to quantify and remove river influence from the signal. Applications
to datasets at multiple global locations demonstrate the value of the
toolbox.
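
The first clean-up stage described above (a derivative threshold followed by a moving-median threshold) can be illustrated with a minimal sketch. This is not the toolbox's code: the toolbox is written in MATLAB, whereas the sketch below uses Python with NumPy, and the function names, the choice of the second derivative as the "higher-order" derivative, the absolute threshold values, the window length, and the synthetic record are all assumptions made for illustration.

```python
import numpy as np


def flag_derivative(h, dt, max_d2):
    """Flag samples whose second time derivative exceeds max_d2 in magnitude.

    Hypothetical helper: the abstract does not state which derivative
    orders or threshold values the toolbox uses.
    """
    d2 = np.zeros_like(h)
    # Central second difference; the two end samples are left unflagged.
    d2[1:-1] = (h[2:] - 2.0 * h[1:-1] + h[:-2]) / dt**2
    return np.abs(d2) > max_d2


def flag_moving_median(h, window, max_dev):
    """Flag samples deviating from the centred moving median by more than max_dev."""
    if window % 2 == 0:
        raise ValueError("window must be odd")
    half = window // 2
    # Repeat the end values so that every sample has a full window.
    padded = np.pad(h, half, mode="edge")
    windows = np.lib.stride_tricks.sliding_window_view(padded, window)
    return np.abs(h - np.median(windows, axis=1)) > max_dev


# Synthetic 6-minute record (assumed): a 1 m tide with a 12.42 h period.
dt = 0.1  # hours
t = np.arange(0.0, 10 * 24.0, dt)
h = np.cos(2.0 * np.pi * t / 12.42)

# Corrupt it with a sawtooth stand-in for a noise-saturated chunk
# and a short 2 m offset standing in for a group of outliers.
h[500:600] += 0.5 * (-1.0) ** np.arange(100)
h[1500:1505] += 2.0

# Derivative threshold first, then the moving median on the survivors only.
bad = flag_derivative(h, dt, max_d2=10.0)  # threshold in m/h^2 (assumed)
keep = ~bad
bad_survivor = flag_moving_median(h[keep], window=11, max_dev=0.5)
bad[np.flatnonzero(keep)[bad_survivor]] = True

# Flagged samples become gaps to be filled by a later stage.
cleaned = np.where(bad, np.nan, h)
```

In this sketch the derivative test removes the sawtooth chunk and the sharp edges of the offset, while the flat interior of the offset, which has no large derivative, is caught by the moving median. Applying the median to the surviving samples only mirrors the ordering given in the abstract; real records would need thresholds tuned to the sampling interval and tidal range.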