Influence of data filters on the accuracy and precision of paleomagnetic
poles: what is the optimal sampling strategy?
Abstract
To determine a paleopole, the paleomagnetic community commonly applies a
loosely defined set of quantitative data filters that were established
for studies of geomagnetic field behavior. These filters require costly
and time-consuming sampling procedures, but whether they improve
accuracy and precision of paleopoles has not yet been systematically
analyzed. In this study, we performed a series of experiments on four
datasets, each comprising 73-125 lava sites with 6-7 samples per site.
The datasets are from different regions and ages, and are large enough
to represent paleosecular variation, yet contain demonstrably unreliable
datapoints. We show that data filters based on within-site scatter (a
k-cutoff, a minimum number of samples per site, and eliminating the
farthest outliers per site) cannot objectively identify unreliable
directions. We find instead that excluding unreliable directions relies
on the subjective interpretation of the expert, highlighting the
importance of making all data available following the FAIR principles.
In addition, data filters that eliminate datapoints even have an adverse
effect: both the accuracy and the precision of paleopoles decrease as
the number of data decreases. Between-site scatter far outweighs
within-site scatter, and when sampling for paleomagnetic poles, the extra
effort put into collecting multiple samples per site is more
effectively spent on collecting more single-sample sites.
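
To make the within-site filters named above concrete, the following is a minimal Python sketch of a k-cutoff combined with a minimum number of samples per site. It assumes the standard Fisher (1953) estimate of the precision parameter, k = (N - 1) / (N - R), where R is the length of the vector sum of N unit direction vectors. The function names, the example directions, and the thresholds k_min = 50 and n_min = 5 are illustrative assumptions and are not taken from this study.

```python
import math


def to_unit_vector(dec_deg, inc_deg):
    """Convert declination/inclination (degrees) to a north-east-down unit vector."""
    dec = math.radians(dec_deg)
    inc = math.radians(inc_deg)
    return (
        math.cos(inc) * math.cos(dec),
        math.cos(inc) * math.sin(dec),
        math.sin(inc),
    )


def fisher_k(directions):
    """Estimate the Fisher precision parameter k = (N - 1) / (N - R)."""
    n = len(directions)
    if n < 2:
        raise ValueError("k is undefined for fewer than two directions")
    vectors = [to_unit_vector(dec, inc) for dec, inc in directions]
    # R is the length of the vector sum of the N unit vectors.
    r = math.sqrt(sum(sum(v[axis] for v in vectors) ** 2 for axis in range(3)))
    if n - r < 1e-12:
        return math.inf  # identical directions: no measurable scatter
    return (n - 1) / (n - r)


def passes_site_filters(directions, k_min=50.0, n_min=5):
    """Apply a minimum-sample-count filter and a k-cutoff to one site.

    k_min and n_min are assumed, illustrative thresholds.
    """
    if len(directions) < max(n_min, 2):
        return False
    return fisher_k(directions) >= k_min


# Hypothetical sites: lists of (declination, inclination) pairs in degrees.
tight_site = [(0, 45), (2, 46), (358, 44), (1, 45), (359, 46), (0, 44)]
scattered_site = [(0, 45), (90, 10), (180, -30), (270, 60), (45, 0), (300, 20)]

for name, site in [("tight", tight_site), ("scattered", scattered_site)]:
    print(name, round(fisher_k(site), 1), passes_site_filters(site))
```

The sketch only shows what such a filter computes for a single site. Whether a site that passes or fails these thresholds is actually reliable is a separate question, and the abstract's conclusion is that within-site criteria of this kind cannot settle it objectively.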