The Garden of Forking Paths: the Hidden Statistical Consequences of Data
Contingency and Researcher Degrees of Freedom in Cyclostratigraphic
Analysis, and Why Most Published Results are False
Abstract
Cyclostratigraphy’s near 100% success rate in statistical cycle
identification suggests confirmation bias; absence of cyclicity is not
regarded as a possible outcome. Vaughan et al. (2011; VBS) showed that
the usual methods of estimating confidence levels (CLs) admit numerous
false cycle detections, but in subsequent debate it has been asserted
that the corrections recommended by VBS do not apply in
cyclostratigraphy because they lead to rejection of the expected
orbital periods. Is there a
deeper problem? VBS particularly criticised the universal failure to
correct CLs for the unavoidably multiple nature of significance tests of
power spectra. However, the multiple-test problem is compounded by the
assumption of unlimited freedom to vary procedures to suit the
properties of individual datasets. Statistical analysis in
cyclostratigraphy operates
in a large variable-space, both of target hypotheses (many orbital
cycles and combinations thereof), and of procedures (many pre- and
syn-processing options). Each of the many data-contingent choices made
before and during spectral analysis and significance-testing implies the
existence of alternatives: in effect, the reported analysis is only one
of many. Given that repeated experiments will eventually yield a
positive result purely by chance, unadjusted significance thresholds
will produce large numbers of spurious cycle identifications, a
possible explanation for the observed success rates. Additional
multiplicity is implied by the practice of treating CLs as a guide
rather than as a definitive signal:noise discriminator. Treating CLs as
movable (or even optional) negates the concept that the particular
dataset is just one realisation of many permitted by the noise model.
Without pre-selection of a CL the statistics are meaningless.
Suggestions for practical
improvements include: better hypothesis formulation (with attention to
the prior probability of signal preservation in an unreliable recording
medium); more care in discriminating between the exploratory
(hypothesis-setting) and confirmatory (hypothesis-testing) modes of data
analysis; advance definition of analytical protocols; and publication of
all results whether positive or negative. Reference: Vaughan et al.
(2011), doi:10.1029/2011PA002195.
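
The multiplicity argument can be illustrated with a short calculation.
The Python sketch below is not taken from VBS. It assumes, purely for
illustration, that the tests are independent and that 100 spectral
frequencies are examined; the function names are hypothetical. It
computes the probability of at least one false detection in pure noise
at an unadjusted per-test CL, and the per-test CL needed to hold the
global CL at its intended value (a Sidak-type adjustment, used here as a
generic example and not necessarily the correction recommended by VBS).

```python
def familywise_false_positive_rate(per_test_cl: float, n_tests: int) -> float:
    """Chance of at least one false detection among n_tests tests of pure noise.

    Assumes the tests are independent (an illustrative simplification).
    """
    # Probability that every one of the n_tests tests stays below the CL
    all_below = per_test_cl ** n_tests
    return 1.0 - all_below


def sidak_per_test_cl(global_cl: float, n_tests: int) -> float:
    """Per-test CL required so that the whole family achieves global_cl.

    Inverts global_cl = per_test_cl ** n_tests (independent tests assumed).
    """
    return global_cl ** (1.0 / n_tests)


if __name__ == "__main__":
    n_frequencies = 100  # hypothetical number of frequencies tested
    unadjusted = familywise_false_positive_rate(0.95, n_frequencies)
    adjusted_cl = sidak_per_test_cl(0.95, n_frequencies)
    print(f"False-detection probability at unadjusted 95% CL: {unadjusted:.3f}")
    print(f"Per-test CL needed for a global 95% CL: {adjusted_cl:.5f}")
```

Under these assumptions, testing 100 frequencies at an unadjusted 95% CL
gives a better than 99% chance of at least one spurious peak, and the
per-test CL must exceed 99.9% to keep the global CL at 95%. Each
additional data-contingent choice multiplies the effective number of
tests. Analyses along different forking paths reuse the same data and
are not independent, so these figures indicate the scale of the problem
and are not an exact rate.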