Abstract
The construction of morphological character matrices is central to
paleontological systematic study, which extracts paleontological
information from fossils. Although the word "information" appears
repeatedly across paleontological systematic studies, its meaning has
rarely been clarified, and no standard exists for measuring
paleontological information, owing to the incompleteness of fossils, the
difficulty of recognizing homologous and homoplastic structures, and
other obstacles. Here, based on information theory, we show
the deep connections between paleontological systematic study and
communication system engineering. It is information, the decrease of
uncertainty, in morphological characters that distinguishes operational
taxonomic units (OTUs) and allows evolutionary history to be
reconstructed. We propose that source coding and channel coding, two
concepts from communication system engineering, correspond in
paleontological studies to the construction of diagnostic features and
of entire character matrices, respectively. These two steps should be
kept distinct, as they are when typical communication systems are
engineered, because they serve two different purposes. With character
matrices from six different vertebrate groups, we analyzed their
information properties including source entropy, mutual information, and
channel capacity. The estimated channel capacities show that every
matrix has an upper limit on the paleontological information it can
transmit, indicating that, because of noise, too many characters not
only increase the burden of character scoring but may also decrease the
quality of a matrix. We also tested the information entropy of each
character, which measures how informative a variable is, as a weighting
criterion in parsimony-based systematic studies; the results are highly
consistent with existing knowledge and offer both good resolution and
interpretability.
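
As a rough illustration of the per-character entropy mentioned above, the
following Python sketch computes the Shannon entropy (in bits) of each
column of a small character matrix. The toy matrix, the function names, and
the choice to ignore missing or inapplicable scores ("?" and "-") are all
assumptions made for this example; it is not the weighting procedure used
in the study.

```python
from collections import Counter
from math import log2

# Assumption: "?" and "-" mark missing or inapplicable scores and are skipped.
MISSING = {"?", "-"}


def character_entropy(states):
    """Shannon entropy (bits) of the observed states of one character."""
    observed = [s for s in states if s not in MISSING]
    n = len(observed)
    if n == 0:
        return 0.0
    counts = Counter(observed)
    # H = sum over states of p * log2(1 / p), with p = count / n.
    return sum((c / n) * log2(n / c) for c in counts.values())


def entropy_weights(matrix):
    """Entropy of every character (column) in a taxon -> state-string mapping."""
    columns = zip(*matrix.values())
    return [character_entropy(col) for col in columns]


# Hypothetical toy matrix: four OTUs scored for four characters.
toy_matrix = {
    "taxon_A": "0010",
    "taxon_B": "0110",
    "taxon_C": "01?1",
    "taxon_D": "0121",
}

weights = entropy_weights(toy_matrix)
for index, weight in enumerate(weights):
    print(f"character {index}: H = {weight:.4f} bits")
```

In this toy case the first character is invariant across all OTUs and has
zero entropy, while the last character splits the OTUs evenly and reaches
the one-bit maximum for a binary character. How such values would be mapped
onto parsimony weights is left open here.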