Telomere length estimates correlated between WGS and qPCR
We observed a moderate correlation (r = 0.50 – 0.66) between telomere
length estimates from qPCR and WGS data. Previous studies in humans
observed a moderate to high correlation between qPCR and WGS data
(r = 0.66 – 0.95). For Populus trees, lower correlations
may reflect differences in bioinformatics approaches that filter out
interstitial telomeric sequences and tolerate telomere repeat variants.
Telomere studies in humans avoid the inclusion of interstitial telomeric
repeats by identifying telomeric reads that are aligned with the
telomere region in the reference genome. Programs such as TelSeq,
TelomereHunter, and Telomerecat require reads aligned to a
reference genome (i.e., BAM files) as input, and
telomere length estimates are based on reads that align to the
telomeric region of the genome. This approach reduces the probability of
capturing interstitial telomeric repeats, but the utility and efficiency of
these programs depend heavily on the completeness of the
telomere-to-telomere assembly of the reference genome. While
telomere-to-telomere assemblies are available for humans, most species
lack high resolution across telomeric regions. For the Populus
trichocarpa reference genome, the telomeric region contains a
significant proportion of ambiguous bases (i.e., NNNN), reducing the
probability of alignment to the telomere region. In this study, we
limited the estimation of telomere length to programs that used unmapped
read sequence data. Thus, telomere estimates from both K-seek and TRIP
may include some interstitial telomeric repeats. Nonetheless, previous
studies have shown that considering consecutive telomeric repeats
decreases the probability of capturing interstitial telomeric repeats
and increases the correlation between qPCR and TRF telomere estimates.
K-seek and TRIP considered only reads with more than four and seven
consecutive repeats, respectively, reducing the probability of capturing
interstitial telomeric repeats in this study. In contrast, Computel reduces
the capture of interstitial telomeric repeats by using only reads that align
to the telomeric reference created by the program. For non-model species, it
is possible to use sequence data unmapped to a reference genome to
estimate telomere length; however, the potential contribution of
interstitial repeats to those estimates must be considered.
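The consecutive-repeat criterion described above can be illustrated with a minimal Python sketch. This is not the implementation used by K-seek or TRIP; the function name `count_telomeric_reads` and its parameters are hypothetical, and the threshold is left as an argument so that "more than four" corresponds to `min_consecutive=5`. Both strands are searched, assuming the plant-type repeat TTTAGGG.

```python
import re

def count_telomeric_reads(reads, repeat="TTTAGGG", min_consecutive=5):
    """Count reads holding at least `min_consecutive` tandem copies of `repeat`.

    Hypothetical helper for illustration only: it searches the forward repeat
    and its reverse complement, and ignores sequencing errors and variants.
    """
    complement = str.maketrans("ACGT", "TGCA")
    reverse_complement = repeat.translate(complement)[::-1]
    # A run of N or more back-to-back copies on either strand.
    pattern = re.compile(
        "(?:%s){%d,}|(?:%s){%d,}"
        % (repeat, min_consecutive, reverse_complement, min_consecutive)
    )
    return sum(1 for read in reads if pattern.search(read.upper()))

# Toy reads (invented for this example, not study data).
reads = [
    "TTTAGGG" * 5 + "ACGT",             # long forward run: counted
    "ACGT" + "TTTAGGG" * 2 + "ACGTAC",  # short, interstitial-like run: skipped
    "CCCTAAA" * 6,                      # reverse-complement run: counted
]
print(count_telomeric_reads(reads, min_consecutive=5))
```

Raising `min_consecutive` makes the filter stricter against short interstitial runs, at the cost of discarding reads that only partially overlap a telomere.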
Additional bioinformatic programs such as TelomereHunter and qmotif allow
telomere repeats to deviate from the typical human telomere repeat,
TTAGGG. Although telomeric repeats are generally conserved within a
species, deviations from the typical telomere repeat have been reported
in humans and are frequently incorporated into telomere calculations. In
plants, telomere variants that deviate from the Arabidopsis type
(TTTAGGG) have been reported among taxa, with some families such as
Alliaceae exhibiting novel telomere sequences (CTCGGTTATGGG). However,
to date there are limited empirical data comparing telomere repeat
variation within species. In the present study, we searched only for the
telomere repeat TTTAGGG, previously reported as the Populus telomeric
sequence. Nevertheless, potential Populus telomere
repeat variants can be detected through manual inspection of
the Populus trichocarpa reference genome. Because we searched only for
the canonical repeat, such variants were excluded from our estimates, and
accounting for them may increase the correlations between WGS and
qPCR. Thus, the identification of intraspecific telomere
repeat variation in plants, coupled with new bioinformatic approaches
that include telomere repeat diversity, will improve telomere
estimations.
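As a rough illustration of how candidate repeat variants might be flagged in a sequence, the sketch below tallies 7-mers that differ from the canonical TTTAGGG by a small number of substitutions. It is an assumption-laden toy, not the method of any program named here: `near_canonical_variants` and `max_mismatches` are hypothetical names, only substitutions (no insertions or deletions) are considered, and every window is scanned, so hits outside telomeric arrays would need further filtering.

```python
from collections import Counter

def hamming(a, b):
    """Number of mismatching positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))

def near_canonical_variants(sequence, canonical="TTTAGGG", max_mismatches=1):
    """Tally k-mers within `max_mismatches` substitutions of `canonical`.

    Hypothetical helper: exact copies of the canonical repeat are not
    reported, only near matches that could represent repeat variants.
    """
    k = len(canonical)
    seq = sequence.upper()
    variants = Counter()
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        distance = hamming(kmer, canonical)
        if 0 < distance <= max_mismatches:
            variants[kmer] += 1
    return variants

# Toy array (invented): one substituted unit inside a canonical run.
toy = "TTTAGGG" * 3 + "TTCAGGG" + "TTTAGGG" * 2
print(near_canonical_variants(toy))
```

With a one-mismatch tolerance, out-of-phase windows of a pure TTTAGGG array are not reported, because every rotation of the repeat differs from it at three or more positions.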
Telomere length estimates in plants are currently limited to programs
that allow modification of telomere repeat pattern and species-specific
genome features. Telomere repeat pattern is taxon-dependent, with most
vertebrates sharing the human telomere repeat pattern, TTAGGG. Multiple
programs listed above, including TelomereHunter, TelSeq, and Telomerecat,
were created to identify human telomere repeats, limiting the repeat
search to the vertebrate telomere type. In addition, these programs
compute telomere estimates using human genome features, such
as number of chromosomes and genome length. Plants have different
telomere repeat patterns, generally TTTAGGG, deviating from the human
telomeric type. To our knowledge, the only program that allows the
modification of telomere repeat patterns and genomic features is
Computel. Computel accepts species-specific genome features,
including telomere pattern, number of chromosomes, and genome size. The
highest correlation between WGS and qPCR (r = 0.66) was observed for
Computel. Previous studies indicate that Computel performs similarly to
other bioinformatic approaches, but as the field of telomere ecology
expands, increased flexibility to modify the telomere repeat pattern and
include species-specific genome features will be required to extend
applications.
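For readers who wish to reproduce this kind of method comparison, the following minimal sketch computes a correlation coefficient between paired estimates. It assumes that r denotes Pearson's coefficient, which the text does not state explicitly; the function name `pearson_r` is hypothetical and the paired values are invented toy numbers, not data from this study.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient between two paired samples."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must be paired and contain at least 2 values")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sample")
    return sxy / math.sqrt(sxx * syy)

# Invented paired estimates for five hypothetical samples.
qpcr_relative_length = [0.8, 1.1, 0.9, 1.4, 1.2]   # relative (unitless) qPCR values
wgs_length_kb = [3.1, 3.9, 3.6, 4.8, 4.0]          # WGS-based estimates in kb
print(round(pearson_r(qpcr_relative_length, wgs_length_kb), 2))
```

Because the coefficient is scale-free, the relative qPCR values and the WGS-based lengths can be compared directly without converting units.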
Accurate measurement of telomere length is needed to deploy telomeres as
potential biomarkers to quantify organismal response to abiotic and
biotic stress. Although qPCR has been used extensively due to its
accessibility and opportunities for high throughput analysis, it
provides only a relative measurement rather than an absolute measure of
telomere length. Furthermore, qPCR accuracy in assaying telomere length
is susceptible to potential variations in the reference control gene,
primer efficiency, and inter-assay variability. WGS can provide a
high-resolution assessment of the telomeric regions, allowing for precise
quantification of absolute telomere length. In addition, WGS allows
detection of mutations within the telomeric regions and permits telomere
length assessment on an individual chromosome basis. Thus, while WGS can
be computationally intensive and potentially cost-prohibitive for
large-scale studies, sequence data from WGS can enhance the accuracy of
current telomere length methods, particularly techniques involving
subtelomeric primers or probes.