Annotation of expansin domains in actinobacterial genomes
As a case study for the application of ExED in genome sequence
annotation, actinobacterial genomes from various South African habitats
were analyzed for the presence of expansin domains and conserved amino
acid positions, using the profile HMMs of the expansin domains
(Tables S8 and S9). In general, the sequence regions
identified for the N-terminal expansin domains received higher HMMER
scores, whereas the C-terminal domains appeared less conserved (compare
with Figure 3). Despite the lower scores for the C-terminal
expansin domain, the coverage for the underlying profile HMM was still
high (90%). One genome hit, identified in sediment samples collected
at Gamka River in the Swartberg Mountain Range, was identical to
an expansin homologue from Streptomyces swartbergensis (NCBI
accession WP_086602418). This sequence matched the profile HMM of the
N-terminal expansin domain well (score: 60, 98% coverage) and the
profile HMM of the C-terminal expansin domain moderately (score: 19, 89%
coverage). The sequence from S. swartbergensis contains amino
acids that are conserved in the superfamily ‘Bacterial expansins’
(threonine 12, glycine 21, alanine 36, glycine 53, tyrosine 55, proline
74, aspartate 82, leucine 83, phenylalanine 88, and glycine 97 in the
N-terminal expansin domain; lysine 119, tryptophan 126, tryptophan 149,
tyrosine 157, and glycine 179 in the C-terminal expansin domain) and
also amino acids that are conserved in the superfamilies ‘Fungal
expansins’ or ‘Plant expansins’ (tyrosine 14, cysteine 23, cysteine 52,
and cysteine 73).
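
The two checks described above, profile HMM coverage and the presence of conserved residues at given positions, can be illustrated with a minimal Python sketch. It is not the ExED implementation. It assumes that coverage is the fraction of HMM match states spanned by the alignment and that residue positions are 1-based indices into the sequence being checked; neither convention is stated in the text. The function names `hmm_coverage` and `check_conserved` are hypothetical, and the sequence and positions in the usage block are made-up toy values, not data from WP_086602418.

```python
from typing import Dict, List, Tuple


def hmm_coverage(hmm_from: int, hmm_to: int, hmm_length: int) -> float:
    """Fraction of profile HMM match states spanned by an alignment.

    Assumption: coordinates are 1-based and inclusive, as in HMMER
    domain tables; the source does not state how coverage was defined.
    """
    if not (1 <= hmm_from <= hmm_to <= hmm_length):
        raise ValueError("expected 1 <= hmm_from <= hmm_to <= hmm_length")
    return (hmm_to - hmm_from + 1) / hmm_length


def check_conserved(
    sequence: str, expected: Dict[int, str]
) -> List[Tuple[int, str, str, bool]]:
    """Compare residues at 1-based positions with the expected one-letter codes.

    Returns one (position, expected, observed, matches) tuple per position,
    sorted by position. Positions outside the sequence are reported as '-'.
    """
    report = []
    for position, residue in sorted(expected.items()):
        if 1 <= position <= len(sequence):
            observed = sequence[position - 1]
        else:
            observed = "-"
        report.append((position, residue, observed, observed == residue))
    return report


if __name__ == "__main__":
    # Toy values for illustration only (hypothetical, not from the study).
    toy_sequence = "MATGYKLCG"
    toy_expected = {3: "T", 5: "Y", 8: "C", 9: "A"}

    print(f"coverage: {hmm_coverage(3, 100, 100):.0%}")
    for position, residue, observed, matches in check_conserved(
        toy_sequence, toy_expected
    ):
        status = "conserved" if matches else "differs"
        print(f"position {position}: expected {residue}, found {observed} ({status})")
```

In practice the alignment coordinates would come from the HMMER output for each domain hit, and the expected residues from the conserved positions listed for each superfamily.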