Profile HMMs
A profile hidden Markov model (HMM)31 was derived for
each expansin domain from a multiple sequence alignment built from
twenty-eight representative protein sequences, including twenty-two of
the twenty-five seed sequences mentioned above, two fungal sequences,
and four sequences of known structure (Table
S2). To delimit the two domains within the multiple sequence
alignment, four crystal structures of expansins were superimposed (PDB
entries 1n10, chain A; 2hcz, chain X; 4fer, chain B; and 4jjo, chain A).
The structure-based multiple sequence alignment (Figure S1) was
generated with the Clustal Omega package34 (version
1.2.1-1) and STAMP35 (version 4.4), and visualized with
PyMOL36 (version 4.60, Schrödinger, New York, NY,
USA). Based on the structural alignment and on annotations of secondary
structures in Pfam37 (entries PF03330.17 for the
N-terminal domain and PF01357.20 for the C-terminal domain), the
respective domains were manually retrieved. The individual profile HMMs
for the N- and C-terminal expansin domains were built by HMMER from the
multiple sequence alignments. The input multiple sequence alignments
were aligned against the derived profile HMMs with the hmmalign command from HMMER to determine whether there
were shifts between the input and output alignments. Shifted alignment
columns were refined manually with respect to the positions of known
secondary structure elements. The refined profile HMMs of the N- and
C-terminal expansin domains comprise 95 and 75 positions, respectively
(Figures S2 and S3), and are available together with
their underlying alignments at https://doi.org/10.18419/darus-623.
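
The comparison of input and output alignments described above can be illustrated with a minimal Python sketch. It is not the procedure used in this work, where shifted columns were identified and refined manually. The function names `residue_columns` and `shifted_columns`, the dictionary representation of an alignment, and the toy sequences are assumptions made for illustration only. The sketch treats both `-` and `.` as gap characters and reports every input column whose residues are spread over more than one column of the output alignment.

```python
GAP_CHARS = "-."  # assumed gap symbols; both are treated as gaps here


def residue_columns(aligned):
    """Return, for each residue in order, the alignment column it occupies."""
    return [col for col, ch in enumerate(aligned) if ch not in GAP_CHARS]


def shifted_columns(input_aln, output_aln):
    """List input columns whose residues do not share one output column.

    Both arguments map a sequence name to its aligned (gapped) string.
    """
    targets = {}  # input column -> set of output columns its residues occupy
    for name, in_row in input_aln.items():
        in_cols = residue_columns(in_row)
        out_cols = residue_columns(output_aln[name])
        if len(in_cols) != len(out_cols):
            raise ValueError(f"residue count differs for {name}")
        for in_col, out_col in zip(in_cols, out_cols):
            targets.setdefault(in_col, set()).add(out_col)
    return sorted(col for col, outs in targets.items() if len(outs) > 1)


# Hypothetical toy alignments (not expansin sequences).
input_alignment = {"seq_a": "ACDEF", "seq_b": "ACDEF"}
output_alignment = {"seq_a": "ACD-EF", "seq_b": "AC-DEF"}

print(shifted_columns(input_alignment, output_alignment))  # prints [2]
```

In the toy example, the two aspartate residues share column 2 in the input but occupy different columns in the output, so column 2 is flagged. The sketch detects only columns that are split. Columns that are merged in the output could be found by calling the function with the two arguments swapped.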