Profile HMMs
A profile hidden Markov model (HMM)31 was derived for each expansin domain from a multiple sequence alignment of twenty-eight representative protein sequences: twenty-two of the twenty-five seed sequences mentioned above, two fungal sequences, and four sequences of known structure (Table S2). To delimit the two domains in the multiple sequence alignment, four crystal structures of expansins were superimposed (PDB entries 1n10, chain A; 2hcz, chain X; 4fer, chain B; and 4jjo, chain A). The structure-based multiple sequence alignment (Figure S1) was generated with Clustal Omega34 (version 1.2.1-1) and STAMP35 (version 4.4), and visualized with PyMOL36 (version 4.60, Schrödinger, New York, NY, USA). Based on the structural alignment and on the secondary structure annotations in Pfam37 (entries PF03330.17 for the N-terminal domain and PF01357.20 for the C-terminal domain), the respective domains were extracted manually. Individual profile HMMs for the N- and C-terminal expansin domains were built from these multiple sequence alignments with HMMER. The input multiple sequence alignments were then realigned against the resulting profile HMMs with the hmmalign command of HMMER to check for shifts between the input and output alignments. Shifted alignment columns were refined manually with respect to the positions of known secondary structure elements. The refined profile HMMs of the N- and C-terminal expansin domains comprise 95 and 75 positions, respectively (Figures S2 and S3), and are available together with their underlying alignments at https://doi.org/10.18419/darus-623.
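The build-and-realign check described above can be illustrated with a minimal Python sketch. It is not the procedure used in this work, only one possible way to flag candidate shifted columns under stated assumptions: the file names are hypothetical, the alignments are assumed to be already parsed into dictionaries mapping sequence names to aligned strings, and a column counts as "shifted" when the set of residues it places together no longer appears as a column in the realigned output. The only HMMER usage assumed is `hmmbuild <hmmfile> <msafile>` and `hmmalign -o <outfile> <hmmfile> <seqfile>`.

```python
from typing import Dict, List, Optional, Tuple

GAP_CHARS = "-."  # '-' and '.' are both treated as gaps


def hmmer_commands(msa_path: str, hmm_path: str, realigned_path: str):
    """Return hmmbuild and hmmalign command lines as argument lists.

    The paths are placeholders; the lists can be passed to subprocess.run
    on a machine where HMMER is installed.
    """
    build = ["hmmbuild", hmm_path, msa_path]
    align = ["hmmalign", "-o", realigned_path, hmm_path, msa_path]
    return build, align


def column_signatures(alignment: Dict[str, str]) -> List[Tuple[Optional[int], ...]]:
    """Describe each column by the residue index it holds for every sequence.

    Sequences are visited in sorted name order; a gap is recorded as None.
    """
    names = sorted(alignment)
    lengths = {len(alignment[name]) for name in names}
    if len(lengths) != 1:
        raise ValueError("all aligned sequences must have the same length")
    counters = {name: 0 for name in names}
    signatures = []
    for col in range(lengths.pop()):
        signature = []
        for name in names:
            if alignment[name][col] in GAP_CHARS:
                signature.append(None)
            else:
                signature.append(counters[name])
                counters[name] += 1
        signatures.append(tuple(signature))
    return signatures


def shifted_columns(input_aln: Dict[str, str], output_aln: Dict[str, str]) -> List[int]:
    """Return 0-based input columns whose residue grouping is absent from the output."""
    output_signatures = set(column_signatures(output_aln))
    shifted = []
    for col, signature in enumerate(column_signatures(input_aln)):
        if all(index is None for index in signature):
            continue  # all-gap columns carry no alignment information
        if signature not in output_signatures:
            shifted.append(col)
    return shifted


# Toy data (invented for illustration, not taken from the expansin alignments)
input_aln = {"a": "ACD-EF", "b": "ACDGEF"}
output_aln = {"a": "ACDE-F", "b": "ACDGEF"}
print(shifted_columns(input_aln, output_aln))  # → [3, 4]
```

In the toy example, the gap in sequence `a` moves by one position, so the two affected input columns are reported. In practice, such flagged columns would still need manual inspection against the known secondary structure elements, as described above.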