Introduction
Expansins are plant cell wall loosening proteins without apparent
catalytic activity, which have been identified in a broad range of
organisms1–4. The loosening mechanism remains
elusive, but it has been suggested that the non-covalent interactions
between cellulose microfibrils are weakened so that the microfibrils can
move against each other, thereby loosening the tight cellulosic
structure1. The interactions between expansins and the
plant cell wall, which consists of lignin, hemicellulose, and cellulose,
require further investigation5. Expansins were first
discovered in plants and were described as proteins mediating
pH-dependent extension and stress relaxation of cell
walls6. Based on phylogenetic analysis, it has been
proposed that expansins in Bacteria and Fungi resulted from
multiple horizontal gene transfers from plants to
microbes7, but there is also the possibility that the
microbial expansin subfamily evolved first in ancient marine
microorganisms, and then diversified into distinct terrestrial plant
subfamilies8.
Expansins consist of two tightly packed protein domains, connected by a
short linker and preceded by a signal peptide9 (Figure 1). Both expansin domains need to be connected for
effective wall extension activity and for the weakening of filter
paper10,11. The C-terminal domain of EXLX1
(expansin-like X) from Bacillus subtilis dominates the binding to
cellulose and to matrix polysaccharides of cell walls through
electrostatic or polar interaction10. The Zea
mays β-expansin (Zm EXPB1) primarily binds glucuronoarabinoxylan,
the major matrix polysaccharide in grass cell walls, and loosens
it12.
Key amino acids in the N-terminal domain of Bacillus subtilis expansin-like protein 1 (Bs EXLX1) are two threonines at positions
12 and 14, a serine at position 16, two aspartates at positions 71 and
82, a tyrosine at position 73, and a glutamic acid at position
7510, numbered according to ref. 13. The
threonine at standard position 12 is strongly conserved, but not
essential for activity10. The aspartate at position 82
is crucial for activity; the threonine at position 14, the aspartate at
position 71, and the tyrosine at position 73 are important for activity;
and the serine at position 16 and the glutamic acid at position 75 play
moderate roles in wall creep activity10. Three
disulfide bridges can be found in the N-terminal domain of Zm EXPB114, and the six participating cysteines
are highly conserved in the plant expansin groups, EXPA (expansin A) and
EXPB (expansin B)14. An additional highly conserved
cysteine pair is thought to form a fourth disulfide bridge in plant
α-expansins15. In the expansin protein Sc Exlx1
from the Basidiomycete fungus Schizophyllum commune, three
disulfide bonds are predicted16, whereas
Bs EXLX113 and many other bacterial expansins lack disulfide
bridges.
The N-terminal expansin domain is formed by a six-stranded double-ψ
β-barrel13 that is shared by several protein
superfamilies17, e.g. glycoside hydrolase family 45
(GH45)18,19. The expansin-like proteins found in Fungi
such as loosenins, EXPNs, or cerato-platanins are single-domain proteins
that resemble the N-terminal domain of
expansins20–22.
The C-terminal expansin domain is responsible for the binding to
cellulosic material and is formed by two stacked β-sheets with an
immunoglobulin-like fold1. The cellulose binding site
on the protein surface consists of a linear arrangement of aromatic
residues (tyrosines, phenylalanines, and
tryptophans)13, which for Bs EXLX1 includes two
tryptophans at positions 125 and 126, and a tyrosine at position
15710. A further key amino acid residue required for
wall extension activity is a lysine at position 11910.
The C-terminal domain of Bs EXLX1 belongs to family 63 of
carbohydrate binding modules (CBM63)10, which mediate
binding to polysaccharides23,24.
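
The key residues of Bs EXLX1 listed above can be collected in a small lookup table, for example to check whether a variant or homologue retains them. The following Python sketch is illustrative only: the positions and residues are those stated in the text, whereas the role labels, the table layout, and the helper functions `residues_by_role` and `deviating_positions` are assumptions made for this example and are not part of the ExED.

```python
from collections import defaultdict

# (standard position, residue, domain, role) for Bs EXLX1 as stated in the text.
# The role labels are shorthand chosen for this sketch.
KEY_RESIDUES = [
    (12, "T", "N-terminal", "conserved, not essential"),
    (14, "T", "N-terminal", "important"),
    (16, "S", "N-terminal", "moderate"),
    (71, "D", "N-terminal", "important"),
    (73, "Y", "N-terminal", "important"),
    (75, "E", "N-terminal", "moderate"),
    (82, "D", "N-terminal", "crucial"),
    (119, "K", "C-terminal", "required for wall extension"),
    (125, "W", "C-terminal", "cellulose binding"),
    (126, "W", "C-terminal", "cellulose binding"),
    (157, "Y", "C-terminal", "cellulose binding"),
]


def residues_by_role(entries):
    """Group residues (e.g. 'D82') by their role label, keeping table order."""
    grouped = defaultdict(list)
    for position, residue, _domain, role in entries:
        grouped[role].append(f"{residue}{position}")
    return dict(grouped)


def deviating_positions(residue_at, entries=KEY_RESIDUES):
    """Return (position, expected, found) for key positions that differ.

    `residue_at` maps standard positions to one-letter residues; positions
    missing from the mapping are reported with `found` set to None.
    """
    return [
        (position, residue, residue_at.get(position))
        for position, residue, _domain, _role in entries
        if residue_at.get(position) != residue
    ]


# Hypothetical variant in which the aspartate at position 82 is replaced by alanine.
wild_type = {position: residue for position, residue, _, _ in KEY_RESIDUES}
variant = dict(wild_type)
variant[82] = "A"
print(residues_by_role(KEY_RESIDUES)["important"])
print(deviating_positions(variant))
```

Keeping the table keyed on standard positions rather than on positions in an individual sequence means that the same check could in principle be applied to any homologue once it has been mapped to the numbering scheme.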
In this paper, we analyzed the similarity between “expansin-like
proteins” (such as GH45s, loosenins, swollenins, cerato-platanins,
EXPNs, and expansin-like proteins found in nematodes) and expansin
domains at the sequence level by establishing the Expansin Engineering
Database (ExED), which collects characterized and putative expansin
homologues. The protein sequences in the ExED were divided into
different superfamilies (‘Bacterial expansins’, ‘Fungal expansins’, and
‘Plant expansins’) according to sequence identity rather than by the
phylogenetic relationships of expansins, which have been analyzed
previously8. By annotating the two expansin domains and using
a continuous standard numbering scheme, conserved sequence motifs of the
expansin protein family were identified that could be applied in the
screening of genomic data for the identification of novel expansins.
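
Grouping by sequence identity rests on a pairwise identity measure. As a minimal sketch, one common definition is the percentage of identical residues among the alignment columns in which neither sequence has a gap. The definition and any thresholds actually used for the ExED are not specified in this section, so the function `percent_identity` and the toy sequences below are assumptions for illustration only.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity of two pre-aligned sequences of equal length.

    Columns in which either sequence has a gap are ignored, so the value
    refers to the positions that are aligned residue to residue.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    identical = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap or b == gap:
            continue  # skip gap columns
        compared += 1
        if a == b:
            identical += 1
    # Two sequences without any shared aligned column have 0 % identity.
    return 100.0 * identical / compared if compared else 0.0


# Hypothetical toy alignment (not real expansin sequences).
print(percent_identity("ACDE-G", "ACDQWG"))
```

Other definitions normalize by the alignment length or by the length of the shorter sequence, which changes the resulting value, so the definition in use should be stated alongside any identity cutoff.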