3.1 | Alpha-helical transmembrane microproteins
Intergenic regions of eukaryotic genomes are rich in A/T residues
relative to genes, which are G/C rich. When microproteins are expressed
from “noncoding” regions, they therefore tend to contain predicted
transmembrane helices arising from the preponderance of T/U residues
within codons that correspond to hydrophobic and aromatic amino acids.
This intergenic sequence bias therefore affects the amino acid
composition of evolutionarily young, species-specific microproteins
that arise de novo from previously noncoding regions of the
genome. A recent study demonstrated that C-terminal hydrophobic patches
tend to target evolutionarily young microproteins to the BAG6 membrane
protein triage complex, resulting either in membrane insertion or, if
mislocalized or improperly folded, proteasomal degradation.
Interestingly, species-specific transmembrane microproteins that exhibit
low expression can nonetheless contribute fitness advantage to cells,
and examples have been shown to function in processes such as yeast
mating. Not all membrane-associated microproteins are evolutionarily
novel; a large and growing number of well-characterized, conserved
microproteins are predicted to contain transmembrane
helices, such as the lysosomal membrane-localized polypeptide regulator
of mTORC1, SPAR, and the plasma membrane-localized micropeptide
Myomixer, which is required for myoblast fusion during skeletal muscle
development. The class of alpha-helical transmembrane microproteins is
therefore large, and of outsize biological importance. We turn our
attention in this section to those membrane-associated microproteins
that have been subjected to experimental structure determination.
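The link between A/T-rich sequence and hydrophobic residues noted at the start of this section can be illustrated with a short calculation. The sketch below is not taken from any cited study. It assumes random sequence with independent base frequencies set only by GC content, and it assumes that F, L, I, M, V, W and Y make up the "hydrophobic and aromatic" class. The names hydrophobic_fraction and HYDROPHOBIC are introduced here for illustration only.

```python
from itertools import product

BASES = "UCAG"
# Standard genetic code, listed in UCAG order for the first, second and third base.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# Assumption: residues counted as hydrophobic or aromatic in this sketch.
HYDROPHOBIC = set("FLIMVWY")


def hydrophobic_fraction(gc_content):
    """Expected hydrophobic fraction among sense codons of random sequence.

    Assumes independent bases with G = C = gc/2 and A = U = (1 - gc)/2.
    """
    freq = {
        "G": gc_content / 2,
        "C": gc_content / 2,
        "A": (1 - gc_content) / 2,
        "U": (1 - gc_content) / 2,
    }
    sense = 0.0
    hydrophobic = 0.0
    for codon, aa in CODON_TABLE.items():
        p = freq[codon[0]] * freq[codon[1]] * freq[codon[2]]
        if aa == "*":  # stop codons do not encode a residue
            continue
        sense += p
        if aa in HYDROPHOBIC:
            hydrophobic += p
    return hydrophobic / sense


# Amino acids encoded by the 16 codons with U at the second position.
u_second = {aa for codon, aa in CODON_TABLE.items() if codon[1] == "U"}

if __name__ == "__main__":
    print("U at second position encodes:", "".join(sorted(u_second)))
    for gc in (0.3, 0.5, 0.7):
        print(f"GC = {gc:.1f}: hydrophobic fraction = {hydrophobic_fraction(gc):.2f}")
```

Under these assumptions the expected hydrophobic fraction falls from roughly 0.46 at 30% GC to roughly 0.31 at 50% GC and 0.18 at 70% GC, and every codon with U at its second position encodes F, L, I, M or V. Real intergenic sequence is not random, so these figures indicate only the direction of the bias.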
AcrZ, previously named YbhT, was reported in a seminal study identifying
unannotated small protein genes in E. coli utilizing
computational tools that incorporate ribosome binding site prediction.
AcrZ is a 49-amino acid microprotein that is conserved in many
Gram-negative bacteria and localizes to the E. coli inner
membrane by virtue of an N-terminal transmembrane helix. AcrZ
binds to the AcrB subunit of the AcrAB-TolC multidrug efflux
pump, increasing the efficiency of transport of (and, thus, resistance
of E. coli to) a subset of its substrates. Multiple structures of
AcrZ in complex with the AcrB homotrimer have been solved, including
crystal structures of detergent-solubilized complexes, as well as a
cryo-electron microscopy (cryo-EM) structure of the complex
reconstituted in lipid discs (Figure 4A). AcrZ binds to a transmembrane
groove within each molecule of AcrB. The cryo-EM structure revealed that
AcrZ exhibits a pronounced bend between positions 10 and 15, conferred by a
helix-breaking proline residue. Mutagenesis studies revealed that the
proline is required for interaction of AcrZ with AcrB. At the same time,
proline, or an equally helix-breaking glycine residue, can be moved to
any position within the AcrZ interaction motif while retaining its
association with AcrB. Several of these mutations that retain AcrB
binding also recapitulate the selective drug transport-promoting
phenotype of wild-type AcrZ. While the precise effects of AcrZ binding
on cargo occupancy and transport are not fully clear, allosteric
modulation of binding sites in AcrB is evident by comparing the AcrB vs.
AcrBZ structures. Furthermore, AcrZ promotes cardiolipin association
with AcrB, likely contributing to allosteric modulation of cargo binding
pockets in the transporter. Taken together, these results indicate that
the bent shape of the AcrZ transmembrane helix, and not its amino
acid sequence, is essential for interaction with and modulation of AcrB.
E. coli CydX was originally identified as YbgT, a predicted
37-amino acid microprotein encoded downstream of the cytochrome bd
oxidase operon genes cydA and cydB. Cytochrome bd oxidases
operate as terminal oxidases in the electron transport chain
under hypoxic conditions due to their high oxygen affinity. The two
canonical subunits, CydA and CydB, form a pseudosymmetric heterodimer,
of which the CydA subunit contains all three heme cofactors responsible
for reduction of molecular oxygen to water, as well as the Q loop that
is responsible for binding the electron donor quinol. CydX is a
single-pass alpha-helical transmembrane protein that copurifies with the
CydAB complex and is required for the assembly, stability, and/or
activity of cytochrome bd oxidase in multiple species. Several atomic
structures of cytochrome bd oxidases have revealed the role of CydX
homologs in the complex (Figure 4B). First, the presence of an
unannotated CydX homolog, CydS, was serendipitously discovered in a
crystal structure of cytochrome bd oxidase purified from the
Gram-positive bacterium Geobacillus thermodenitrificans. CydS
forms an alpha helix that binds between helices 5 and 6 of CydA, leading
the authors to speculate that it may stabilize the heme cofactor when
the Q loop undergoes dynamic movement during catalysis. A subsequent
cryo-EM structure of the E. coli cytochrome bd oxidase revealed
CydX bound to CydA between helices 1 and 6, again suggesting a
structural role. Interestingly, the E. coli CydAB structure unexpectedly
revealed the presence of another single-pass transmembrane microprotein,
CydH, which is encoded by the ynhF gene, located
outside the cytochrome bd oxidase operon. CydH binds between
transmembrane helices 1 and 8 of CydA, on the opposite face of CydA
relative to CydX. CydH is proposed to occlude the
oxygen-conducting channel identified in the Geobacillus structure,
which is replaced in E. coli by a hydrophobic channel that traverses CydB
directly to the heme d site. This rearrangement of the oxygen channel was
proposed to be required due to the swapped positions of two heme
cofactors in the E. coli enzyme relative to the Geobacillus
structure, and, accordingly, CydH homologs are found
in Proteobacteria. Overall, cytochrome bd oxidase is a unique system in
which microproteins are required for activity, structure and stability
of a critical complex of proteins.
In another well-characterized example, a class of microproteins (also
called micropeptides) termed “regulins” regulate the activity of the
sarco/endoplasmic reticulum (SR/ER) calcium ATPase (SERCA). During
muscle contraction, including contraction of the heart, and during
calcium-dependent signaling processes, calcium is released from the
SR/ER into the cytosol; then, to terminate signaling or contraction,
calcium is pumped back into the SR/ER against its concentration gradient
using the energy of ATP hydrolysis by SERCA. Regulins colocalize with
SERCA in the SR/ER membrane, and each micropeptide is expressed in the
same, specific tissue as the SERCA isoform that it regulates. The first
known regulins, phospholamban and sarcolipin, were identified as
inhibitors of SERCA in cardiac and skeletal muscle, respectively.
Structural analysis of these canonical regulins, both of which are
<100 amino acids, reveals that they are small, single-pass
membrane proteins bearing a single transmembrane alpha-helix. The
crystal structure of the SERCA-sarcolipin complex reveals that the
micropeptide binds in a transmembrane groove of SERCA
between helices 2, 6 and 9, where it allosterically alters the
conformation of SERCA to decrease its apparent calcium affinity.
Phospholamban binds to the same regulatory groove (Figure 4C). One
seminal discovery of novel SERCA-regulating micropeptides came from a
study in Drosophila. In this work, Couso and colleagues analyzed
putative lncRNAs associated with polysomes, suggesting that they are
translated. Of these lncRNAs, one contained an sORF encoding a peptide
predicted to be homologous to phospholamban and sarcolipin, which was
accordingly given the name sarcolamban. Sarcolamban may have arisen via
duplication of an ancestral phospholamban/sarcolipin gene in insects,
which subsequently diverged to the sarcolamban sequence. Sarcolamban was
demonstrated to bind SERCA in flies and its deletion caused heart
arrhythmias, consistent with a role in regulating SERCA. Docking the
predicted structure of sarcolamban onto SERCA suggested a
binding mode similar to that observed for phospholamban and sarcolipin.
Just as importantly, additional novel regulins have also been discovered
in mammals. In analyses of mammalian lncRNAs to identify potential
micropeptides expressed in skeletal muscle and other tissues lacking
known regulin expression, translated sORFs were identified that encode
the novel SERCA-binding micropeptides myoregulin, endoregulin, and
another-regulin, all of which bind to the same transmembrane groove of
SERCA, inhibit SERCA similarly to phospholamban, and are
predicted to have similar single-pass transmembrane alpha-helical
structures. Interestingly, an unannotated, SERCA-activating
micropeptide, DWORF, was identified in yet another long noncoding RNA in
mouse. DWORF is expressed in skeletal muscle, and ectopic
over-expression of DWORF in heart tissue enhances contractility and
reverses heart failure in a disease model. However, the
mechanism by which DWORF activates SERCA was unclear, since it is
predicted to bear a similar alpha-helical transmembrane domain and binds
to the same SERCA groove as previously characterized regulins, which are
all inhibitory. Some evidence from fluorescence resonance energy
transfer suggests that DWORF binding can directly activate SERCA. A
recent NMR structural study demonstrated that the alpha helix of DWORF
is kinked at a unique proline residue, creating a significant bend in
the transmembrane region without disrupting its binding to SERCA (Figure
4C). Mutating this proline residue diminished the bend angle between the
two alpha-helical regions of DWORF, and not only prevented its
activation of SERCA, but converted it into a SERCA inhibitor. Therefore,
activation of SERCA by DWORF appears to require its proline-induced
kink, and, by extension, inhibition of SERCA by phospholamban,
sarcolipin, myoregulin, endoregulin and another-regulin may be
hypothesized to require binding of their uninterrupted transmembrane
helices to the regulatory groove of SERCA. It is also fascinating to
note the parallels between DWORF and AcrZ (see above), both of which
utilize kinked transmembrane alpha-helices to allosterically regulate
the membrane transporters SERCA and AcrB, respectively.
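The bend angle discussed for DWORF and AcrZ can be made concrete with a small geometric calculation. The sketch below is one simple way to estimate the angle between two helical segments from their C-alpha coordinates, and it is not the procedure used in the structural studies described above. The function names (helix_axis, kink_angle, ideal_helix, rotate_x) are hypothetical. The helix parameters (2.3 Å C-alpha radius, 1.5 Å rise and 100° turn per residue) are textbook values for an ideal alpha helix, used here only to generate test coordinates.

```python
import math


def _cross(a, b):
    return (
        a[1] * b[2] - a[2] * b[1],
        a[2] * b[0] - a[0] * b[2],
        a[0] * b[1] - a[1] * b[0],
    )


def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def helix_axis(ca):
    """Unit axis vector of a helical segment from at least 4 C-alpha positions.

    The second difference of three consecutive C-alpha positions points
    radially toward the helix axis, so the cross product of two successive
    radial vectors lies along the axis. The cross products are summed over
    the segment and normalized.
    """
    if len(ca) < 4:
        raise ValueError("need at least 4 C-alpha positions")
    radial = [
        tuple(ca[i - 1][k] - 2 * ca[i][k] + ca[i + 1][k] for k in range(3))
        for i in range(1, len(ca) - 1)
    ]
    total = (0.0, 0.0, 0.0)
    for r1, r2 in zip(radial, radial[1:]):
        c = _cross(r1, r2)
        total = tuple(t + x for t, x in zip(total, c))
    length = math.sqrt(_dot(total, total))
    return tuple(t / length for t in total)


def kink_angle(segment1, segment2):
    """Angle in degrees between the axes of two helical segments."""
    cos_angle = _dot(helix_axis(segment1), helix_axis(segment2))
    cos_angle = max(-1.0, min(1.0, cos_angle))  # guard against rounding error
    return math.degrees(math.acos(cos_angle))


def ideal_helix(n, radius=2.3, rise=1.5, turn_deg=100.0):
    """C-alpha positions of an ideal alpha helix running along +z (test data)."""
    step = math.radians(turn_deg)
    return [
        (radius * math.cos(i * step), radius * math.sin(i * step), i * rise)
        for i in range(n)
    ]


def rotate_x(points, deg):
    """Rotate points about the x axis by the given angle in degrees."""
    c, s = math.cos(math.radians(deg)), math.sin(math.radians(deg))
    return [(x, y * c - z * s, y * s + z * c) for x, y, z in points]


if __name__ == "__main__":
    helix = ideal_helix(12)
    print(f"straight helix: {kink_angle(helix[:6], helix[6:]):.1f} degrees")
    print(f"tilted segment: {kink_angle(helix, rotate_x(helix, 30.0)):.1f} degrees")
```

For an experimental structure, the two segments would be the C-alpha coordinates on either side of the kink-inducing proline. A proline substitution that straightens the helix should then give a smaller angle than the wild-type sequence.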