3.4 | Microproteins with predicted structures
With the advent of three-dimensional macromolecular structure prediction tools such as Rosetta, iTasser, Phyre, and, most recently, AlphaFold, many recently discovered, now-annotated microproteins have been subjected to computational structure prediction, and these structural models are publicly available. For microproteins that remain unannotated, computational tools can be used to generate testable structural predictions. For example, analysis of the recently identified E. coli cold-shock microprotein YmcF using iTasser led to the hypothesis that YmcF may adopt a folded structure consisting of an alpha helix and 2-3 beta strands separated by a turn, homologous to a zinc-binding domain of aspartate transcarbamoylase (Figure 4G). While no functional data for YmcF yet exist, this predicted structural model, if correct, may have implications for the cold shock response, which requires RNA-binding proteins (some of which coordinate zinc) to chaperone RNA secondary structures that become hyper-stable at low temperature. In another example, plant microProteins are specifically defined as proteins predicted to fold into single domains that bind to and generally antagonize the functions of their effectors, such as transcription factors.
Predicted structures of microproteins have already begun to aid in determining their molecular and cellular functions. A translated upstream ORF (uORF) encoding a 96-amino acid microprotein within the 5′ untranslated region (UTR) of the human ASNSD1 gene was reported by Oyama et al. in 2007 and in subsequent proteomic analyses, leading to the annotation of the microprotein as ASDURF (ASNSD1 upstream open reading frame). As discussed above, evidence is accumulating that uORF microproteins can function in trans. Remarkably, Coulombe and colleagues recently implicated ASDURF as the “missing” subunit of a chaperone complex termed the PAQosome. Proximity biotinylation and pull-down experiments with PAQosome subunits revealed ASDURF as an interaction partner, and in vitro reconstitution assays suggested that it is an integral member of a PAQosome subcomplex. The PAQosome is a recently discovered chaperone that is essential for assembling complicated macromolecular complexes in the cell, including RNA polymerases, components of the spliceosome, and protein phosphatases. The PAQosome consists of two modules, one of which is termed the prefoldin-like (PFDL) module. The PFDL module shares some subunits, and putative structural homology, with prefoldin, another cellular chaperone required for folding cytoskeletal proteins and other clients. Prefoldin and the PFDL module are both hexameric, consisting of three alpha- and three beta-prefoldin subunits, each of which contains an alpha-helical coiled coil whose helices are separated by either one (beta) or two (alpha) hairpins; however, only five of the six PFDL subunits (three alpha and two beta) had been identified.
Tertiary structure modeling with Phyre suggested that ASDURF is a beta-prefoldin bearing a single beta hairpin and a coiled coil (Figure 4H), consistent with its potential identification as the undiscovered beta subunit of the PFDL module of the PAQosome. This suggests that it had been missed because it was not part of the proteome annotation at the time of the PAQosome’s discovery. The ASDURF microprotein raises many additional interesting questions: Why is it encoded in an upstream ORF within the ASNSD1 gene? Does its 5′ UTR location confer stress responsiveness via translational regulation, as suggested by Cloutier et al.? Is its function or regulation related to the downstream ASNSD1 protein, per the model of Chen, Weissman and colleagues that co-encoded microproteins and proteins tend to function in the same pathways? Regardless, while the structural model requires experimental validation, ASDURF appears to be a particularly compelling example of a microprotein for which structure prediction informs its interactions and likely function.