The experimentally determined structure revealed a complex α/β fold consisting of 9 long and 3 short helices and two independent, antiparallel, twisted beta sheets with the 132 topology (shown in Figure 2 in mauve and red, respectively), together with a conserved helix inserted between the first and second strand. The second beta sheet has an additional short beta strand (b4), so its topology may also be described as 1243.
Structure comparison using the Dali [15] and FATCAT [16] algorithms confirmed the presence of an internal repeat (Figures 2 and 3) and its similarity to the LPP20 proteins. Database searches among other PDB structures identified hundreds of proteins with statistically significant structural similarity to FTT_1539. The top hits included several secreted or cell surface proteins from human pathogens, such as HP1454 (PDB ID: 4KZS), HP1456 (PDB ID: 5OK8), Tip-αN34 (PDB ID: 2WCQ) and Tip-αN25 (PDB ID: 3VNC) from Helicobacter pylori, and calcium dodecin (Rv0397) from Mycobacterium tuberculosis (PDB ID: 3ONR). The search also found more distant similarities to other proteins from the SHS2 Pfam clan. The similarity to the Tip-α proteins is especially interesting, as they are known to induce human TNF-α and have carcinogenic effects [17].
The motif repeated in the FTT_1539 protein, a three-stranded antiparallel beta sheet with a helix inserted between the first and second beta strands, was previously identified in the bacterial ATPase FtsA (Pfam family PF14450), the archaea-eukaryotic RNA polymerase subunit Rpb7p (Pfam family PF03876), the GyrI-like small molecule binding domain (Pfam family PF06445), and the archease protein family MTH1598/Tm1083-like (Pfam family PF01951) [18], which together form the SHS2 clan. Interestingly, proteins from the GyrI family also contain a tandem repeat of this motif. The N-terminal half of the tandem repeat in the FTT_1539 protein is more similar to the “classical” SHS2 motif. We hypothesize that after duplication, the N-terminal repeat retained its structure (and possibly function), while the C-terminal repeat diverged and possibly lost its function.
The protein with the structure most similar to that of FTT_1539 identified by the FATCAT algorithm was the Helicobacter pylori protein HP1454 (PDB ID: 4KZS) [19], with an RMSD (root mean square deviation) of 2.58 Å over 157 amino acids in the pairwise structural alignment. The protein HP1454 contains 3 distinct domains, but the region similar to the structure of FTT_1539 consists of the N-terminal Domain I, which contains a classical SHS2 motif: a three-stranded antiparallel β-sheet with a single α-helix inserted between the first and second beta strands [19]. This SHS2 motif is highlighted in Figure 3. The N-terminal domain of HP1454 is extracellular and has structural and potential functional similarities to Tip-α proteins, which are classified as carcinogenic factors. Although the significance of this functional similarity is still unclear, it has been suggested that this motif is involved in protein-protein interactions due to its cellular localization [19]. The second most similar structure belongs to the H. pylori protein LPP20 (HP1456) (PDB ID: 5OK8), with an RMSD of 3.03 Å over 125 amino acids [20]. The LPP20 protein was the founding member of the Pfam LPP20 protein family, which was initially characterized as a non-essential class of lipoproteins [21]. Other members of this family are virulence factors that are bound to the bacterial outer membrane and are secreted and transported via vesicles [20]. H. pylori LPP20 was also found to play a role in cancer suppression by reducing the expression of E-cadherin in gastric cancer cells [20].
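For readers unfamiliar with the RMSD values quoted above, the following minimal sketch illustrates how the quantity is defined: the square root of the mean squared distance between paired atoms (typically Cα atoms of aligned residues). It assumes the two coordinate sets have already been superposed and paired position by position; it does not reproduce the flexible alignment performed by FATCAT. The function name `rmsd` and the toy coordinates are hypothetical and are not taken from any PDB entry.

```python
import math


def rmsd(coords_a, coords_b):
    """Root mean square deviation between two paired lists of (x, y, z) points.

    Assumption: the coordinates are already superposed and listed in the
    same order, one point per aligned residue (e.g. C-alpha atoms).
    """
    if len(coords_a) != len(coords_b):
        raise ValueError("coordinate lists must have the same length")
    if not coords_a:
        raise ValueError("at least one aligned position is required")

    # Sum of squared Euclidean distances over all aligned positions.
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2

    # Mean over the number of aligned positions, then square root.
    return math.sqrt(total / len(coords_a))


# Hypothetical toy coordinates in angstroms: three positions, with the
# second set shifted by exactly 1 A along z.
toy_a = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0)]
toy_b = [(0.0, 0.0, 1.0), (3.8, 0.0, 1.0), (7.6, 0.0, 1.0)]

print(rmsd(toy_a, toy_b))  # 1.0
```

Because the mean is taken over the aligned positions only, an RMSD is best read together with the alignment length, which is why both numbers are reported for each structural match.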