Francisella tularensis protein FTT_1539 defines a large, novel protein family, with at least 532 homologs identified in the NCBI RefSeq database [11] by an iterative PSI-BLAST [12] search. Most of the homologs were found within the Francisella genus, with more distant ones found in several human pathogens, especially pathogens of the gastrointestinal tract such as Vibrio parahaemolyticus, Salmonella enterica, and Campylobacter jejuni. None of the homologs has been experimentally characterized, and most are annotated as “hypothetical protein”, underscoring the need for further research on this protein family. Sequence similarity within the Francisella genus was high (over 50% sequence identity) but dropped to around 20% for homologs from other genera, suggesting a genus-specific expansion in Francisella species.
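To make the identity figures concrete, the following is a minimal sketch of how percent sequence identity can be computed from a pairwise alignment. The function name `percent_identity` and the choice of denominator (columns in which both sequences carry a residue) are assumptions made for illustration; the text does not state which convention was used, and other conventions (for example, dividing by the length of the shorter sequence) give somewhat different values.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity between two rows of a pairwise alignment.

    Assumption: identity is counted over columns where neither
    sequence has a gap ('-'). This is one common convention, not
    necessarily the one used by PSI-BLAST reports.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")

    matches = 0
    compared = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" or b == "-":
            continue  # skip gapped columns
        compared += 1
        if a == b:
            matches += 1

    if compared == 0:
        return 0.0
    return 100.0 * matches / compared


if __name__ == "__main__":
    # Toy alignment (not real FTT_1539 data): 4 identical residues
    # out of 5 ungapped columns.
    print(percent_identity("MKV-LT", "MKIALT"))
```

Under this convention, a within-genus pair would score above 50 and a cross-genus pair around 20, matching the ranges reported above.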
Analysis with the distant homology prediction algorithms FFAS [13] and HHpred [14] suggested that the C-terminus of FTT_1539 may contain a domain distantly homologous to the Pfam LPP20 lipoprotein family (PF02169.16), which includes proteins such as Lpp20 (HP1456) from Helicobacter pylori (PDB code 5OK8). In addition, FFAS dot plot analysis (see Figure 1) suggested the existence of an internal repeat, with the C-terminal LPP20-like domain showing weak sequence similarity to the N-terminal part of the FTT_1539 protein.
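The idea behind reading an internal repeat from a dot plot can be illustrated with a simplified self-comparison. The sketch below is not FFAS, which compares sequence profiles; it swaps in a plain sliding-window identity match, which would likely miss a repeat as weak as the one described here. The function names, window size, and threshold are hypothetical choices, and the toy sequence is not FTT_1539.

```python
from collections import Counter


def self_dotplot(seq: str, window: int = 5, min_matches: int = 4):
    """Return (i, j) pairs, i < j, where the windows starting at i and j
    share at least `min_matches` identical residues.

    Simplification: exact residue identity instead of the
    profile-profile scores used by tools such as FFAS.
    """
    seq = seq.upper()
    n = len(seq) - window + 1
    hits = []
    for i in range(n):
        for j in range(i + 1, n):
            matches = sum(1 for k in range(window) if seq[i + k] == seq[j + k])
            if matches >= min_matches:
                hits.append((i, j))
    return hits


def dominant_offset(hits):
    """Most frequent j - i among the hits, or None if there are no hits.

    An internal repeat shows up as a run of dots on one off-diagonal,
    i.e. many hits sharing the same offset.
    """
    counts = Counter(j - i for i, j in hits)
    if not counts:
        return None
    return counts.most_common(1)[0][0]


if __name__ == "__main__":
    # Toy sequence: a 10-residue unit, a 3-residue linker, then the unit again.
    toy = "ACDEFGHIKL" + "WWW" + "ACDEFGHIKL"
    hits = self_dotplot(toy)
    print(dominant_offset(hits))  # the two copies start 13 residues apart
```

In the toy case all hits fall on the off-diagonal at offset 13, the distance between the two copies. In a real dot plot the analogous signal is an off-diagonal line linking the N-terminal region to the C-terminal domain.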