Abstract | The gelsolin homology (GH) domain has to date been found exclusively in actin-binding proteins. In humans, three copies of the domain are present in CapG, five copies in supervillin, and six copies each in adseverin, gelsolin, flightless I and the villins: villin, advillin and villin-like protein. Caenorhabditis elegans contains a four-GH-domain protein, GSNL-1. These architectures are predicted to have arisen from a gene triplication followed by a gene duplication, yielding the six-domain protein. The subsequent loss of one, two or three domains produced the five-, four- and three-domain proteins, respectively. Here we conducted BLAST and hidden Markov model-based searches of the UniProt and NCBI databases to identify novel gelsolin-domain-containing proteins. The variety of architectures suggests that the GH domain has been tested in many molecular constructions during evolution. Of particular note is flightless-like I protein (FLIIL1) from Entamoeba histolytica, which comprises a leucine-rich repeat (LRR) domain, seven GH domains and a headpiece domain, thus combining many of the features of flightless I with those of villin or supervillin. As such, the GH domain superfamily appears to have developed along complex routes. We analyzed the distribution of these proteins across 343 completely sequenced genomes, mapped it onto the tree of life, and constructed phylogenetic trees of the proteins to gain insight into their evolution.
|