Published in

Elsevier, Genomics, 2(46), p. 200-216

DOI: 10.1006/geno.1997.4989

Analysis of Protein Domain Families in Caenorhabditis elegans

Journal article published in 1997 by Erik L. L. Sonnhammer, Richard Durbin
This paper was not found in any repository, but could be made available legally by the author.

Full text: Unavailable

Preprint: archiving allowed
Postprint: archiving restricted
Published version: archiving forbidden
Data provided by SHERPA/RoMEO

Abstract

The Caenorhabditis elegans genome sequencing project has completed over half of this nematode's 100-Mb genome. Proteins predicted in the finished sequence have been compiled and released in the database Wormpep. Presented here is a comprehensive analysis of protein domain families in Wormpep 11, which comprises 7299 proteins. The relative abundance of common protein domain families was counted by comparing all Wormpep proteins to the Pfam collection of protein families, which is based on recognition by hidden Markov models. This analysis also identified a number of previously unannotated domains. To investigate new apparently nematode-specific protein families, Wormpep was clustered into domain families on the basis of sequence similarity using the Domainer program. The largest clusters that lacked clear homology to proteins outside Nematoda were analyzed in further detail, after which some could be assigned a putative function. We compared all proteins in Wormpep 11 to proteins in the human, Saccharomyces cerevisiae, and Haemophilus influenzae genomes. Among the results are the estimation that over two-thirds of the currently known human proteins are likely to have a homologue in the whole C. elegans genome and that a significant number of proteins are well conserved between C. elegans and H. influenzae, that are not found in S. cerevisiae.