SIMAP - the database of all-against-all protein sequence similarities and annotations with new interfaces and increased coverage.

Authors: Arnold, Roland; Goldenberg, Florian; Mewes, Hans-Werner; Rattei, Thomas

Affiliations (as listed): Kim Lab, Terrence Donnelly Centre for Cellular and Biomolecular Research, University of Toronto; Department für Mikrobiologie und Ökosystemforschung, Fakultät für Lebenswissenschaften, University of Vienna; Institute of Bioinformatics and Systems Biology, Helmholtz Zentrum München, Technische Universität München

Published in

Oxford University Press, Nucleic Acids Research, 42(D1), p. D279-D284, 2013

DOI: 10.1093/nar/gkt970

This paper is made freely available by the publisher.

Preprint: archiving allowed

Postprint: archiving allowed

Published version: archiving allowed

Abstract

The Similarity Matrix of Proteins (SIMAP, http://mips.gsf.de/simap/) database has been designed to massively accelerate computationally expensive protein sequence analysis tasks in bioinformatics. It provides pre-calculated sequence similarities interconnecting the entire known protein sequence universe, complemented by pre-calculated protein features and domains, similarity clusters and functional annotations. SIMAP covers all major public protein databases as well as many consistently re-annotated metagenomes from different repositories. As of September 2013, SIMAP contains >163 million proteins, corresponding to ~70 million non-redundant sequences. SIMAP uses the sensitive FASTA search heuristics, the Smith-Waterman alignment algorithm, the InterPro database of protein domain models and the BLAST2GO functional annotation algorithm. SIMAP assists biologists by facilitating the interactive exploration of the protein sequence universe. Web-Service and DAS interfaces allow connecting SIMAP with any other bioinformatics tool or resource. All-against-all protein sequence similarity matrices of project-specific protein collections are generated on request. Recent improvements allow SIMAP to cover the rapidly growing universe of sequenced proteins. New Web-Service interfaces enhance the connectivity of SIMAP. Novel tools for interactive extraction of protein similarity networks have been added. Open access to SIMAP is provided through the web portal; the portal also contains instructions and links for software access and flat file downloads.
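To make the idea of an all-against-all similarity matrix concrete, the following is a minimal sketch and not SIMAP's actual pipeline. It assumes a simplified Smith-Waterman local alignment with a flat match/mismatch score and a linear gap penalty, whereas the abstract only states that SIMAP combines FASTA heuristics with Smith-Waterman; the function names, scoring parameters and toy sequences are all hypothetical.

```python
from itertools import combinations


def smith_waterman_score(a, b, match=2, mismatch=-1, gap=-2):
    """Return the best local alignment score of strings a and b.

    Simplified scoring (assumption): flat match/mismatch values and a
    linear gap penalty instead of a substitution matrix and affine gaps.
    """
    prev = [0] * (len(b) + 1)  # previous row of the dynamic programming table
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # Local alignment: a cell never drops below zero.
            curr[j] = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            best = max(best, curr[j])
        prev = curr
    return best


def all_against_all(seqs):
    """Score every pair in a dict of id -> sequence; returns {(id1, id2): score}."""
    ids = sorted(seqs)
    matrix = {(x, x): smith_waterman_score(seqs[x], seqs[x]) for x in ids}
    for x, y in combinations(ids, 2):
        # The scoring is symmetric, so each pair is computed once and mirrored.
        score = smith_waterman_score(seqs[x], seqs[y])
        matrix[(x, y)] = matrix[(y, x)] = score
    return matrix


# Toy, made-up sequences for illustration only.
toy = {"p1": "MKTAYIAK", "p2": "KTAY", "p3": "GGGG"}
similarities = all_against_all(toy)
```

The number of pairs grows quadratically with the number of sequences, which is why pre-calculating and storing the scores, as the abstract describes, saves work compared with recomputing them for every analysis.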
