Orthology Detection Combining Clustering and Synteny for Very Large Datasets

Lechner, Marcus; Institut für Pharmazeutische Chemie, Philipps-Universität Marburg; Hernandez-Rosales, Maribel; Bioinformatics Group, Department of Computer Science, Universität Leipzig; Doerr, Daniel; Genome Informatics, Faculty of Technology, Bielefeld University; Wieseke, Nicolas; Faculty of Mathematics and Computer Science, University of Leipzig; Thévenin, Annelyse; Stoye, Jens; Hartmann, Roland K.; Prohaska, Sonja J.; Computational EvoDevo Group, Department of Computer Science, Universität Leipzig; Stadler, Peter F.; Institut für Theoretische Chemie, Fakultät für Chemie, University of Vienna

Published in

Public Library of Science, PLoS ONE, 9(8), p. e105015, 2014

DOI: 10.1371/journal.pone.0105015

Journal article published in 2014 by Marcus Lechner, Maribel Hernandez-Rosales, Daniel Doerr, Nicolas Wieseke, Annelyse Thévenin, Jens Stoye, Roland K. Hartmann, Sonja J. Prohaska, Peter F. Stadler

This paper is made freely available by the publisher.

Abstract

The elucidation of orthology relationships is an important step both in gene function prediction and in understanding patterns of sequence evolution. For large datasets, orthology assignments are usually derived directly from sequence similarities because more exact approaches are computationally too expensive. Here we present PoFF, an extension for the standalone tool Proteinortho, which enhances orthology detection by combining clustering, sequence similarity, and synteny. In the course of this work, FFAdj-MCS, a heuristic that assesses pairwise gene order using adjacencies (a similarity measure related to the breakpoint distance), was adapted to support multiple linear chromosomes and extended to detect duplicated regions. PoFF substantially reduces the number of false positives and enables more fine-grained predictions than purely similarity-based approaches. The extension maintains the low memory requirements and the efficient concurrency options of Proteinortho, on which it is based, making the software applicable to very large datasets.
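To make the notion of adjacencies more concrete, the following is a minimal sketch of how conserved adjacencies between two gene orders might be counted. It is not the FFAdj-MCS heuristic itself. It assumes that each gene label occurs exactly once per genome, that gene orientation is ignored, and that genes sharing a label are already known to correspond. The names `adjacencies`, `conserved_adjacencies`, and `breakpoints`, as well as the example genomes, are hypothetical and chosen for illustration only.

```python
from typing import FrozenSet, List, Set

# A genome is modelled as a list of linear chromosomes,
# each chromosome being an ordered list of gene labels (an assumption
# of this sketch, not a data format taken from the paper).
Genome = List[List[str]]


def adjacencies(genome: Genome) -> Set[FrozenSet[str]]:
    """Collect unordered pairs of neighbouring genes.

    Pairs never span chromosome ends, so multiple linear
    chromosomes are handled independently.
    """
    pairs: Set[FrozenSet[str]] = set()
    for chromosome in genome:
        for left, right in zip(chromosome, chromosome[1:]):
            pairs.add(frozenset((left, right)))
    return pairs


def conserved_adjacencies(a: Genome, b: Genome) -> int:
    """Number of adjacencies present in both genomes."""
    return len(adjacencies(a) & adjacencies(b))


def breakpoints(a: Genome, b: Genome) -> int:
    """Number of adjacencies of genome a that are disrupted in genome b."""
    return len(adjacencies(a) - adjacencies(b))


# Hypothetical toy genomes with two linear chromosomes each.
genome_a = [["g1", "g2", "g3", "g4"], ["g5", "g6"]]
genome_b = [["g1", "g2", "g4", "g3"], ["g6", "g5"]]

print(conserved_adjacencies(genome_a, genome_b))
print(breakpoints(genome_a, genome_b))
```

In this simplified setting, a high number of conserved adjacencies indicates similar gene order, while each breakpoint marks a neighbourhood that is not shared. Handling duplicated genes and combining gene order with sequence similarity, as the abstract describes, requires considerably more machinery than this sketch shows.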
