HPC-CLUST: distributed hierarchical clustering for large sets of nucleotide sequences

Matias Rodrigues, João F.; von Mering, Christian

Published in

Oxford University Press (OUP), Bioinformatics, 30(2), p. 287-288

DOI: 10.1093/bioinformatics/btt657

Journal article published in 2013 by João F. Matias Rodrigues, Christian von Mering

This paper is made freely available by the publisher.

Abstract

Motivation: Nucleotide sequence data are being produced at an ever-increasing rate. Clustering such sequences by similarity is often an essential first step in their analysis, intended to reduce redundancy, define gene families or suggest taxonomic units. Exact clustering algorithms, such as hierarchical clustering, scale relatively poorly in terms of run time and memory usage, yet they are desirable because heuristic shortcuts taken during clustering might have unintended consequences in later analysis steps.
