Published in

Humana Press, Methods in Molecular Biology, p. 141-155, 2011

DOI: 10.1007/978-1-61779-040-9_10

A Bioinformatics Pipeline for Sequence-Based Analyses of Fungal Biodiversity

Journal article published in 2011 by D. Lee Taylor, Shawn Houston
This paper is available in a repository.

Preprint: archiving allowed
Postprint: archiving allowed
Published version: archiving forbidden
Data provided by SHERPA/RoMEO

Abstract

The internal transcribed spacer (ITS) is the locus of choice with which to characterize fungal diversity in environmental samples. However, methods to analyze ITS datasets have lagged behind the capacity to generate large amounts of sequence information. Here, we describe our bioinformatics pipeline to process large fungal ITS sequence datasets, from raw chromatograms to a spreadsheet of operational taxonomic unit (OTU) abundances across samples. Steps include assembling reads that originate from one clone, identifying primer "barcodes" or "tags," trimming vectors and primers, marking low-quality base calls and removing low-quality sequences, orienting sequences, extracting the ITS region from longer amplicons, and grouping sequences into OTUs. We expect that the principles and tools presented here are relevant to datasets arising from ever-evolving new technologies.
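Three of the steps named in the abstract (orienting sequences, identifying tags and trimming primers, and tabulating OTU abundances across samples) can be illustrated with a minimal Python sketch. This is not the authors' pipeline or tooling: the function names, the toy primer and tag sequences, and the use of exact string matching are all assumptions made for illustration, and OTU labels are taken as given rather than computed by clustering.

```python
from collections import Counter, defaultdict

# Complement table for unambiguous bases plus N (illustrative; no IUPAC codes).
COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def orient(seq, forward_primer):
    """Return seq in forward orientation, or None if the primer is on neither strand.

    Assumption: the primer is matched exactly; real reads would need
    mismatch-tolerant matching.
    """
    if forward_primer in seq:
        return seq
    flipped = reverse_complement(seq)
    if forward_primer in flipped:
        return flipped
    return None


def assign_sample(seq, tags, forward_primer):
    """Identify the tag directly upstream of the primer and trim both.

    Returns (sample_name, trimmed_sequence), or None if no known tag is found.
    `tags` is a hypothetical mapping of tag sequence -> sample name.
    """
    pos = seq.find(forward_primer)
    if pos < 0:
        return None
    upstream = seq[:pos]
    for tag, sample in tags.items():
        if upstream.endswith(tag):
            return sample, seq[pos + len(forward_primer):]
    return None


def otu_table(assignments):
    """Tabulate (otu, sample) pairs into {otu: Counter({sample: count})}."""
    table = defaultdict(Counter)
    for otu, sample in assignments:
        table[otu][sample] += 1
    return table


if __name__ == "__main__":
    # Toy values, not real primers or barcodes.
    primer = "GATTACA"
    tags = {"AAGG": "soil_A", "CCTT": "soil_B"}
    read = reverse_complement("CCTT" + primer + "TTTCCCGGG")
    print(assign_sample(orient(read, primer), tags, primer))
    print(dict(otu_table([("OTU1", "soil_A"), ("OTU1", "soil_B")])))
```

Each row of the resulting table corresponds to one OTU and each counter key to one sample, which maps directly onto the spreadsheet layout the abstract describes.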