Diatoms (Bacillariophyta) are ubiquitous microalgae with a huge taxonomic diversity, and the composition of their communities changes with environmental conditions. This makes them excellent ecological indicators for a range of ecosystems and ecological questions (e.g. ecotoxicology, biomonitoring and paleo-environmental reconstruction). Current standardized methodologies for diatoms are based on microscopic identification, which is time-consuming and prone to identification uncertainties. DNA metabarcoding has been proposed as a way to avoid these flaws, as it enables the sequencing of a large quantity of barcodes from natural samples. A taxonomic identity is assigned to these barcodes by comparing their sequences to a barcoding reference library. However, to identify environmental sequences correctly, the reference database must contain enough reference sequences to ensure good coverage of diatom diversity. Moreover, the reference database needs to be carefully curated by expert taxonomists, as its content has a direct impact on species detection. Diat.barcode is an open-access library that links diatom taxonomic identities to rbcL barcode sequences (rbcL is a chloroplast marker suitable for species-level identification of diatoms); it has been maintained since 2012. Data are accumulated from three sources: (1) the NCBI nucleotide database, (2) unpublished sequencing data from culture collections and, more recently, (3) environmental sequences. Since 2017, an international network of experts in diatom taxonomy has curated this library. The latest version of the database (version 9.2) includes 8066 entries that correspond to more than 280 different genera and 1490 different species. In addition to the taxonomic information, morphological features (e.g. biovolumes, chloroplasts), life-forms (mobility, colony type) and ecological features (taxa preferences with respect to pollution) are given.
The database can be downloaded from the website (www6.inrae.fr/carrtel-collection/Barcoding-database/) or directly through the R package diatbarcode. Ready-to-use files for commonly used metabarcoding pipelines (Mothur and DADA2) are also available.
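As an illustration of how such a ready-to-use file might be inspected, the sketch below parses a FASTA stream whose headers are semicolon-separated taxonomic ranks, the general layout used by DADA2 training sets, and counts entries per rank. The exact rank order and header layout of the Diat.barcode files are not described here, so the rank index, the function names and the toy records (placeholder taxa and made-up sequences) are assumptions for illustration only.

```python
from collections import Counter
from io import StringIO


def read_taxonomy_fasta(handle):
    """Yield (ranks, sequence) pairs from a FASTA stream whose headers
    are semicolon-separated taxonomic ranks (DADA2 training-set style)."""
    ranks, chunks = None, []
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if ranks is not None:
                yield ranks, "".join(chunks)
            ranks = [r for r in line[1:].split(";") if r]
            chunks = []
        else:
            # Sequences may be wrapped over several lines; normalise case.
            chunks.append(line.upper())
    if ranks is not None:
        yield ranks, "".join(chunks)


def count_by_rank(records, rank_index):
    """Count records per taxon at the given position in the rank list."""
    return Counter(
        ranks[rank_index] for ranks, _ in records if len(ranks) > rank_index
    )


# Toy input: placeholder taxa and made-up sequences, NOT real Diat.barcode data.
example = (
    ">Eukaryota;PhylumX;GenusA;GenusA_sp1\nACGT\nacgt\n"
    ">Eukaryota;PhylumX;GenusA;GenusA_sp2\nTTGA\n"
    ">Eukaryota;PhylumX;GenusB;GenusB_sp1\nGGCC\n"
)
records = list(read_taxonomy_fasta(StringIO(example)))
genus_counts = count_by_rank(records, rank_index=2)  # assumed genus position
```

With a real file, `StringIO(example)` would be replaced by an open file handle, and `rank_index` would need to be set according to the rank order actually used in the downloaded file.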