Dissemin is shutting down on January 1st, 2025

Published in

Wiley, Molecular Ecology Resources, 2023

DOI: 10.1111/1755-0998.13859

Links

Tools

Export citation

Search in Google Scholar

Scaling‐up RADseq methods for large datasets of non‐invasive samples: Lessons for library construction and data preprocessing

This paper was not found in any repository, but could be made available legally by the author.
This paper was not found in any repository, but could be made available legally by the author.

Full text: Unavailable

Green circle
Preprint: archiving allowed
Orange circle
Postprint: archiving restricted
Red circle
Published version: archiving forbidden
Data provided by SHERPA/RoMEO

Abstract

AbstractGenetic non‐invasive sampling (gNIS) is a critical tool for population genetics studies, supporting conservation efforts while imposing minimal impacts on wildlife. However, gNIS often presents variable levels of DNA degradation and non‐endogenous contamination, which can incur considerable processing costs. Furthermore, the use of restriction‐site‐associated DNA sequencing methods (RADseq) for assessing thousands of genetic markers introduces the challenge of obtaining large sets of shared loci with similar coverage across multiple individuals. Here, we present an approach to handling large‐scale gNIS‐based datasets using data from the spotted hyena population inhabiting the Ngorongoro Crater in Tanzania. We generated 3RADseq data for more than a thousand individuals, mostly from faecal mucus samples collected non‐invasively and varying in DNA degradation and contamination level. Using small‐scale sequencing, we screened samples for endogenous DNA content, removed highly contaminated samples, confirmed overlap fragment length between libraries, and balanced individual representation in a sequencing pool. We evaluated the impact of (1) DNA degradation and contamination of non‐invasive samples, (2) PCR duplicates and (3) different SNP filters on genotype accuracy based on Mendelian error estimated for parent–offspring trio datasets. Our results showed that when balanced for sequencing depth, contaminated samples presented similar genotype error rates to those of non‐contaminated samples. We also showed that PCR duplicates and different SNP filters impact genotype accuracy. In summary, we showed the potential of using gNIS for large‐scale genetic monitoring based on SNPs and demonstrated how to improve control over library preparation by using a weighted re‐pooling strategy that considers the endogenous DNA content.