reGenotyper: Detecting mislabeled samples in genetic data

Zych, Konrad; Snoek, Basten L.; Elvin, Mark; Rodriguez, Miriam; Velde, K. J. Van Der; Arends, Danny; Westra, Harm-Jan; Swertz, Morris A.; Poulin, Gino; Kammenga, Jan E.; Breitling, Rainer; Jansen, Ritsert C.; Li, Yang; Rutherford, Suzannah

Published in

Public Library of Science, PLoS ONE, 12(2), p. e0171324, 2017

DOI: 10.1371/journal.pone.0171324

Abstract

In high-throughput molecular profiling studies, genotype labels can be wrongly assigned at various experimental steps; the resulting mislabeled samples seriously reduce the power to detect the genetic basis of phenotypic variation. We have developed an approach to detect potential mislabeling, recover the “ideal” genotype and identify “best-matched” labels for mislabeled samples. On average, we identified 4% of samples as mislabeled in eight published datasets, highlighting the necessity of applying a “data cleaning” step before standard data analysis.
