Published in

De Gruyter, Statistical Applications in Genetics and Molecular Biology, 1(6)

DOI: 10.2202/1544-6115.1251

Using Duplicate Genotyped Data in Genetic Analyses: Testing Association and Estimating Error Rates

Journal article published in 2007 by Nathan L. Tintle, Derek Gordon, Francis J. McMahon, Stephen J. Finch
Distributing this paper is prohibited by the publisher

Full text: Unavailable

Preprint: archiving forbidden
Postprint: archiving forbidden
Published version: archiving restricted
Data provided by SHERPA/RoMEO

Abstract

Although researchers use duplicate genotyped data to calculate an inconsistency rate, there is no power analysis to assess the value of the duplicate data. In this paper, we present a model in which the genotyping error rate is related to the inconsistency rate. We extend the g genotype by h phenotype chi-squared test to incorporate the duplicate genotyped data. When a subject is inconsistently genotyped (that is, has two observed genotypes), our procedure is to allocate 0.5 units to each of the two genotypes. We specify the multivariate analysis of variance (MANOVA) test comparing these extended counts. We provide freely available software for this test and also for a permutation test used on small samples. A simulation study shows that the asymptotic null distribution of the MANOVA test holds when the total number of subjects, N, is at least 300. We also document with a simulation study that the asymptotic distribution of this test under various alternative hypotheses is a satisfactory approximation to the simulated power. In all cases, the power of the MANOVA test using the duplicate genotyped data is greater than the power of the chi-squared test ignoring the duplicate data. Power increases ranged from 0.776% to 4.652% for 80% powered tests and 0.292% to 2.028% for 95% powered tests. Researchers now can compute the value of the duplicate genotyped data as part of the design of the study.
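
The allocation rule in the abstract (0.5 units to each of the two observed genotypes for an inconsistently genotyped subject) can be illustrated with a minimal Python sketch. It only builds the extended g genotype by h phenotype count table; it does not implement the MANOVA test or the permutation test, whose details the abstract does not give. The function names, the (first call, second call, phenotype) record layout, the genotype and phenotype labels, and the sample data are all hypothetical. The inconsistency rate is computed here as the share of duplicate-genotyped subjects whose two calls disagree, which is an assumed definition rather than one stated in the abstract.

```python
# Hypothetical labels: g = 3 genotypes, h = 2 phenotypes.
GENOTYPES = ["AA", "AB", "BB"]
PHENOTYPES = ["case", "control"]


def extended_counts(subjects, genotypes, phenotypes):
    """Build a g-by-h table of extended counts.

    Each subject is (first_call, second_call, phenotype); second_call is
    None when the subject was genotyped only once. A subject with two
    different calls contributes 0.5 to each of the two genotypes.
    """
    table = {g: {p: 0.0 for p in phenotypes} for g in genotypes}
    for first, second, phenotype in subjects:
        if second is None or first == second:
            # Single call, or two consistent calls: one full unit.
            table[first][phenotype] += 1.0
        else:
            # Inconsistent duplicate: split the unit between both calls.
            table[first][phenotype] += 0.5
            table[second][phenotype] += 0.5
    return table


def inconsistency_rate(subjects):
    """Share of duplicate-genotyped subjects with discordant calls
    (assumed definition, not taken from the paper)."""
    duplicated = [(a, b) for a, b, _ in subjects if b is not None]
    if not duplicated:
        return 0.0
    discordant = sum(1 for a, b in duplicated if a != b)
    return discordant / len(duplicated)


# Made-up sample data, for illustration only.
subjects = [
    ("AA", None, "case"),
    ("AA", "AA", "case"),
    ("AA", "AB", "case"),      # inconsistent: 0.5 to AA, 0.5 to AB
    ("AB", "AB", "control"),
    ("BB", "AB", "control"),   # inconsistent: 0.5 to BB, 0.5 to AB
    ("BB", None, "control"),
]

table = extended_counts(subjects, GENOTYPES, PHENOTYPES)
rate = inconsistency_rate(subjects)

for genotype in GENOTYPES:
    print(genotype, [table[genotype][p] for p in PHENOTYPES])
print("inconsistency rate:", rate)
```

Because every subject contributes exactly one unit in total, the entries of the extended table still sum to the number of subjects N, so the table has the same margins in total as the one used by the ordinary chi-squared test.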