Two-stage approach for identifying single-nucleotide polymorphisms associated with rheumatoid arthritis using random forests and Bayesian networks

Meng, Yan; Yang, Qiong; Cuenco, Karen T.; Cupples, L. Adrienne; DeStefano, Anita L.; Lunetta, Kathryn L.

Published in

BioMed Central, BMC Proceedings, S1(1), 2007

DOI: 10.1186/1753-6561-1-s1-s56


Journal article published in 2007 by Yan Meng, Qiong Yang, Karen T. Cuenco, L. Adrienne Cupples, Anita L. DeStefano, Kathryn L. Lunetta

This paper is made freely available by the publisher.


Abstract

We used the simulated data set from Genetic Analysis Workshop 15 Problem 3 to assess a two-stage approach for identifying single-nucleotide polymorphisms (SNPs) associated with rheumatoid arthritis (RA). In the first stage, we used random forests (RF) to screen large amounts of genetic data using the variable importance measure, which takes into account SNP interaction effects as well as main effects without requiring model specification. We used the 9187 simulated SNPs mimicking a 10 K SNP chip, along with the covariates DR (the simulated DRB1 genotype), smoking, and sex, as input to the RF analyses, with a training set consisting of 750 unrelated RA cases and 750 controls. We used an iterative RF screening procedure to identify a smaller set of variables for further analysis. In the second stage, we used the software program CaMML, which produces Bayesian networks, to develop complex etiologic models for RA risk from the variables identified by our RF screening procedure. We evaluated the performance of this method using independent test data sets for up to 100 replicates.
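The abstract does not spell out the iterative screening procedure, so the following is only a minimal sketch of what such a loop might look like, not the authors' implementation. It assumes that each round ranks the remaining variables by an importance score and keeps a fixed fraction of them until a target number remains; `keep_fraction`, `target_size`, and all function names are hypothetical. To stay self-contained it also swaps in a simple case-control mean difference as the score, which captures main effects only and is not the random forest variable importance measure. In practice the `importance` argument would wrap an RF fit.

```python
import random
from typing import Callable, Dict, List, Sequence

Row = Dict[str, int]  # variable name -> coded value (e.g. genotype 0, 1, or 2)
ImportanceFn = Callable[[Sequence[Row], Sequence[int], Sequence[str]], Dict[str, float]]


def mean_difference_importance(rows, labels, variables):
    """Stand-in score (NOT RF importance): |mean in cases - mean in controls|."""
    cases = [r for r, y in zip(rows, labels) if y == 1]
    controls = [r for r, y in zip(rows, labels) if y == 0]
    return {
        v: abs(sum(r[v] for r in cases) / len(cases)
               - sum(r[v] for r in controls) / len(controls))
        for v in variables
    }


def iterative_screen(rows, labels, variables, importance: ImportanceFn,
                     keep_fraction: float = 0.5, target_size: int = 10) -> List[str]:
    """Repeatedly re-score the surviving variables and drop the lowest ranked.

    keep_fraction and target_size are assumed tuning knobs, not values
    taken from the paper.
    """
    if not 0.0 < keep_fraction < 1.0:
        raise ValueError("keep_fraction must be strictly between 0 and 1")
    current = list(variables)
    while len(current) > target_size:
        scores = importance(rows, labels, current)  # re-score on the reduced set
        ranked = sorted(current, key=lambda v: scores[v], reverse=True)
        n_keep = max(target_size, int(len(ranked) * keep_fraction))
        current = ranked[:n_keep]
    return current


def simulate_toy_data(n_per_group: int, n_snps: int, causal: str, seed: int):
    """Toy case-control genotypes; only `causal` differs in allele frequency."""
    rng = random.Random(seed)
    names = [f"snp_{i}" for i in range(n_snps)]
    rows, labels = [], []
    for label in (1, 0):  # 1 = case, 0 = control
        for _ in range(n_per_group):
            row = {}
            for name in names:
                p = 0.6 if (name == causal and label == 1) else 0.3
                # Genotype = number of risk alleles from two independent draws.
                row[name] = int(rng.random() < p) + int(rng.random() < p)
            rows.append(row)
            labels.append(label)
    return rows, labels, names


if __name__ == "__main__":
    rows, labels, names = simulate_toy_data(200, 100, causal="snp_7", seed=0)
    selected = iterative_screen(rows, labels, names, mean_difference_importance)
    print(selected)
```

Re-scoring after each reduction, rather than ranking once, is the point of the iteration: importance estimates for the surviving variables are recomputed without the noise contributed by the discarded ones. The reduced list would then be passed to the second-stage network learner.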
