USING PharmGKB TO TRAIN TEXT MINING APPROACHES FOR IDENTIFYING POTENTIAL GENE TARGETS FOR PHARMACOGENOMIC STUDIES

Pakhomov, S.; Mcinnes, B. T.; Lamba, J.; Liu, Y.; Melton, G. B.; Ghodke, Y.; Bhise, N.; Lamba, V.; Birnbaum, A. K.

Published in

Elsevier, Journal of Biomedical Informatics, 5(45), p. 862-869, 2012

DOI: 10.1016/j.jbi.2012.04.007


Journal article published in 2012 by S. Pakhomov, B. T. Mcinnes, J. Lamba, Y. Liu, G. B. Melton, Y. Ghodke, N. Bhise, V. Lamba, A. K. Birnbaum

This paper is made freely available by the publisher.


Abstract

The main objective of this study was to investigate the feasibility of using PharmGKB, a pharmacogenomic database, as a source of training data, in combination with the text of MEDLINE abstracts, for a text mining approach to identifying potential gene targets for pathway-driven pharmacogenomics research. We used the manually curated relations between drugs and genes in the PharmGKB database to train a support vector machine predictive model and applied this model prospectively to MEDLINE abstracts. The gene targets suggested by this approach were subsequently manually reviewed. Our quantitative analysis showed that support vector machine classifiers trained on MEDLINE abstracts, with single words (unigrams) used as features and PharmGKB relations used for supervision, achieved an overall sensitivity of 85% and specificity of 69%. The subsequent qualitative analysis showed that gene targets “suggested” by the automatic classifier were not anticipated by expert reviewers but were subsequently found to be relevant to the three drugs that were investigated: carbamazepine, lamivudine and zidovudine. Our results show that this approach is not only feasible but may also find new gene targets not identifiable by other methods, making it a valuable tool for pathway-driven pharmacogenomics research.
