Comparison of linear and nonlinear classification algorithms for the prediction of drug and chemical metabolism by human UDP-glucuronosyltransferase isoforms

Sorich, Michael J.; Miners, John O.; Winkler, David A.; McKinnon, Ross A.; Smith, Paul A.; Burden, Frank R.

Published in

Wiley-VCH Verlag, ChemInform, 5(35), 2004

DOI: 10.1002/chin.200405242

American Chemical Society, Journal of Chemical Information and Modeling, 6(43), p. 2019-2024

DOI: 10.1021/ci034108k



Journal article published in 2003 by Michael J. Sorich, John O. Miners, David A. Winkler, Ross A. McKinnon, Paul A. Smith, Frank R. Burden


Abstract

Partial least squares discriminant analysis (PLSDA), Bayesian regularized artificial neural network (BRANN), and support vector machine (SVM) methodologies were compared by their ability to classify substrates and nonsubstrates of 12 isoforms of human UDP-glucuronosyltransferase (UGT), an enzyme "superfamily" involved in the metabolism of drugs, nondrug xenobiotics, and endogenous compounds. Simple two-dimensional descriptors were used to capture chemical information. For each data set, 70% of the data were used for training, and the remainder were used to assess the generalization performance. In general, the SVM methodology was able to produce models with the best predictive performance, followed by BRANN and then PLSDA. However, a small number of data sets showed either equivalent or better predictability using PLSDA, which may indicate relatively linear relationships in these data sets. All SVM models showed predictive ability (>60% of test set predicted correctly) and five out of the 12 test sets showed excellent prediction (>80% prediction accuracy). These models represent the first use of pattern recognition methods to discriminate between substrates and nonsubstrates of human drug metabolizing enzymes and the first thorough assessment of three classification algorithms using multiple metabolic data sets.
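The evaluation protocol described in the abstract (70% of each data set for training, the remainder for testing, and accuracy thresholds of 60% and 80%) can be sketched roughly as follows. This is a minimal illustration using only the Python standard library, not the authors' code: the function names, the random shuffling, the seed, and the "not predictive" label for accuracies at or below 60% are all assumptions made for the example, and no classifier (PLSDA, BRANN, or SVM) is implemented here.

```python
import random


def split_train_test(items, train_fraction=0.7, seed=0):
    """Shuffle a copy of items and split it into (train, test).

    The 0.7 default follows the abstract; shuffling with a fixed seed
    is an assumption, since the abstract does not say how compounds
    were assigned to each subset.
    """
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    n_train = round(len(shuffled) * train_fraction)
    return shuffled[:n_train], shuffled[n_train:]


def accuracy(true_labels, predicted_labels):
    """Return the fraction of test-set labels predicted correctly."""
    if len(true_labels) != len(predicted_labels):
        raise ValueError("label lists must have the same length")
    if not true_labels:
        raise ValueError("cannot score an empty test set")
    correct = sum(t == p for t, p in zip(true_labels, predicted_labels))
    return correct / len(true_labels)


def describe(acc):
    """Map an accuracy to the two thresholds quoted in the abstract.

    The label strings are hypothetical names chosen for this sketch.
    """
    if acc > 0.80:
        return "excellent"
    if acc > 0.60:
        return "predictive"
    return "not predictive"
```

In use, a model would be fitted on the first list returned by split_train_test and its predictions for the second list passed to accuracy, with 1 marking a substrate and 0 a nonsubstrate (an encoding assumed here for illustration).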
