Elsevier, Carbohydrate Research, 346(7), pp. 960-972
DOI: 10.1016/j.carres.2011.02.017
A machine learning approach was explored for predicting the anomeric configuration, residues, and linkage types of disaccharides from ¹³C NMR chemical shifts. The study used 154 pyranosyl disaccharides: dimers of the α or β anomers of D-glucose, D-galactose, or D-mannose residues bonded through α or β glycosidic linkages of type 1→2, 1→3, 1→4, or 1→6, together with methoxylated disaccharides. The ¹³C NMR chemical shifts of the training set were calculated with the CASPER (Computer Assisted SPectrum Evaluation of Regular polysaccharides) program, whereas the chemical shifts of the test set were experimental values taken from the literature. Three experiments were performed: (1) classification of the anomeric configuration, (2) classification of the type of linkage, and (3) classification of the residues. Using unassigned chemical shifts, classification trees correctly classified 67%, 74%, and 38% of the test set for the three tasks, respectively. Random Forests applied to the same three tasks correctly classified 93%, 90%, and 68%, respectively.
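Because the chemical shifts are unassigned, a classifier needs an input that does not depend on which carbon produced which signal. The abstract does not say how the shifts were encoded, so the following is only a minimal sketch of one possible encoding: counting signals in fixed-width ppm bins. The function names `shift_histogram` and `accuracy`, the 50-110 ppm window, the 5 ppm bin width, and the example shift values are all illustrative assumptions, not taken from the paper.

```python
def shift_histogram(shifts, lo=50.0, hi=110.0, width=5.0):
    """Count 13C signals (ppm) per fixed-width bin.

    The result has a fixed length and is independent of the order of
    the input list, so no signal assignment is required.
    The window and bin width are assumed values, not the paper's.
    """
    n_bins = int((hi - lo) / width)
    counts = [0] * n_bins
    for ppm in shifts:
        if lo <= ppm < hi:  # signals outside the window are ignored
            counts[int((ppm - lo) // width)] += 1
    return counts


def accuracy(predicted, actual):
    """Fraction of test objects whose predicted class equals the true class."""
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    hits = sum(p == a for p, a in zip(predicted, actual))
    return hits / len(actual)


# Made-up shift list in typical ranges for a disaccharide (12 carbons);
# these are NOT values from the paper or from CASPER.
example_shifts = [103.2, 96.4, 79.5, 76.6, 76.3, 75.1,
                  74.7, 73.9, 72.0, 70.2, 61.7, 60.9]
vector = shift_histogram(example_shifts)
```

A vector of this kind could serve as the feature row for a classification tree or a Random Forest, and `accuracy` corresponds to the percentage of correctly classified test-set objects reported above when multiplied by 100.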