Oxford University Press (OUP), Bioinformatics, 5(25), p. 643-649
DOI: 10.1093/bioinformatics/btn662
Motivation: Automatic classification of high-resolution mass spectrometry proteomic data has increasing potential in the early diagnosis of cancer. We propose a new procedure for biomarker discovery in serum protein profiles based on: (i) discrete wavelet transformation of the spectra; (ii) selection of discriminative wavelet coefficients by a statistical test; and (iii) building and evaluating a support vector machine classifier by double cross-validation, with attention to the generalizability of the results. In addition to the evaluation results (total recognition rate, sensitivity and specificity), the procedure provides the biomarker patterns, i.e. the parts of the spectra that discriminate between cancer patients and control individuals. The evaluation was performed on matrix-assisted laser desorption ionization time-of-flight (MALDI-TOF) serum protein profiles of 66 colorectal cancer patients and 50 controls.
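
The three steps of the procedure can be illustrated with a minimal, standard-library Python sketch. It is not the authors' implementation, and several choices are assumptions made only for illustration: a Haar wavelet, Welch's t-statistic as the statistical test, a nearest-centroid classifier standing in for the support vector machine, the number of selected coefficients `k` as the tuned parameter, and synthetic 16-point "spectra". All function names are hypothetical. The point it tries to show is the double cross-validation structure: coefficient selection and parameter tuning see only the training part of each outer fold.

```python
import math
import random
from statistics import mean, variance

def haar_dwt(signal):
    """Full Haar decomposition (length must be a power of two).
    Returns [final approximation] + detail coefficients, coarsest first."""
    approx, details = list(signal), []
    while len(approx) > 1:
        pairs = list(zip(approx[0::2], approx[1::2]))
        details = [(a - b) / math.sqrt(2) for a, b in pairs] + details
        approx = [(a + b) / math.sqrt(2) for a, b in pairs]
    return approx + details

def welch_t(xs, ys):
    """Absolute Welch t-statistic of one coefficient across two groups."""
    se = math.sqrt(variance(xs) / len(xs) + variance(ys) / len(ys))
    return abs(mean(xs) - mean(ys)) / se if se > 0 else 0.0

def select(X, y, k):
    """Indices of the k coefficients with the largest |t| on (X, y)."""
    scores = []
    for j in range(len(X[0])):
        pos = [row[j] for row, label in zip(X, y) if label == 1]
        neg = [row[j] for row, label in zip(X, y) if label == 0]
        scores.append(welch_t(pos, neg))
    return sorted(range(len(scores)), key=scores.__getitem__, reverse=True)[:k]

def centroid_predict(Xtr, ytr, Xte, idx):
    """Stand-in classifier (NOT an SVM): nearest class centroid on idx."""
    cents = {}
    for c in (0, 1):
        rows = [r for r, label in zip(Xtr, ytr) if label == c]
        cents[c] = [mean(r[j] for r in rows) for j in idx]
    preds = []
    for r in Xte:
        d = {c: sum((r[j] - m) ** 2 for j, m in zip(idx, cents[c])) for c in (0, 1)}
        preds.append(min(d, key=d.get))
    return preds

def folds(n, n_folds):
    """Deterministic interleaved folds over positions 0..n-1."""
    return [[i for i in range(n) if i % n_folds == f] for f in range(n_folds)]

def accuracy(X, y, train, test, k):
    """Select coefficients on the training samples only, then score on test."""
    Xtr, ytr = [X[i] for i in train], [y[i] for i in train]
    idx = select(Xtr, ytr, k)
    preds = centroid_predict(Xtr, ytr, [X[i] for i in test], idx)
    return sum(p == y[i] for p, i in zip(preds, test)) / len(test)

def double_cv(X, y, ks=(1, 2, 4), n_outer=5, n_inner=4):
    """Outer loop estimates accuracy; inner loop tunes k on outer-training data."""
    outer_acc = []
    for test in folds(len(X), n_outer):
        train = [i for i in range(len(X)) if i not in test]

        def inner_score(k):
            scores = []
            for part in folds(len(train), n_inner):
                val = [train[p] for p in part]
                fit = [i for i in train if i not in val]
                scores.append(accuracy(X, y, fit, val, k))
            return mean(scores)

        best_k = max(ks, key=inner_score)
        outer_acc.append(accuracy(X, y, train, test, best_k))
    return mean(outer_acc)

# Synthetic data (an assumption for illustration): class 1 has a raised region.
random.seed(0)

def fake_spectrum(label):
    s = [random.gauss(0, 1) for _ in range(16)]
    if label == 1:
        s[4] += 3.0
        s[5] += 3.0
    return s

y = [i % 2 for i in range(40)]
X = [haar_dwt(fake_spectrum(label)) for label in y]
print("estimated recognition rate:", double_cv(X, y))
```

Because each selected index maps back to a wavelet scale and position, the coefficients returned by `select` point to the region of the spectrum that separates the groups, which is the sense in which the procedure yields biomarker patterns. A real analysis would likely use a dedicated wavelet library and an SVM implementation in place of the stand-ins here.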