The objective of this study was to develop a medical language processing (MLP) system, consisting of MedLEE and a set of inference rules, to identify 19 Charlson comorbidities from discharge summaries and chest x-ray reports. We used 233 cases to learn patterns indicative of comorbidities and to develop the inference rules. We then applied the MLP system to an independent data set of 3,662 pneumonia patients and compared the comorbidities it identified with those recorded in administrative data (ICD-9 codes). A stratified random sample of 190 records from the disagreement cases was manually reviewed. The sensitivity, specificity, and accuracy for the MLP system/ICD-9 codes in this testing set were 0.84/0.16, 0.70/0.30, and 0.77/0.23, respectively. Thirteen of the 19 comorbidities studied were underreported in the administrative data. Kappa values ranged from 0.19 for peptic ulcer to 0.70 for lymphoma. We conclude that comorbidities derived from natural language processing of medical records can improve ICD-9-based approaches.
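As an aside, the per-comorbidity agreement reported above is a Cohen's kappa between two binary indicators (MLP-derived and ICD-9-derived flags for the same patients). The sketch below is a minimal, generic illustration of that calculation; the flag vectors and variable names are hypothetical and are not taken from the study data.

```python
# Illustrative sketch: Cohen's kappa for agreement between two binary
# comorbidity indicators (e.g., MLP-derived vs. ICD-9-derived flags for one
# Charlson comorbidity). All data below are hypothetical.

def cohens_kappa(a, b):
    """Cohen's kappa for two equal-length sequences of 0/1 labels."""
    assert len(a) == len(b) and len(a) > 0
    n = len(a)
    # Observed agreement: proportion of patients on which the two sources agree.
    p_o = sum(1 for x, y in zip(a, b) if x == y) / n
    # Expected (chance) agreement from each source's marginal prevalence.
    pa1, pb1 = sum(a) / n, sum(b) / n
    p_e = pa1 * pb1 + (1 - pa1) * (1 - pb1)
    return (p_o - p_e) / (1 - p_e) if p_e < 1 else 1.0

# Hypothetical flags for 10 patients: 1 = comorbidity present, 0 = absent.
mlp_flags  = [1, 0, 1, 1, 0, 0, 1, 0, 1, 0]
icd9_flags = [1, 0, 0, 1, 0, 0, 1, 0, 0, 0]
print(f"kappa = {cohens_kappa(mlp_flags, icd9_flags):.2f}")  # kappa = 0.60
```

In this toy example the two sources agree on 8 of 10 patients (observed agreement 0.80) but differ in prevalence (0.5 vs. 0.3), giving an expected chance agreement of 0.50 and hence a kappa of 0.60, which falls between the study's reported extremes of 0.19 and 0.70.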