Published in

Elsevier, Journal of Investigative Dermatology, 8(132), p. 2005-2009, 2012

DOI: 10.1038/jid.2012.98

Validation of claims data algorithms to identify nonmelanoma skin cancer

This paper is made freely available by the publisher.

Abstract

Health maintenance organization (HMO) administrative databases have been used as sampling frames for ascertaining nonmelanoma skin cancer (NMSC). However, because of the lack of tumor registry information on these cancers, these ascertainment methods have not been previously validated. NMSC cases arising from patients served by a staff model medical group and diagnosed between 1 January 2007 and 31 December 2008 were identified from claims data using three ascertainment strategies. These claims data cases were then compared with NMSC identified using natural language processing (NLP) of electronic pathology reports (EPRs), and sensitivity, specificity, positive and negative predictive values were calculated. Comparison of claims data-ascertained cases with the NLP demonstrated sensitivities ranging from 48 to 65% and specificities from 85 to 98%, with ICD-9-CM ascertainment demonstrating the highest case sensitivity, although the lowest specificity. HMO health plan claims data had a higher specificity than all-payer claims data. A comparison of EPR and clinic log registry cases showed a sensitivity of 98% and a specificity of 99%. Validation of administrative data to ascertain NMSC demonstrates respectable sensitivity and specificity, although NLP ascertainment was superior. There is a substantial difference in cases identified by NLP compared with claims data, suggesting that formal surveillance efforts should be considered.
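The four measures named in the abstract follow from a 2x2 comparison of claims-ascertained cases against the NLP/EPR reference standard. The following is a minimal sketch of that calculation, not the authors' code: the function names, the use of patient-ID sets, and the assumption that both case sets are subsets of a known enrolled population are all illustrative, and any IDs passed in are hypothetical rather than study data.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """2x2 counts, with the NLP/EPR case list treated as the reference standard."""
    tp: int  # flagged by claims and present in the reference
    fp: int  # flagged by claims but absent from the reference
    fn: int  # present in the reference but missed by claims
    tn: int  # neither flagged by claims nor present in the reference


def tabulate(claims_ids, reference_ids, population_ids):
    """Build the 2x2 counts from sets of patient identifiers.

    Assumption (not from the paper): claims_ids and reference_ids are
    subsets of population_ids, the enrolled population under study.
    """
    claims, reference, population = set(claims_ids), set(reference_ids), set(population_ids)
    return ConfusionCounts(
        tp=len(claims & reference),
        fp=len(claims - reference),
        fn=len(reference - claims),
        tn=len(population - claims - reference),
    )


def validation_metrics(c):
    """Return sensitivity, specificity, PPV and NPV as fractions in [0, 1].

    Raises ZeroDivisionError if a denominator is empty, for example when
    the reference standard contains no cases.
    """
    return {
        "sensitivity": c.tp / (c.tp + c.fn),
        "specificity": c.tn / (c.tn + c.fp),
        "ppv": c.tp / (c.tp + c.fp),
        "npv": c.tn / (c.tn + c.fn),
    }
```

Applying `tabulate` once per ascertainment strategy and passing each result to `validation_metrics` would give one row of measures per strategy, which is the kind of side-by-side comparison the abstract summarizes.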