Published in

Oxford University Press, JAMIA: A Scholarly Journal of Informatics in Health and Biomedicine, 3(23), p. 627-634, 2015

DOI: 10.1093/jamia/ocv156

Assessing race and ethnicity data quality across cancer registries and EMRs in two hospitals

Journal article published in 2015 by Simon J. Craddock Lee, James E. Grobe, Jasmin A. Tiro
This paper is made freely available by the publisher.

Preprint: archiving allowed
Postprint: archiving restricted
Published version: archiving forbidden
Data provided by SHERPA/RoMEO

Abstract

Background: Measurement of patient race/ethnicity in electronic health records is mandated and important for tracking health disparities.

Objective: Characterize the quality of race/ethnicity data collection efforts.

Methods: For all cancer patients diagnosed (2007–2010) at two hospitals, we extracted demographic data from five sources: 1) a university hospital cancer registry, 2) a university electronic medical record (EMR), 3) a community hospital cancer registry, 4) a community EMR, and 5) a joint clinical research registry. The patients whose data we examined (N = 17 834) contributed 41 025 entries (range: 2–5 per patient across sources), and the source comparisons generated 1–10 unique pairs per patient. We used generalized estimating equations, chi-square tests, and kappa estimates to assess data availability and agreement.

Results: Compared to sex and insurance status, race/ethnicity information was significantly less likely to be available (χ² > 8043, P < .001), with variation across sources (χ² > 10 589, P < .001). The university EMR had a high prevalence of “Unknown” values. The aggregate kappa estimate across the sources was 0.45 (95% confidence interval, 0.45–0.45; N = 31 276 unique pairs), but it improved in sensitivity analyses that excluded the university EMR source (κ = 0.89). Race/ethnicity data were in complete agreement for only 6988 patients (39.2%). Pairs with a “Black” data value in one of the sources had the highest agreement (95.3%), whereas pairs with an “Other” value exhibited the lowest agreement across sources (11.1%).

Discussion: Our findings suggest that high-quality race/ethnicity data are attainable. Many of the “errors” in race/ethnicity data are caused by missing or “Unknown” data values.

Conclusions: To facilitate transparent reporting of healthcare delivery outcomes by race/ethnicity, healthcare systems need to monitor and enforce race/ethnicity data collection standards.
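
The pair counts and kappa statistic reported in the abstract can be illustrated with a minimal Python sketch. It is not the authors' analysis: the abstract does not give the exact procedure behind the aggregate kappa, so the code below assumes a plain two-source Cohen's kappa. The function names `source_pairs` and `cohen_kappa`, the source labels, and the sample values are all hypothetical. The sketch shows why 2–5 sources per patient yield 1–10 unique pairs, and how “Unknown” values in one source pull agreement down.

```python
from collections import Counter
from itertools import combinations


def source_pairs(sources):
    """Return every unordered pair of sources available for one patient."""
    return list(combinations(sorted(sources), 2))


def cohen_kappa(ratings_a, ratings_b):
    """Cohen's kappa for two equal-length lists of categorical labels.

    Assumption: simple unweighted two-rater kappa, used here only as an
    illustration of chance-corrected agreement.
    """
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("need two non-empty lists of equal length")
    n = len(ratings_a)
    # Observed agreement: share of patients with identical values.
    observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n
    # Expected agreement by chance, from each source's marginal counts.
    counts_a = Counter(ratings_a)
    counts_b = Counter(ratings_b)
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both sources use one identical category throughout
    return (observed - expected) / (1 - expected)


# Hypothetical source labels: two sources give 1 pair, five give 10.
two_sources = ["university_registry", "university_emr"]
five_sources = two_sources + ["community_registry", "community_emr",
                              "research_registry"]
print(len(source_pairs(two_sources)), len(source_pairs(five_sources)))

# Hypothetical values for six patients; the EMR records two as "Unknown".
registry = ["Black", "White", "White", "Hispanic", "Black", "White"]
emr = ["Black", "Unknown", "White", "Hispanic", "Black", "Unknown"]
print(round(cohen_kappa(registry, emr), 3))
```

In this made-up sample the only disagreements come from “Unknown” entries, which mirrors the abstract's observation that many apparent errors are missing or “Unknown” values.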