Published in

Public Library of Science, PLoS ONE, 3(16), p. e0248360, 2021

DOI: 10.1371/journal.pone.0248360

Integrating human services and criminal justice data with claims data to predict risk of opioid overdose among Medicaid beneficiaries: A machine-learning approach

This paper is made freely available by the publisher.

Preprint: archiving allowed
Postprint: archiving allowed
Published version: archiving allowed
Data provided by SHERPA/RoMEO

Abstract

Health system data incompletely capture the social risk factors for drug overdose. This study aimed to improve the accuracy of a machine-learning algorithm to predict opioid overdose risk by integrating human services and criminal justice data with health claims data to capture the social determinants of overdose risk. This prognostic study included Medicaid beneficiaries (n = 237,259) in Allegheny County, Pennsylvania, who were enrolled between 2015 and 2018 and were randomly divided into training, testing, and validation samples. We measured 290 potential predictors (239 derived from Medicaid claims data) in 30-day periods, beginning with the first observed Medicaid enrollment date during the study period. Using a gradient boosting machine, we predicted a composite outcome (i.e., fatal or nonfatal opioid overdose constructed using medical examiner and claims data) in the subsequent month. We compared prediction performance between a Medicaid claims-only model and one integrating human services and criminal justice data with Medicaid claims (i.e., integrated model) using several metrics (e.g., C-statistic, number needed to evaluate [NNE] to identify one overdose). Beneficiaries were stratified into risk-score decile subgroups. The samples (training = 79,087, testing = 79,086, validation = 79,086) had similar characteristics (age = 38±18 years, female = 56%, white = 48%, having at least one overdose = 1.7% during study period). In the validation sample, the integrated model slightly improved on the Medicaid claims-only model (C-statistic = 0.885; 95%CI = 0.877–0.892 vs. C-statistic = 0.871; 95%CI = 0.863–0.878), with small corresponding improvements in the NNE and positive predictive value. Nine of the top 30 most important predictors in the integrated model were human services and criminal justice variables.
Using the integrated model, approximately 70% of individuals with overdoses were members of the top risk decile (overdose rates in the subsequent month = 47/10,000 beneficiaries). Few individuals in the bottom 9 deciles had overdose episodes (0–12/10,000). Machine-learning algorithms integrating claims with social service and criminal justice data modestly improved opioid overdose prediction among Medicaid beneficiaries in a large U.S. county heavily affected by the opioid crisis.
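
The evaluation metrics named in the abstract (C-statistic, NNE, and overdose rates by risk-score decile) can be illustrated with a minimal Python sketch. This is not the authors' code: the function names and the toy scores and labels are hypothetical, and the sketch assumes the usual definitions, namely that the C-statistic is the probability that a randomly chosen case receives a higher score than a randomly chosen non-case, and that NNE is the reciprocal of the positive predictive value at a chosen threshold.

```python
from typing import List, Sequence


def c_statistic(scores: Sequence[float], labels: Sequence[int]) -> float:
    """Share of (case, non-case) pairs in which the case has the higher score.

    Ties count as half a win. The pairwise loop is quadratic, which is fine
    for a toy example but not for a cohort of hundreds of thousands.
    """
    cases = [s for s, y in zip(scores, labels) if y == 1]
    controls = [s for s, y in zip(scores, labels) if y == 0]
    if not cases or not controls:
        raise ValueError("need at least one case and one non-case")
    wins = 0.0
    for c in cases:
        for k in controls:
            if c > k:
                wins += 1.0
            elif c == k:
                wins += 0.5
    return wins / (len(cases) * len(controls))


def number_needed_to_evaluate(
    scores: Sequence[float], labels: Sequence[int], threshold: float
) -> float:
    """People flagged at or above the threshold per true case found (1 / PPV)."""
    flagged = [y for s, y in zip(scores, labels) if s >= threshold]
    true_positives = sum(flagged)
    if true_positives == 0:
        return float("inf")
    return len(flagged) / true_positives


def decile_rates(
    scores: Sequence[float], labels: Sequence[int], per: int = 10_000
) -> List[float]:
    """Event rate per `per` people in each score decile, highest risk first."""
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    n = len(ranked)
    rates = []
    for d in range(10):
        chunk = ranked[d * n // 10:(d + 1) * n // 10]
        events = sum(y for _, y in chunk)
        rates.append(per * events / len(chunk) if chunk else 0.0)
    return rates


# Hypothetical toy data: 20 people, two of whom have the outcome.
toy_scores = [i / 20 for i in range(20)]
toy_labels = [0] * 17 + [1, 0, 1]

print(round(c_statistic(toy_scores, toy_labels), 3))
print(number_needed_to_evaluate(toy_scores, toy_labels, threshold=0.79))
print(decile_rates(toy_scores, toy_labels)[:3])
```

In practice the scores would come from a fitted model such as the gradient boosting machine described above, and a library routine would replace the quadratic pairwise loop.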