Published in

Public Library of Science, PLoS ONE, 3(18), p. e0282235, 2023

DOI: 10.1371/journal.pone.0282235

Applied machine learning to identify differential risk groups underlying externalizing and internalizing problem behaviors trajectories: A case study using a cohort of Asian American children

Journal article published in 2023 by Samrachana Adhikari, Shiying You, Alan Chen, Sabrina Cheng, Keng-Yen Huang
This paper is made freely available by the publisher.

Preprint: archiving allowed
Postprint: archiving allowed
Published version: archiving allowed
Data provided by SHERPA/RoMEO

Abstract

Background: Internalizing and externalizing problems account for over 75% of the mental health burden in children and adolescents in the US, with a higher burden among minority children. While complex interactions of multilevel factors are associated with these outcomes and may enable early identification of children at higher risk, prior research has been limited by data and by the application of traditional analysis methods. In this case example focused on Asian American children, we address the gap by applying data-driven statistical and machine learning methods to study clusters of mental health trajectories among children, investigate optimal prediction of children in the high-risk cluster, and identify key early predictors.

Methods: Data from the US Early Childhood Longitudinal Study 2010–2011 were used. Multilevel information provided by children, families, teachers, schools, and care providers was considered as predictors. An unsupervised machine learning algorithm was applied to identify groups of internalizing and externalizing problem trajectories. To predict membership in the high-risk group, an ensemble algorithm, Superlearner, was implemented by combining several supervised machine learning algorithms. The performance of Superlearner and of the candidate algorithms, including logistic regression, was assessed using discrimination and calibration metrics via cross-validation. Variable importance measures along with partial dependence plots were used to rank and visualize key predictors.

Findings: We found two clusters, suggesting high- and low-risk groups, for both externalizing and internalizing problem trajectories. While Superlearner had the best overall discrimination performance, logistic regression had comparable performance for externalizing problems but worse performance for internalizing problems. Predictions from logistic regression were not as well calibrated as those from Superlearner; however, they were still better calibrated than those from a few candidate algorithms. The important predictors identified were a combination of test scores, child factors, teacher-rated scores, and contextual factors, which showed non-linear associations with predicted probabilities.

Conclusions: We demonstrated the application of a data-driven analytical approach to predict mental health outcomes among Asian American children. Findings from the cluster analysis can inform the critical age for early intervention, while the prediction analysis has the potential to inform decisions on prioritizing intervention programming. However, to better understand the external validity, replicability, and value of machine learning in broader mental health research, more studies applying similar analytical approaches are needed.
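
The workflow described in the Methods (cluster the trajectories, predict membership in the high-risk cluster with an ensemble, then compare discrimination and calibration under cross-validation) can be illustrated with a minimal sketch. This is not the authors' code. It rests on several assumptions: the data are synthetic rather than drawn from the Early Childhood Longitudinal Study, k-means stands in for the unspecified unsupervised algorithm, and scikit-learn's `StackingClassifier` stands in for Superlearner, which it resembles but does not replicate. All variable names are hypothetical.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.ensemble import RandomForestClassifier, StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import brier_score_loss, roc_auc_score
from sklearn.model_selection import StratifiedKFold, cross_val_predict

rng = np.random.default_rng(0)
n_children, n_waves, n_predictors = 400, 5, 6

# Synthetic baseline predictors (stand-ins for test scores, teacher ratings, etc.).
X = rng.normal(size=(n_children, n_predictors))

# Synthetic problem-behavior trajectories: children with high latent risk
# have scores that rise across waves; the risk depends non-linearly on X.
latent_risk = (
    X[:, 0] + 0.5 * X[:, 1] ** 2 + rng.normal(scale=0.5, size=n_children)
) > 0.8
slopes = np.where(latent_risk, 0.6, 0.0)
trajectories = slopes[:, None] * np.arange(n_waves)[None, :] + rng.normal(
    scale=0.4, size=(n_children, n_waves)
)

# Step 1: unsupervised clustering of trajectories into two groups
# (k-means is an assumed stand-in for the paper's algorithm).
km = KMeans(n_clusters=2, n_init=10, random_state=0).fit(trajectories)
# Call the cluster with the higher final-wave centroid the high-risk group.
high_cluster = int(np.argmax(km.cluster_centers_[:, -1]))
y = (km.labels_ == high_cluster).astype(int)

# Step 2: candidate learners and a stacked ensemble that combines them
# (an assumed stand-in for Superlearner).
stack = StackingClassifier(
    estimators=[
        ("logit", LogisticRegression(max_iter=1000)),
        ("rf", RandomForestClassifier(n_estimators=100, random_state=0)),
    ],
    final_estimator=LogisticRegression(max_iter=1000),
    cv=5,
)
candidates = {"logistic": LogisticRegression(max_iter=1000), "stack": stack}

# Step 3: cross-validated discrimination (AUC) and calibration (Brier score).
outer_cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
results = {}
for name, model in candidates.items():
    proba = cross_val_predict(model, X, y, cv=outer_cv, method="predict_proba")[:, 1]
    results[name] = {
        "auc": roc_auc_score(y, proba),
        "brier": brier_score_loss(y, proba),
    }
    print(f"{name}: AUC={results[name]['auc']:.3f}, Brier={results[name]['brier']:.3f}")
```

A higher AUC indicates better discrimination and a lower Brier score indicates better-calibrated probabilities. The numbers produced here describe only the synthetic data and say nothing about the study's results.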