Published in

Springer Nature [academic journals on nature.com], Translational Psychiatry, 1(10), 2020

DOI: 10.1038/s41398-020-00962-8

A machine-learning framework for robust and reliable prediction of short- and long-term treatment response in initially antipsychotic-naïve schizophrenia patients based on multimodal neuropsychiatric data

This paper is made freely available by the publisher.

Preprint: archiving allowed
Postprint: archiving forbidden
Published version: archiving allowed
Data provided by SHERPA/RoMEO

Abstract

The reproducibility of machine-learning analyses in computational psychiatry is a growing concern. In a multimodal neuropsychiatric dataset of antipsychotic-naïve, first-episode schizophrenia patients, we discuss a workflow aimed at reducing bias and overfitting by invoking simulated data in the design process and analysis in two independent machine-learning approaches, one based on a single algorithm and the other incorporating an ensemble of algorithms. We aimed to (1) classify patients from controls to establish the framework, (2) predict short- and long-term treatment response, and (3) validate the methodological framework. We included 138 antipsychotic-naïve, first-episode schizophrenia patients with data on psychopathology, cognition, electrophysiology, and structural magnetic resonance imaging (MRI). Perinatal data and long-term outcome measures were obtained from Danish registers. Short-term treatment response was defined as change in Positive and Negative Syndrome Scale (PANSS) score after the initial antipsychotic treatment period. Baseline diagnostic classification algorithms also included data from 151 matched controls. Both approaches significantly classified patients from healthy controls with a balanced accuracy of 63.8% and 64.2%, respectively. Post-hoc analyses showed that the classification was driven primarily by the cognitive data. Neither approach predicted short- or long-term treatment response. Validation of the framework showed that choice of algorithm and parameter settings in the real data was successfully guided by results from the simulated data. In conclusion, this novel approach holds promise as an important step to minimize bias and obtain reliable results with modest sample sizes when independent replication samples are not available.