Published in

Taylor & Francis, 2022

DOI: 10.6084/m9.figshare.19537817

Taylor and Francis Group, The American Journal of Drug and Alcohol Abuse, 3(48), p. 260-271, 2022

DOI: 10.1080/00952990.2021.1995739

Practical foundations of machine learning for addiction research. Part I. Methods and techniques

This paper is made freely available by the publisher.

Preprint: archiving allowed
Postprint: archiving allowed
Published version: archiving forbidden
Data provided by SHERPA/RoMEO

Abstract

Machine learning assembles a broad set of methods and techniques to solve a wide range of problems, such as identifying individuals with substance use disorders (SUD), finding patterns in neuroimages, understanding SUD prognostic factors and their associations, or determining the genetic underpinnings of addiction. However, the addiction research field underuses machine learning. This two-part narrative review focuses on machine learning tools and concepts, providing an introductory insight into their capabilities to facilitate their understanding and acquisition by addiction researchers. This first part presents supervised and unsupervised methods such as linear models, naive Bayes, support vector machines, artificial neural networks, and k-means. We illustrate each technique with examples of its use in current addiction research. We also present some open-source programming tools and methodological good practices that facilitate using these techniques. Throughout this work, we emphasize a continuum between applied statistics and machine learning, show their commonalities, and provide sources for further reading to deepen the understanding of these methods. This two-part review is a primer for the next generation of addiction researchers incorporating machine learning in their projects. Researchers will find a bridge between applied statistics and machine learning, ways to expand their analytical toolkit, recommendations to incorporate well-established good practices in addiction data analysis (e.g., stating the rationale for using newer analytical tools, calculating sample size, improving reproducibility), and the vocabulary to enhance collaboration between researchers who do not conduct data analyses and those who do.