Sample Size Analysis for Machine Learning Clinical Validation Studies

Goldenholz, Daniel M.; Sun, Haoqi; Westover, M. Brandon; Ganglberger, Wolfgang

Published in

MDPI, Biomedicines, 11(3), p. 685, 2023

DOI: 10.3390/biomedicines11030685

This paper is made freely available by the publisher.

Archiving allowed: preprint, postprint, published version.

Abstract

Background: Before new machine learning (ML) algorithms are integrated into clinical practice, they must undergo validation. Validation studies require sample size estimates. Unlike hypothesis-testing studies, which seek a p-value, studies that validate predictive models aim to obtain estimates of model performance. There is no standard tool for estimating the sample size of clinical validation studies for ML models. Methods: Our open-source method, Sample Size Analysis for Machine Learning (SSAML), is described and tested on three previously published models: brain age to predict mortality (Cox proportional hazards), COVID hospitalization risk prediction (ordinal regression), and seizure risk forecasting (deep learning). Results: Minimum sample sizes were obtained for each dataset using standardized criteria. Discussion: SSAML provides a formal expectation of precision and accuracy at a desired confidence level. SSAML is open-source and agnostic to data type and ML model. It can be used for clinical validation studies of ML models.
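
To make the idea of a precision-driven sample size concrete, the sketch below resamples a pilot dataset at several candidate sizes and reports the smallest size at which a bootstrap percentile interval for a performance metric is narrow enough. It is an illustration only and not the SSAML implementation: the metric (classification accuracy), the synthetic pilot data, the width threshold, and the function names `accuracy`, `ci_width`, and `minimum_sample_size` are all assumptions introduced here.

```python
import random


def accuracy(pairs):
    """Fraction of (label, prediction) pairs that agree."""
    return sum(1 for label, pred in pairs if label == pred) / len(pairs)


def ci_width(pilot, n, n_boot=500, level=0.95, rng=None):
    """Width of a bootstrap percentile interval for accuracy at sample size n.

    Each bootstrap replicate draws n pairs from the pilot data with replacement.
    """
    if rng is None:
        rng = random.Random(0)  # fixed seed so repeated calls are reproducible
    stats = sorted(accuracy(rng.choices(pilot, k=n)) for _ in range(n_boot))
    lo_idx = int((1 - level) / 2 * (n_boot - 1))
    hi_idx = int((1 + level) / 2 * (n_boot - 1))
    return stats[hi_idx] - stats[lo_idx]


def minimum_sample_size(pilot, candidates, max_width, **kwargs):
    """Smallest candidate n whose interval width is <= max_width, else None."""
    for n in sorted(candidates):
        if ci_width(pilot, n, **kwargs) <= max_width:
            return n
    return None


# Synthetic pilot data (an assumption): a classifier that is right about 80%
# of the time on a balanced binary outcome.
data_rng = random.Random(42)
pilot = []
for _ in range(400):
    label = data_rng.randint(0, 1)
    pred = label if data_rng.random() < 0.8 else 1 - label
    pilot.append((label, pred))

# Hypothetical precision target: 95% interval no wider than 0.10.
n_min = minimum_sample_size(pilot, [25, 50, 100, 200, 400, 800], max_width=0.10)
print("smallest candidate size meeting the width target:", n_min)
```

The same loop could be run with any metric in place of `accuracy` (for example a concordance index or calibration slope), which reflects the abstract's point that the approach does not depend on data type or model. The specific criteria SSAML applies are defined in the paper, not here.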
