Published in

Journal of Medical Internet Research, 8(22), p. e15394, 2020

DOI: 10.2196/15394

Applying Machine Learning Models with An Ensemble Approach for Accurate Real-Time Influenza Forecasting in Taiwan: Development and Validation Study

This paper is made freely available by the publisher.

Preprint: archiving allowed
Postprint: archiving allowed
Published version: archiving allowed
Data provided by SHERPA/RoMEO

Abstract

Background: Seasonal influenza activity varies considerably in subtropical areas such as Taiwan, which complicates epidemic preparedness. The Taiwan Centers for Disease Control has maintained real-time national influenza surveillance systems since 2004. Beyond timely monitoring, epidemic forecasting using the national influenza surveillance data can provide pivotal information for public health response.

Objective: We aimed to develop predictive models using machine learning to provide real-time influenza-like illness forecasts.

Methods: Using surveillance data of influenza-like illness visits from emergency departments (from the Real-Time Outbreak and Disease Surveillance System), outpatient departments (from the National Health Insurance database), and the records of patients with severe influenza with complications (from the National Notifiable Disease Surveillance System), we developed 4 machine learning models (autoregressive integrated moving average, random forest, support vector regression, and extreme gradient boosting) to produce weekly influenza-like illness predictions for a given week and the 3 subsequent weeks. We established a framework of the machine learning models and used an ensemble approach called stacking to integrate these predictions. We trained the models using historical data from 2008-2014. We evaluated their predictive ability during 2015-2017 for each of the 4 forecast horizons using Pearson correlation, mean absolute percentage error (MAPE), and hit rate of trend prediction. A dashboard website was built to visualize the forecasts, and the results of real-world implementation of this forecasting framework in 2018 were evaluated using the same metrics.
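
The stacking step described in the Methods can be illustrated with a minimal sketch. It is not the authors' implementation: the two base forecasters below (a persistence forecast and a 4-week moving average) are simple stand-ins for the ARIMA, random forest, support vector regression, and extreme gradient boosting models, and every function name is hypothetical. The meta-learner is assumed here to be an ordinary least-squares combination of the base predictions, which is one common choice for stacking; the abstract does not say which meta-learner was used.

```python
# Minimal stacking sketch (standard library only).
# Assumption: base models and meta-learner are illustrative stand-ins,
# not the models used in the paper.

def persistence(history):
    """Base model 1: next week equals the last observed week."""
    return history[-1]

def moving_average(history, window=4):
    """Base model 2: next week equals the mean of the last `window` weeks."""
    recent = history[-window:]
    return sum(recent) / len(recent)

BASE_MODELS = [persistence, moving_average]

def base_predictions(series, start, end):
    """One-step-ahead base predictions for weeks start..end-1.

    Each prediction for week t uses only series[:t], so the rows are
    out-of-sample with respect to their targets.
    """
    rows = [[model(series[:t]) for model in BASE_MODELS] for t in range(start, end)]
    targets = [series[t] for t in range(start, end)]
    return rows, targets

def fit_meta_weights(rows, targets):
    """Fit least-squares weights (no intercept) for two base models
    by solving the 2x2 normal equations in closed form."""
    a11 = sum(r[0] * r[0] for r in rows)
    a12 = sum(r[0] * r[1] for r in rows)
    a22 = sum(r[1] * r[1] for r in rows)
    b1 = sum(r[0] * y for r, y in zip(rows, targets))
    b2 = sum(r[1] * y for r, y in zip(rows, targets))
    det = a11 * a22 - a12 * a12
    if abs(det) < 1e-12:
        # Base predictions are collinear; fall back to a plain average.
        return [0.5, 0.5]
    return [(b1 * a22 - b2 * a12) / det, (a11 * b2 - a12 * b1) / det]

def stacked_forecast(history, weights):
    """Combine the base forecasts for the next week with the fitted weights."""
    return sum(w * model(history) for w, model in zip(weights, BASE_MODELS))

if __name__ == "__main__":
    # Made-up weekly counts, for illustration only.
    weekly_visits = [10, 12, 15, 19, 24, 30, 28, 25, 21, 18, 16, 15]
    rows, targets = base_predictions(weekly_visits, 4, len(weekly_visits))
    weights = fit_meta_weights(rows, targets)
    print(weights, stacked_forecast(weekly_visits, weights))
```

The point of the design is that the meta-learner sees only predictions the base models made for weeks they had not yet observed, so the learned weights reflect forecasting skill rather than in-sample fit.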

Results: All models, especially the ensemble model, accurately predicted the timing and magnitudes of the seasonal peaks in nowcasts for the then-current week (ρ=0.802-0.965; MAPE: 5.2%-9.2%; hit rate: 0.577-0.756) and in 1-week (ρ=0.803-0.918; MAPE: 8.3%-11.8%; hit rate: 0.643-0.747), 2-week (ρ=0.783-0.867; MAPE: 10.1%-15.3%; hit rate: 0.669-0.734), and 3-week forecasts (ρ=0.676-0.801; MAPE: 12.0%-18.9%; hit rate: 0.643-0.786). In real-world implementation in 2018, forecasting performance remained accurate in nowcasts (ρ=0.875-0.969; MAPE: 5.3%-8.0%; hit rate: 0.582-0.782) and satisfactory in 3-week forecasts (ρ=0.721-0.908; MAPE: 7.6%-13.5%; hit rate: 0.596-0.904).

Conclusions: This machine learning and ensemble approach can make accurate, real-time influenza-like illness forecasts for a 4-week period and thus facilitate decision making.
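
The three metrics reported above (Pearson correlation ρ, MAPE, and hit rate) can be computed as in the following sketch. The function names are hypothetical, and the hit-rate definition is an assumption: a week counts as a hit when the predicted direction of change from the previous observed week matches the actual direction. The abstract does not give the exact definition the authors used.

```python
import math

def pearson(actual, predicted):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(actual)
    mean_a = sum(actual) / n
    mean_p = sum(predicted) / n
    cov = sum((a - mean_a) * (p - mean_p) for a, p in zip(actual, predicted))
    var_a = sum((a - mean_a) ** 2 for a in actual)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    return cov / math.sqrt(var_a * var_p)

def mape(actual, predicted):
    """Mean absolute percentage error, in percent.

    Assumes every actual value is nonzero.
    """
    total = sum(abs(a - p) / abs(a) for a, p in zip(actual, predicted))
    return 100.0 * total / len(actual)

def _sign(x):
    """Return 1, 0, or -1 according to the sign of x."""
    return (x > 0) - (x < 0)

def hit_rate(actual, predicted):
    """Share of weeks where the predicted trend direction matches the actual one.

    Assumption: the trend for week t is the sign of the change relative to
    the observed value of week t-1.
    """
    hits = 0
    for t in range(1, len(actual)):
        actual_trend = _sign(actual[t] - actual[t - 1])
        predicted_trend = _sign(predicted[t] - actual[t - 1])
        hits += actual_trend == predicted_trend
    return hits / (len(actual) - 1)

if __name__ == "__main__":
    # Made-up values, for illustration only.
    observed = [100, 120, 150, 130]
    forecast = [110, 95, 160, 125]
    print(pearson(observed, forecast), mape(observed, forecast), hit_rate(observed, forecast))
```

Reporting all three together is useful because they capture different things: ρ measures how well the forecast tracks the shape of the epidemic curve, MAPE measures the size of the error relative to the observed level, and hit rate measures whether the forecast points in the right direction.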