Abstract

Aims
The objective of this research is to develop an effective cardiovascular disease prediction framework using machine learning techniques and to achieve high prediction accuracy.

Methods
We used machine learning algorithms to predict cardiovascular disease from patient attributes such as chest pain, age and blood pressure. The study incorporated five distinct datasets obtained from online sources: Heart UCI, Stroke, Heart Statlog, Framingham and Coronary Heart. The framework was implemented with the RapidMiner tool. It follows a three‐step approach: pre‐processing the dataset, applying a feature selection method to the pre‐processed dataset, and then applying classification methods to produce predictions. Missing values were replaced with the mean, and class imbalance was handled using sample bootstrapping. Of the machine learning classifiers applied, random forest with AdaBoost, evaluated using 10‐fold cross‐validation, provided the highest accuracy.

Results
The proposed model achieves its highest accuracies of 99.48% on Heart Statlog, 93.90% on Heart UCI, 96.25% on the Stroke dataset, 86% on the Framingham dataset and 78.36% on the Coronary heart disease dataset.

Conclusions
The results show the potential of the proposed framework. By handling class imbalance and missing values, an accurate framework has been established that could contribute to the prediction of cardiovascular disease at early stages.
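The pre‐processing and evaluation steps named in the Methods can be illustrated with a minimal Python sketch. This is not the authors' implementation: the study used RapidMiner, and the abstract does not give operator settings. The function names (`impute_mean`, `bootstrap_balance`, `k_fold_indices`), the toy records, and the choice to resample every class with replacement up to the majority-class size are all assumptions made for illustration. Feature selection and the random forest with AdaBoost classifier are omitted.

```python
import random
from collections import defaultdict


def impute_mean(rows):
    """Replace None entries with the mean of the observed values in that column.

    Assumes every column has at least one observed value.
    """
    n_cols = len(rows[0])
    means = []
    for j in range(n_cols):
        observed = [r[j] for r in rows if r[j] is not None]
        means.append(sum(observed) / len(observed))
    return [[means[j] if v is None else v for j, v in enumerate(r)] for r in rows]


def bootstrap_balance(rows, labels, seed=0):
    """Draw each class with replacement until it matches the majority-class size.

    Assumption: this is one plausible reading of "sample bootstrapping".
    """
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for row, label in zip(rows, labels):
        by_class[label].append(row)
    target = max(len(members) for members in by_class.values())
    out_rows, out_labels = [], []
    for label, members in by_class.items():
        for _ in range(target):
            out_rows.append(rng.choice(members))
            out_labels.append(label)
    return out_rows, out_labels


def k_fold_indices(n, k=10, seed=0):
    """Yield (train_indices, test_indices) pairs for k-fold cross-validation."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    for i in range(k):
        train = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train, folds[i]


# Toy records (age, blood pressure); values are made up, None marks a missing entry.
rows = [[63, 145], [37, None], [41, 130], [56, 120], [None, 140], [57, 120]]
labels = [1, 0, 0, 0, 0, 0]

imputed = impute_mean(rows)
bal_rows, bal_labels = bootstrap_balance(imputed, labels)
splits = list(k_fold_indices(len(bal_rows), k=10))
```

Each `(train, test)` pair in `splits` would be used to fit a classifier on the training indices and score it on the held-out indices. Resampling is commonly applied only within the training folds so that duplicated records do not appear on both sides of a split; the abstract does not state which order the authors used.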