Diabetes mellitus (DM) is one of the most prevalent diseases in the world and is associated with high mortality. One of its major complications is diabetic foot, which can lead to plantar ulcers, amputation, and death. Several studies report that a thermogram helps to detect changes in the plantar temperature of the foot, which may lead to a higher risk of ulceration. However, in diabetic patients, the distribution of plantar temperature does not follow a standard pattern, which makes the changes difficult to quantify. The abnormal temperature distribution in infrared (IR) foot thermogram images can be used for the early detection of diabetic foot before ulceration to avoid complications. No machine learning-based technique has been reported in the literature to classify these thermograms based on the severity of diabetic foot complications. This paper takes an available labeled diabetic thermogram dataset and applies the k-means clustering technique to group the thermograms by severity risk of diabetic foot ulcers in an unsupervised manner. The resulting clusters are verified by expert medical doctors, using the plantar foot temperature, in terms of risk for the development of foot ulcers. The newly labeled dataset is then examined to determine how robustly it can be classified by machine learning networks. Classical machine learning algorithms with feature engineering and a convolutional neural network (CNN) with image-enhancement techniques are investigated to identify the best-performing network for classifying thermograms by severity. The popular VGG 19 CNN model achieves an accuracy, precision, sensitivity, F1-score, and specificity of 95.08%, 95.08%, 95.09%, 95.08%, and 97.2%, respectively, in the stratification of severity. A stacking classifier is also proposed, built from trained gradient boosting, XGBoost, and random forest classifiers and operating on features extracted from the thermograms. It provides a comparable performance of 94.47%, 94.45%, 94.47%, 94.43%, and 93.25% for accuracy, precision, sensitivity, F1-score, and specificity, respectively.