Machine Learning based Early Prediction of Type 2 Diabetes: A New Hybrid Feature Selection Approach using Correlation Matrix with Heatmap and SFS

Buyrukoğlu S., Akbaş A.

Balkan Journal of Electrical and Computer Engineering, vol.10, no.2, pp.110-117, 2022 (Peer-Reviewed Journal)


This study introduces and explains in detail a new hybrid machine learning method for predicting type 2 diabetes, and compares its outcomes with those of similar studies. Early prediction of diabetes is crucial for taking the necessary measures (e.g., changing eating habits and controlling patient weight), deferring the onset of diabetes, reducing the death rate to some extent, and easing medical care professionals’ decision-making in preventing and managing diabetes mellitus. The purpose of this study is to create a new hybrid feature selection approach that combines a Correlation Matrix with Heatmap and Sequential Forward Selection (SFS) to reveal the most effective features for detecting diabetes. A diabetes data set with 520 instances and seven features was studied using the proposed hybrid feature selection approach. The selected optimal features were evaluated with Support Vector Machine (SVM), Random Forest (RF), and Artificial Neural Network (ANN) classifiers. Across the five evaluation metrics, ANN showed the best performance: Accuracy (99.1%), F-measure (99.1%), Precision (99.3%), Recall (99.1%), and AUC (99.2%). The proposed hybrid feature selection model thus performed better with ANN than with the other machine learning algorithms.
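The hybrid idea described in the abstract can be illustrated with a minimal sketch. The abstract does not state how the two stages are combined, so this sketch assumes one plausible reading: a correlation filter first drops features that are nearly redundant with an already kept feature, and sequential forward selection then greedily adds the feature that most improves validation accuracy. Everything here is hypothetical and is not the authors' implementation: the data are synthetic, the 0.9 correlation threshold is an arbitrary choice, and a simple nearest-centroid classifier stands in for the SVM, RF, and ANN models used in the paper.

```python
import random
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy) if sx and sy else 0.0


def correlation_filter(X, threshold=0.9):
    """Keep a feature only if |r| with every already kept feature is <= threshold."""
    cols = list(zip(*X))  # column-wise view of the row-major data
    kept = []
    for j in range(len(cols)):
        if all(abs(pearson(cols[j], cols[k])) <= threshold for k in kept):
            kept.append(j)
    return kept


def centroid_accuracy(X_train, y_train, X_val, y_val, feats):
    """Validation accuracy of a nearest-centroid classifier on the given features."""
    centroids = {}
    for label in set(y_train):
        rows = [x for x, t in zip(X_train, y_train) if t == label]
        centroids[label] = [sum(r[f] for r in rows) / len(rows) for f in feats]
    correct = 0
    for x, t in zip(X_val, y_val):
        pred = min(
            centroids,
            key=lambda c: sum((x[f] - m) ** 2 for f, m in zip(feats, centroids[c])),
        )
        correct += pred == t
    return correct / len(y_val)


def sequential_forward_selection(X_train, y_train, X_val, y_val, candidates, k):
    """Greedily add the candidate feature that gives the best validation accuracy."""
    selected = []
    while len(selected) < min(k, len(candidates)):
        remaining = [f for f in candidates if f not in selected]
        best = max(
            remaining,
            key=lambda f: centroid_accuracy(
                X_train, y_train, X_val, y_val, selected + [f]
            ),
        )
        selected.append(best)
    return selected


# Synthetic data (an assumption, not the paper's data set):
# feature 1 is a near copy of feature 0, feature 3 is pure noise,
# and the label depends only on features 0 and 2.
random.seed(0)
X, y = [], []
for _ in range(400):
    f0, f2, f3 = random.gauss(0, 1), random.gauss(0, 1), random.gauss(0, 1)
    f1 = f0 + random.gauss(0, 0.01)
    X.append([f0, f1, f2, f3])
    y.append(1 if f0 + f2 > 0 else 0)

X_train, y_train, X_val, y_val = X[:300], y[:300], X[300:], y[300:]

kept = correlation_filter(X_train, threshold=0.9)
selected = sequential_forward_selection(X_train, y_train, X_val, y_val, kept, k=2)
print("kept after correlation filter:", kept)
print("selected by SFS:", selected)
```

In practice the same two stages could be built from library components, for example a pandas correlation matrix followed by scikit-learn's `SequentialFeatureSelector` with `direction="forward"`; the pure-Python version above is only meant to make the selection logic explicit.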