Ensemble feature selection and classification methods for machine learning-based coronary artery disease diagnosis

KOLUKISA B., Bakir-Gungor B.

Computer Standards and Interfaces, vol.84, 2023 (SCI-Expanded)

  • Publication Type: Article
  • Volume: 84
  • Publication Date: 2023
  • DOI Number: 10.1016/j.csi.2022.103706
  • Journal Name: Computer Standards and Interfaces
  • Journal Indexes: Science Citation Index Expanded (SCI-EXPANDED), Scopus, Academic Search Premier, Aerospace Database, Applied Science & Technology Source, Communication Abstracts, Compendex, Computer & Applied Sciences, INSPEC, Linguistic Bibliography, Metadex, Civil Engineering Abstracts
  • Keywords: Machine learning, Classification, Ensemble feature selection, Domain knowledge-based feature selection, Coronary artery disease diagnosis
  • Abdullah Gül University Affiliated: Yes


Coronary artery disease (CAD) is a condition in which the heart is not supplied with enough blood as a result of the accumulation of fatty matter. According to the World Health Organization, around 32% of all deaths worldwide are caused by CAD, and approximately 23.6 million people are estimated to die from this disease in 2030. CAD develops over time and is difficult to diagnose until a blockage or a heart attack occurs. To avoid the side effects and high costs of current diagnostic methods, researchers have proposed diagnosing CAD with computer-aided systems that analyze physical and biochemical values at a lower cost. In this study, for CAD diagnosis, (i) seven computational feature selection (FS) methods, one domain knowledge-based FS method, and different classification algorithms are evaluated; and (ii) an exhaustive ensemble FS method and a probabilistic ensemble FS method are proposed. The proposed approach is tested on three publicly available CAD data sets using six classification algorithms and four variants of voting algorithms. Performance metrics are compared across numerous combinations of classifiers and FS methods. The multi-layer perceptron classifier obtained satisfactory results on all three data sets. Performance evaluations show that the proposed approach achieved 91.78%, 85.55%, and 85.47% accuracy on the Z-Alizadeh Sani, Statlog, and Cleveland data sets, respectively.
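
The abstract does not spell out how the ensemble FS methods or the voting variants work, so the following is only a generic illustration of the two ideas it names: combining the outputs of several feature selectors, and combining classifier predictions by voting. It is a minimal sketch assuming a simple frequency-based ensemble and hard majority voting, not the exhaustive or probabilistic method proposed in the paper. The function names, feature names, and prediction values are all hypothetical.

```python
from collections import Counter


def ensemble_select(selections, min_votes):
    """Keep features chosen by at least `min_votes` of the individual selectors.

    Assumption: a plain frequency-based ensemble, used here for illustration only.
    """
    # Count each feature once per selector, even if a selector lists it twice.
    votes = Counter(f for chosen in selections for f in set(chosen))
    # Sort by descending vote count, then by name, for a deterministic order.
    ranked = sorted(votes.items(), key=lambda kv: (-kv[1], kv[0]))
    return [f for f, n in ranked if n >= min_votes]


def majority_vote(predictions):
    """Hard voting: per sample, return the label predicted by most classifiers."""
    voted = []
    for sample_preds in zip(*predictions):
        counts = Counter(sample_preds)
        top = max(counts.values())
        # Ties go to the smallest label; this tie-break is an arbitrary choice.
        voted.append(min(label for label, n in counts.items() if n == top))
    return voted


# Hypothetical outputs of three feature selectors (feature names are made up).
selections = [
    ["age", "chest_pain", "cholesterol"],
    ["age", "chest_pain", "max_heart_rate"],
    ["chest_pain", "cholesterol", "resting_bp"],
]
selected = ensemble_select(selections, min_votes=2)

# Hypothetical 0/1 predictions of three classifiers on four samples.
predictions = [
    [1, 0, 1, 1],
    [1, 1, 0, 1],
    [0, 0, 1, 1],
]
final = majority_vote(predictions)
```

Raising `min_votes` keeps only features on which more selectors agree, which trades a smaller feature set against the risk of discarding a feature that only one method found informative.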