The Effect of Different Classifiers on Recursive Cluster Elimination in the Analysis of Transcriptomic Data


Bulut N., Bakir-Gungor B., Qaqish B. F., Yousef M.

2023 Innovations in Intelligent Systems and Applications Conference, ASYU 2023, Sivas, Türkiye, 11-13 October 2023

  • Publication Type: Conference Paper / Full-Text Paper
  • DOI Number: 10.1109/asyu58738.2023.10296645
  • City of Publication: Sivas
  • Country of Publication: Türkiye
  • Keywords: Clustering, Feature Selection, Gene Expression Data Analysis, Recursive Cluster Elimination
  • Affiliated with Abdullah Gül University: Yes

Abstract

Gene expression data with a limited sample size and a large number of genes are frequently encountered in genetic studies. In such high-dimensional data, identifying the genes that distinguish between disease states is a challenging task. Feature selection (FS) is a useful approach for dealing with high dimensionality. Support Vector Machines Recursive Cluster Elimination (SVM-RCE) is an FS technique for high-dimensional data. The SVM-RCE approach has been used to identify clusters of genes whose expression levels correlate with pathological state. A key step in SVM-RCE is the use of an SVM classifier to assign an area under the curve (AUC) score to each gene cluster based on its ability to predict class labels. In this study, we investigate the use of alternative classifiers in the cluster-scoring step. Specifically, we compare Support Vector Machines, Random Forest, XGBoost, Naive Bayes, and linear logistic regression. In addition to comparing AUC scores, we compare the algorithms in terms of the number of selected genes at different levels of clustering and in terms of running time.
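The cluster-scoring step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes that each cluster is scored by the mean held-out AUC of a classifier trained only on that cluster's genes, over a few random train/test splits. The names `score_cluster`, `make_clf`, and `CentroidScorer` are hypothetical, the data are synthetic, and a toy nearest-centroid scorer stands in for the classifiers compared in the study (SVM, Random Forest, XGBoost, Naive Bayes, logistic regression).

```python
import random
from statistics import mean


def auc(scores, labels):
    """AUC via the rank-sum formulation: the fraction of (positive, negative)
    pairs in which the positive sample scores higher (ties count as 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


class CentroidScorer:
    """Toy stand-in classifier (an assumption, not one of the paper's models).
    Higher decision values mean 'closer to the class-1 centroid'."""

    def fit(self, X, y):
        self.c0 = [mean(col) for col in zip(*[x for x, t in zip(X, y) if t == 0])]
        self.c1 = [mean(col) for col in zip(*[x for x, t in zip(X, y) if t == 1])]
        return self

    def decision_function(self, X):
        def d2(a, b):
            return sum((u - v) ** 2 for u, v in zip(a, b))
        return [d2(x, self.c0) - d2(x, self.c1) for x in X]


def score_cluster(X, y, gene_idx, make_clf, n_splits=5, seed=0):
    """Mean held-out AUC of a classifier trained only on the genes in gene_idx.
    make_clf is any zero-argument factory returning an object with
    fit(X, y) and decision_function(X) (a hypothetical interface)."""
    rng = random.Random(seed)
    Xc = [[row[j] for j in gene_idx] for row in X]
    n, aucs = len(y), []
    for _ in range(n_splits):
        idx = list(range(n))
        rng.shuffle(idx)
        cut = int(0.7 * n)
        train, test = idx[:cut], idx[cut:]
        # Skip degenerate splits that lack one of the two classes.
        if len({y[i] for i in train}) < 2 or len({y[i] for i in test}) < 2:
            continue
        clf = make_clf().fit([Xc[i] for i in train], [y[i] for i in train])
        s = clf.decision_function([Xc[i] for i in test])
        aucs.append(auc(s, [y[i] for i in test]))
    return mean(aucs) if aucs else 0.5


# Synthetic data: 40 samples, 6 genes. Genes 0-2 shift with the class label
# (an informative cluster); genes 3-5 are pure noise.
data_rng = random.Random(42)
y = [0] * 20 + [1] * 20
X = [[data_rng.gauss(3.0 * label, 1.0) for _ in range(3)]
     + [data_rng.gauss(0.0, 1.0) for _ in range(3)] for label in y]

clusters = {"informative": [0, 1, 2], "noise": [3, 4, 5]}
scores = {name: score_cluster(X, y, genes, CentroidScorer)
          for name, genes in clusters.items()}
print(scores)
```

Under these assumptions, comparing classifiers amounts to passing a different factory as `make_clf` while the clustering and elimination logic stays unchanged. Clusters with the lowest scores would then be the candidates for removal at each recursion level.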