Topological feature generation for link prediction in biological networks


Temiz M., GÜNGÖR B., Güner Şahan P., COŞKUN M.

PeerJ, cilt.11, 2023 (SCI-Expanded) identifier identifier

  • Yayın Türü: Makale / Tam Makale
  • Cilt numarası: 11
  • Basım Tarihi: 2023
  • Doi Numarası: 10.7717/peerj.15313
  • Dergi Adı: PeerJ
  • Derginin Tarandığı İndeksler: Science Citation Index Expanded (SCI-EXPANDED), Scopus, BIOSIS, MEDLINE, Veterinary Science Database, Directory of Open Access Journals
  • Anahtar Kelimeler: Feature generation, Graph embedding, Link prediction, Machine learning, Protein-protein interaction
  • Abdullah Gül Üniversitesi Adresli: Evet

Özet

Graph or network embedding is a powerful method for extracting missing or potential information from interactions between nodes in biological networks. Graph embedding methods learn representations of nodes and interactions in a graph with low-dimensional vectors, which facilitates research to predict potential interactions in networks. However, most graph embedding methods suffer from high computational costs in the form of high computational complexity of the embedding methods and learning times of the classifier, as well as the high dimensionality of complex biological networks. To address these challenges, in this study, we use the Chopper algorithm as an alternative approach to graph embedding, which accelerates the iterative processes and thus reduces the running time of the iterative algorithms for three different (nervous system, blood, heart) undirected protein-protein interaction (PPI) networks. Due to the high dimensionality of the matrix obtained after the embedding process, the data are transformed into a smaller representation by applying feature regularization techniques. We evaluated the performance of the proposed method by comparing it with state-of-the-art methods. Extensive experiments demonstrate that the proposed approach reduces the learning time of the classifier and performs better in link prediction. We have also shown that the proposed embedding method is faster than state-of-the-art methods on three different PPI datasets.