MicroRNA prediction based on 3D graphical representation of RNA secondary structures


SAÇAR DEMİRCİ M. D.

TURKISH JOURNAL OF BIOLOGY, vol.43, no.4, pp.274-286, 2019 (SCI-Expanded)

  • Publication Type: Article / Full Article
  • Volume: 43 Issue: 4
  • Publication Date: 2019
  • DOI: 10.3906/biy-1904-59
  • Journal Name: TURKISH JOURNAL OF BIOLOGY
  • Journal Indexes: Science Citation Index Expanded (SCI-EXPANDED), Scopus, TR DİZİN (ULAKBİM)
  • Page Numbers: pp.274-286
  • Keywords: MicroRNA, RNA structure, machine learning, random forest, decision tree, naive Bayes, CLASSIFICATION, REAL
  • Abdullah Gül University Affiliated: Yes

Abstract

MicroRNAs (miRNAs) are posttranscriptional regulators of gene expression. A single miRNA can target hundreds of messenger RNAs (mRNAs), an mRNA can be targeted by several different miRNAs, and a single miRNA may have multiple binding sites within one mRNA sequence. Investigating miRNAs experimentally is therefore quite involved, and machine learning (ML) is frequently used to overcome such challenges. The outcome of an ML analysis largely depends on the quality of the input data and on how well the features describe the data. Previously, more than 1000 features were suggested for miRNAs. Here, it is shown that using 36 features representing the RNA secondary structure and its dynamic 3D graphical representation yields accuracies of up to 98%. In this study, a new approach for ML-based miRNA prediction is proposed. Thousands of models are generated through classification of known human miRNAs and pseudohairpins with 3 classifiers: decision tree, naive Bayes, and random forest. Although the method is based on human data, the best model was able to correctly assign 96% of nonhuman hairpins from MirGeneDB, suggesting that this approach might be useful for the analysis of miRNAs from other species.
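
To make the idea of structure-derived features concrete, the following is a minimal sketch of how a hairpin could be turned into a small numeric feature set. It is not the paper's method: the abstract does not list the 36 features or define the 3D graphical representation, so the function name `structure_features`, the dot-bracket input format, and the five features chosen here are all assumptions made purely for illustration.

```python
from collections import Counter


def structure_features(sequence: str, dot_bracket: str) -> dict:
    """Compute a few illustrative hairpin features.

    Hypothetical example features only; these are NOT the 36 features
    used in the paper, which the abstract does not enumerate.
    """
    if len(sequence) != len(dot_bracket):
        raise ValueError("sequence and structure must have equal length")

    # Match opening and closing brackets with a stack to recover base pairs.
    stack = []
    pairs = []
    for i, symbol in enumerate(dot_bracket):
        if symbol == "(":
            stack.append(i)
        elif symbol == ")":
            if not stack:
                raise ValueError("unbalanced structure: extra ')'")
            pairs.append((stack.pop(), i))
        elif symbol != ".":
            raise ValueError(f"unexpected symbol {symbol!r}")
    if stack:
        raise ValueError("unbalanced structure: extra '('")

    # Count pair types independent of orientation (e.g. G-C and C-G both map to "CG").
    pair_types = Counter(
        "".join(sorted(sequence[i] + sequence[j])) for i, j in pairs
    )

    n = len(sequence)
    return {
        "length": n,
        "paired_fraction": 2 * len(pairs) / n if n else 0.0,
        "gc_pairs": pair_types["CG"],
        "au_pairs": pair_types["AU"],
        "gu_pairs": pair_types["GU"],
    }


# Toy hairpin (made-up sequence, not taken from any database).
features = structure_features("GCAUAAAAUGU", "((((...))))")
print(features)
```

A dictionary like this could be flattened into one row of a feature matrix and passed to any standard classifier, such as a decision tree, naive Bayes, or random forest implementation; which features actually matter for miRNA prediction is exactly what the study investigates.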