MicroRNA prediction based on 3D graphical representation of RNA secondary structures


TURKISH JOURNAL OF BIOLOGY, vol.43, no.4, pp.274-286, 2019 (SCI-Expanded)

  • Publication Type: Article
  • Volume: 43 Issue: 4
  • Publication Date: 2019
  • Doi Number: 10.3906/biy-1904-59
  • Journal Indexes: Science Citation Index Expanded (SCI-EXPANDED), Scopus, TR DİZİN (ULAKBİM)
  • Page Numbers: pp.274-286
  • Keywords: MicroRNA, RNA structure, machine learning, random forest, decision tree, naive Bayes, CLASSIFICATION, REAL
  • Abdullah Gül University Affiliated: Yes


MicroRNAs (miRNAs) are posttranscriptional regulators of gene expression. A single miRNA can target hundreds of messenger RNAs (mRNAs), an mRNA can be targeted by several different miRNAs, and a single miRNA may have multiple binding sites within one mRNA sequence. Investigating miRNAs experimentally is therefore quite involved, and machine learning (ML) is frequently used to overcome such challenges. The success of an ML analysis depends largely on the quality of the input data and on the capacity of the features describing the data. Previously, more than 1000 features were suggested for miRNAs. Here, it is shown that using 36 features representing the RNA secondary structure and its dynamic 3D graphical representation provides accuracies of up to 98%. In this study, a new approach for ML-based miRNA prediction is proposed. Thousands of models are generated through classification of known human miRNAs and pseudohairpins with 3 classifiers: decision tree, naive Bayes, and random forest. Although the method is based on human data, the best model correctly assigned 96% of nonhuman hairpins from MirGeneDB, suggesting that this approach might be useful for the analysis of miRNAs from other species.
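To make the idea of secondary-structure features concrete, the following minimal Python sketch derives a few simple descriptors from a hairpin given in dot-bracket notation. It is an illustration only: the abstract does not list the 36 features or define the 3D graphical representation, so the function names (`pair_table`, `longest_stem`, `structure_features`) and the chosen descriptors are assumptions and not the authors' feature set.

```python
import re


def pair_table(structure):
    """Map every paired position to its partner in a dot-bracket string."""
    stack, pairs = [], {}
    for i, ch in enumerate(structure):
        if ch == "(":
            stack.append(i)
        elif ch == ")":
            if not stack:
                raise ValueError(f"unmatched ')' at position {i}")
            j = stack.pop()
            pairs[i], pairs[j] = j, i
        elif ch != ".":
            raise ValueError(f"unexpected character {ch!r} at position {i}")
    if stack:
        raise ValueError(f"unmatched '(' at position {stack[-1]}")
    return pairs


def longest_stem(pairs):
    """Length of the longest run of directly stacked base pairs."""
    best = run = 0
    # Visit opening positions from 5' to 3'.
    for i in sorted(k for k, v in pairs.items() if k < v):
        # Pair (i, j) stacks on (i - 1, j + 1) when both neighbours are paired together.
        if pairs.get(i - 1) == pairs[i] + 1:
            run += 1
        else:
            run = 1
        best = max(best, run)
    return best


def structure_features(structure):
    """Illustrative (hypothetical) descriptors of a secondary structure."""
    pairs = pair_table(structure)
    n_pairs = len(pairs) // 2
    return {
        "length": len(structure),
        "n_pairs": n_pairs,
        "paired_fraction": 2 * n_pairs / len(structure) if structure else 0.0,
        "longest_stem": longest_stem(pairs),
        # A hairpin loop is an opening bracket closed with only unpaired bases between.
        "hairpin_loops": len(re.findall(r"\(\.*\)", structure)),
    }


if __name__ == "__main__":
    print(structure_features("((((...))))..")) 
```

In a workflow like the one described above, a vector of such descriptors per hairpin would be passed to a classifier such as a decision tree, naive Bayes, or random forest; the descriptors here are placeholders for whatever feature set is actually used.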