Integrated querying and version control of context-specific biological networks


Cowman T., Coskun M., Grama A., Koyuturk M.

DATABASE-THE JOURNAL OF BIOLOGICAL DATABASES AND CURATION, cilt.2020, 2020 (SCI-Expanded) identifier identifier identifier

  • Yayın Türü: Makale / Tam Makale
  • Cilt numarası: 2020
  • Basım Tarihi: 2020
  • Doi Numarası: 10.1093/databa/baaa018
  • Dergi Adı: DATABASE-THE JOURNAL OF BIOLOGICAL DATABASES AND CURATION
  • Derginin Tarandığı İndeksler: Science Citation Index Expanded (SCI-EXPANDED), Scopus, Academic Search Premier, BIOSIS, EMBASE, INSPEC, MEDLINE, Directory of Open Access Journals
  • Abdullah Gül Üniversitesi Adresli: Evet

Özet

Motivation: Biomolecular data stored in public databases is increasingly specialized to organisms, context/pathology and tissue type, potentially resulting in significant overhead for analyses. These networks are often specializations of generic interaction sets, presenting opportunities for reducing storage and computational cost. Therefore, it is desirable to develop effective compression and storage techniques, along with efficient algorithms and a flexible query interface capable of operating on compressed data structures. Current graph databases offer varying levels of support for network integration. However, these solutions do not provide efficient methods for the storage and querying of versioned networks. Results: We present VerTIoN, a framework consisting of novel data structures and associated query mechanisms for integrated querying of versioned context-specific biological networks. As a use case for our framework, we study network proximity queries in which the user can select and compose a combination of tissue-specific and generic networks. Using our compressed version tree data structure, in conjunction with state-of-the-art numerical techniques, we demonstrate real-time querying of large network databases. Conclusion: Our results show that it is possible to support flexible queries defined on heterogeneous networks composed at query time while drastically reducing response time for multiple simultaneous queries. The flexibility offered by VerTIoN in composing integrated network versions opens significant new avenues for the utilization of ever increasing volume of context-specific network data in a broad range of biomedical applications. Availability and Implementation: VerTIoN is implemented as a C++ library and is available at http://compbio.case.edu/omics/software/vertion and https://github.com/tjcowman/vertion Contact: tyler.cowman@case.edu