Expanding the fundamental knowledge of TB with a gene regulatory network analysis tool (TBRNAT)

A 5-minute interview with Ramandeep Kaur and Yamil Boo, bioinformatics software engineers in the Office of Cyber Infrastructure and Computational Biology (OCICB) at NIAID.

Kineviz
Apr 20, 2021

This study takes a deeper dive into TBRNAT (Tuberculosis Regulatory Network Analysis Tool) from the National Institute of Allergy and Infectious Diseases (NIAID) Office of Cyber Infrastructure and Computational Biology (OCICB) in Bethesda, MD. TBRNAT was developed to bring together information from peer-reviewed publications on how M. tuberculosis transcription factors regulate gene expression, and to give users an intuitive way to visualize this data.

Over the course of history, TB has killed more people than any other infectious disease. Over the past 200 years, it is estimated that at least 1 billion people have died from TB, more than the number of deaths resulting from malaria, smallpox, HIV/AIDS, cholera, plague, and influenza combined. TB remains the deadliest infectious disease in the world. In 2017, the WHO estimated that TB claimed the lives of 1.6 million people, including 230,000 children. Although TB treatment exists, drug resistance is a continued threat. The NIAID Strategic Plan for Tuberculosis Research, published on September 26, 2018, outlines five research strategies. One of them is to improve fundamental knowledge of TB by understanding the host and bacterial factors (and their interplay) that drive Mtb pathogenesis, transmission, and epidemiology, and by elucidating the immune mechanisms responsible for limiting or failing to limit Mtb infection and disease.

What is TBRNAT and how does it work?

TBRNAT presents all known M. tuberculosis transcription factors that regulate the expression of a gene of interest as an interactive regulatory network, along with additional data in tabular form. The application has a simple search interface: you enter a gene’s ORF ID, such as Rv0006, in the input field, and it displays the network of genes associated with Rv0006. You can explore the displayed regulatory network using gene names and relevant publications. Our friend and colleague Dr. Vijay Nagarajan began developing it years ago with only a few thousand entries, using HTML, PHP, MySQL and Cytoscape. We later moved it to Angular, Node, Express and Neo4j (the ANNE stack) and GraphXR to improve the visualization for users while being able to add thousands of entries.
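
As a rough illustration of the kind of lookup described here, the sketch below stores regulatory relationships as (regulator, target) pairs and returns the regulators recorded for a queried ORF ID. It is not TBRNAT's code: the edge list, the placeholder names `TF_A`, `TF_B` and `GENE_X`, and the functions `build_index` and `regulators_of` are all hypothetical, and the pairs are not real regulatory data.

```python
from collections import defaultdict

# Hypothetical (regulator, target) pairs. Placeholders only, not real regulatory data.
EDGES = [
    ("TF_A", "Rv0006"),
    ("TF_B", "Rv0006"),
    ("TF_A", "GENE_X"),
]


def build_index(edges):
    """Map each target gene to the set of regulators that act on it."""
    index = defaultdict(set)
    for regulator, target in edges:
        index[target].add(regulator)
    return index


def regulators_of(index, orf_id):
    """Return the sorted regulators recorded for an ORF ID (empty list if none)."""
    return sorted(index.get(orf_id, ()))


INDEX = build_index(EDGES)
print(regulators_of(INDEX, "Rv0006"))
```

Indexing by target gene keeps each search a single dictionary lookup, which is one simple way to back a search box like the one described above.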

How did this interactive data visualization tool come to be?

As part of NIAID’s international tuberculosis sequencing project (TB Portal), the BCBB team wanted to analyze the effect of significant Mtb mutations on the M. tuberculosis virulence regulatory network. For that purpose, Dr. Nagarajan developed a local version of TBRNAT, which was later made available to benefit other TB researchers.

Where does the data originate from?

The data in the TBRNAT database was obtained by manually curating peer-reviewed articles in the scientific literature.

How have you used graph databases in your analysis?

We use Neo4j in our analysis to query, visualize, and better understand the data. With Neo4j we are able to traverse the associations and interactions between genes faster.
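
To make the idea of traversing gene associations concrete, here is a minimal in-memory sketch of a multi-hop traversal. It is a stand-in for what a graph database does natively and does not use Neo4j or TBRNAT's schema. The graph contents, the placeholder node names, and the function `downstream` are hypothetical.

```python
from collections import deque

# Hypothetical directed edges: regulator -> list of targets. Placeholder names only.
GRAPH = {
    "TF_A": ["TF_B", "GENE_X"],
    "TF_B": ["GENE_Y"],
    "GENE_X": [],
    "GENE_Y": [],
}


def downstream(graph, start, max_hops):
    """Breadth-first traversal: nodes reachable from `start` within `max_hops` edges.

    Returns a dict mapping each reachable node to its hop distance from `start`.
    """
    distance = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if distance[node] == max_hops:
            continue  # do not expand beyond the hop limit
        for target in graph.get(node, []):
            if target not in distance:
                distance[target] = distance[node] + 1
                queue.append(target)
    del distance[start]  # report only the nodes found, not the starting point
    return distance


print(downstream(GRAPH, "TF_A", 2))
```

With a hop limit of 1 the call returns only direct targets. Raising the limit also surfaces indirect targets reached through intermediate regulators.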

What are some of the data-driven challenges you’ve faced?

Curating the data manually and deploying it in Neo4j have been some of the challenges we’ve encountered. We’ve discovered that investing time up front in thinking about the database model/schema pays off in the long run. Another challenge is balancing user input so that it fits into a schema. For example, when inputting data, we require users to choose certain keywords, and we also leave some room for them to add data in a user-defined, unstructured way.
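
One way to picture that balance between required keywords and free-form input is a record type with one validated field and one unrestricted field. This is a sketch under stated assumptions, not the TBRNAT schema: the keyword set `ALLOWED_EFFECTS`, the class `RegulationEntry`, and its field names are hypothetical.

```python
from dataclasses import dataclass

# Hypothetical controlled vocabulary. The actual keyword list is not given in the interview.
ALLOWED_EFFECTS = {"activation", "repression", "unknown"}


@dataclass(frozen=True)
class RegulationEntry:
    regulator: str
    target: str
    effect: str      # structured: must be one of ALLOWED_EFFECTS
    notes: str = ""  # unstructured: free-form curator comments

    def __post_init__(self):
        # Reject entries whose keyword falls outside the controlled vocabulary.
        if self.effect not in ALLOWED_EFFECTS:
            raise ValueError(
                f"effect must be one of {sorted(ALLOWED_EFFECTS)}, got {self.effect!r}"
            )


entry = RegulationEntry("TF_A", "Rv0006", "repression", notes="free-text remark")
print(entry)
```

Validating only the keyword field keeps the data queryable, while the notes field preserves details that do not fit the schema.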

How does NIAID drive this research? What is your approach?

TBRNAT is a unique resource since it integrates with other human, mouse and rat datasets in the backend while providing the ability for advanced researchers to navigate not only the Mtb regulatory networks but also the host-pathogen associations in the context of virulence regulatory networks.

What have been your findings with TBRNAT?

TBRNAT helps with understanding the virulence regulatory network along with giving us the potential for discovering missing connections and intermediate regulatory nodes that could affect the virulence mechanism. This analysis approach provides hints towards novel markers and therapeutic targets.

What does the future of this research look like?

We have over 61,000 entries in our database, and we hope to add more in the future, allowing our users to analyze and discover new regulatory associations. We are also planning to extend our collaboration with researchers inside and outside of the NIH to benefit the entire M. tuberculosis research community.

Originally published at www.kineviz.com on April 20, 2021.


Kineviz, Inc. is the leader in cutting-edge human interfaces for data visualization that unite art and humanity with technology.