MongoDB and Neo4j together: Real-time data analytics for Healthcare

Dr. GP Pulipaka
Sep 26, 2016 · 2 min read
Neo4j graph database

Zephyr Health is one of the largest global health information providers. It applies data science with ensemble methods and multiple NoSQL databases, such as MongoDB and Neo4j, to deliver real-time analytics that help healthcare consumers find the therapies and treatments they need for chronic diseases. Zephyr operates across several regions, including the US and EMEA (Europe, the Middle East, and Africa). Healthcare providers and general physicians are moving away from generic treatment toward precision medicine that addresses the particular problems posed by epidemics or outbreaks.

For example, suppose the big pharma company GlaxoSmithKline intends to launch a new drug that treats asthma. GlaxoSmithKline has an enormous amount of data from its Salesforce and CRM systems, as well as a number of data sources that provide RMS data and a connected network of hospitals, pharmacies, and radiology departments. Despite this vast collection of information, GlaxoSmithKline is still unable to provide precision medicine in a way that meets the requirements of the healthcare consumer. This is where Zephyr excels. It collects all of this information and integrates it into a single data platform, so it can offer precision medicine that meets healthcare consumer requirements for the newly introduced drug. Zephyr also collects electronic healthcare records from physicians, open clinical trial data, and a treasure trove of publication data. Zephyr's strength lies in knowing how to apply this collected data to the introduction of a new drug into the market.

Zephyr chose MongoDB and Neo4j for their schema-less NoSQL format with sharding capabilities. Given the nature of precision medicine for healthcare consumers, Zephyr adopted a polyglot persistence approach, using both MongoDB and Neo4j in its data platform. The main data points for each healthcare consumer are stored in MongoDB, while the many relationships associated with that consumer are maintained in Neo4j, giving a nested database architecture. Sparse indexes are created in MongoDB, while heavy indexes are created in Neo4j. MongoDB holds only profile-centric data, and Neo4j holds the nested relationship associations (Chaudhari, 2016).
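
To make the split between profile data and relationship data concrete, here is a minimal Python sketch of how one record might be divided between the two stores. It is an illustration only, not Zephyr's actual schema: the `Profile` class, its field names, the `Entity` label, and the `RELATED_TO` relationship type are all hypothetical. The functions only build the MongoDB document and the parameterized Cypher statements, so the example runs without either database installed.

```python
from dataclasses import dataclass, field


@dataclass
class Profile:
    """Hypothetical record; all field names are assumptions for illustration."""
    profile_id: str
    name: str
    attributes: dict = field(default_factory=dict)
    # Each relationship is a (kind, target_id) pair, e.g. ("treated_at", "hosp-1").
    relationships: list = field(default_factory=list)


def to_profile_document(profile):
    """Build the profile-centric document intended for a MongoDB collection."""
    doc = {"_id": profile.profile_id, "name": profile.name}
    doc.update(profile.attributes)  # flat, schema-less attributes
    return doc


def to_relationship_statements(profile):
    """Build one parameterized Cypher statement per relationship for Neo4j.

    Only the identifiers and the relationship kind go to the graph;
    the descriptive attributes stay in the document store.
    """
    query = (
        "MERGE (a:Entity {id: $source_id}) "
        "MERGE (b:Entity {id: $target_id}) "
        "MERGE (a)-[:RELATED_TO {kind: $kind}]->(b)"
    )
    return [
        (query, {"source_id": profile.profile_id, "target_id": target, "kind": kind})
        for kind, target in profile.relationships
    ]


# Example record (made-up values).
example = Profile(
    profile_id="p-1",
    name="A. Example",
    attributes={"condition": "asthma"},
    relationships=[("treated_at", "hosp-1"), ("fills_at", "pharm-7")],
)
document = to_profile_document(example)
statements = to_relationship_statements(example)
```

With live databases, the document could be passed to PyMongo's `collection.insert_one(document)`, and each `(query, params)` pair to the Neo4j Python driver's `session.run(query, params)`. A sparse index on an optional field could be created with `collection.create_index("condition", sparse=True)`; which fields Zephyr actually indexes is not stated in the source, so the field chosen here is an assumption.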


Chaudhari, M. (2016). Integrating Diverse Healthcare Data using MongoDB and Neo4j. Retrieved July 24, 2016, from

Chris (2016). 10 tips for passing the Neo4j Certified Professional examination [Neo4J graph image]. Retrieved September 26, 2016, from