Ocean of Data — Real Time Solution to Test Data Challenges

Alok Singh
Published in Airtel Digital
4 min read · Jan 9, 2023

What Is Test Data Management?

Test data management (TDM) is the process of providing high-quality data for testing purposes. The TDM process is responsible for creating the data and ensuring that data has the expected quality and is readily available when the test processes need it in the expected amounts and formats.

Why Is Test Data Management Important?

There are several types of software testing that an organisation can leverage in its test automation strategy. Some forms of testing either don’t require data, or the data they require can be incorporated into the test cases themselves. A classic example would be unit tests, since their goal is to test each unit in complete isolation.
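To illustrate a test that carries its own data, here is a minimal sketch. The function `apply_discount` and its values are hypothetical and exist only for this example; the point is that the inputs and expected results sit inside the test case, so nothing has to be provisioned from outside.

```python
def apply_discount(price: float, percent: float) -> float:
    """Hypothetical function under test: reduce a price by a percentage."""
    if not 0 <= percent <= 100:
        raise ValueError("percent must be between 0 and 100")
    return round(price * (1 - percent / 100), 2)


def test_apply_discount() -> None:
    # Inputs and expected outputs are embedded in the test itself,
    # so no external test data is needed.
    cases = [
        (100.0, 10.0, 90.0),
        (59.99, 0.0, 59.99),
        (20.0, 100.0, 0.0),
    ]
    for price, percent, expected in cases:
        assert apply_discount(price, percent) == expected


test_apply_discount()
```

A test that spans several integrated systems cannot be written this way, because the data it needs must already exist, consistently, in each of those systems.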

Other forms of testing do need data, such as end-to-end testing. And since these tests rely on data, you must ensure they get access to high-quality data. If your tests have to work with faulty or invalid data, you won’t be able to trust their results, regardless of the excellence of your QA strategy. In other words: garbage in, garbage out.

TDM is fast gaining importance in the QA industry. The growing interest is driven by major financial losses from production defects that could have been detected by testing with the proper test data.

In the past, test data was limited to a few rows of data in the database or a few sample input files. Since then, the QA landscape has come a long way. Now institutions rely on powerful test data sets with unique combinations that give them high coverage to drive QA, including negative testing.

Challenges with Test Data

  • SCATTERED DATA: Because the integrated environment spans many categories, teams often work with stale copies of data. QA teams also lack end-to-end knowledge of all the channels, platforms and categories (upstream and downstream), which becomes a bottleneck in creating the right test data.
  • DATA INTEGRITY: High-fidelity data is lacking across categories and platforms. Test data that has been created is often unavailable on some of the integrated components because of integration or environment issues.
  • HIGH TURNAROUND TIME: Creating data on demand that is valid across all the involved components and categories increases overall turnaround time. Most data creation happens during test execution, based on what is learned along the way.
  • RISK OF DEFECT LEAKAGE: Not having access to relevant test data within the release cycle can let defects leak into production.

Our Real Time Solution to the Test Data Challenges


Ocean of Data is an interactive BOT that interacts with all the categories across Airtel and provides on-demand test data to users.

  • It provides real-time access to test data across Airtel categories.
  • It is a modern test data solution that provides integrated data seamlessly, in parallel and at speed, in a hybrid and complex environment.
  • Data Integrity: All the data available on the BOT is verified on all the required systems. Moreover, data is shared only with authenticated Airtel users.
  • CI/CD Integration: The BOT is integrated with Jira and Jenkins to take requests and create an automated job for data that is not already available in the system.
  • Centralised Test Data Pool: Gives users data in real time from across the categories, as per their requirements.
  • High Quality Data: Quality is maintained by validating the available data across the downstream systems.
  • Self Service Provisioning: Helps create data on demand if it is not already available in the centralised data pool.
  • The BOT ensures that data dependencies on downstream systems do not impact any deliverable. It also significantly reduces the time required for data creation and supports quicker time to market across Airtel.
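
The pool, validation and self-service ideas above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the actual Ocean of Data implementation: the class names (`PoolRecord`, `DataPool`), the system names (`crm`, `billing`) and the in-memory request queue are all hypothetical, and a real setup would trigger an automated job where this sketch merely records the request.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class PoolRecord:
    """One piece of test data (all fields are hypothetical)."""
    record_id: str
    category: str                                        # e.g. "prepaid"
    verified_on: set[str] = field(default_factory=set)   # systems that confirmed it


@dataclass
class DataPool:
    """In-memory stand-in for a centralised test data pool."""
    required_systems: set[str]
    records: list[PoolRecord] = field(default_factory=list)
    pending_requests: list[str] = field(default_factory=list)

    def is_valid(self, record: PoolRecord) -> bool:
        # Usable only if every required system has verified the record.
        return self.required_systems <= record.verified_on

    def request(self, category: str) -> Optional[PoolRecord]:
        # 1. Serve from the pool when a fully verified record exists.
        for record in self.records:
            if record.category == category and self.is_valid(record):
                self.records.remove(record)  # hand each record out only once
                return record
        # 2. Otherwise queue a creation request (assumption: a real system
        #    would start an automated data-creation job at this point).
        self.pending_requests.append(category)
        return None


pool = DataPool(required_systems={"crm", "billing"})
pool.records.append(PoolRecord("SUB-001", "prepaid", {"crm", "billing"}))
pool.records.append(PoolRecord("SUB-002", "broadband", {"crm"}))  # not on billing

served = pool.request("prepaid")     # found and verified, so served from the pool
missing = pool.request("broadband")  # only partially verified, so queued instead
print(served, missing, pool.pending_requests)
```

The design choice worth noting is that a record missing from even one required system is treated as unavailable, so a test never starts with data that exists only partially across the integrated components.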

Glimpse of the BOT


Increasing data coverage is critical to adding value in QA. Because we use a large amount of test data in regression suites on a regular basis, test data is also a key focus area in terms of ROI. The right TDM solution can help you provide a wide range of data while maintaining a consistent ROI across each cycle.

With the Ocean of Data BOT, the TDM process becomes easier. Because the test data is accurate, data-related defects, which are effectively false positives, drop sharply, and the testing process becomes more efficient. Up to 75% of QA’s time can be lost during the testing phase if testers are busy looking for or waiting for test data. Ocean of Data cuts out this delay by having the data ready in time for the start of the test phase, which allows for more accurate testing and reduces the number of issues that make it into production. Simply put: fewer delays and faster time to market.

Substantial credit for this whole implementation goes to Pulkit Munjral. Special thanks to Anubhav Yadav for valuable feedback and support.

