Test Data Management (TDM) for enterprise application development and testing

Kasun de Silva
Sysco LABS Sri Lanka
5 min read · Jan 31, 2023

Enterprise application development is the process of building and deploying scalable applications that serve the business needs of large organizations. These applications are built over the years on different technology stacks: modern technologies such as MongoDB, Express.js, React, and Node.js, or legacy ones such as RPG, AS400, and COBOL. It is important to note that all these applications need to work together to deliver business functionality.

A typical snapshot of an integrated production system would look like this:

NOTE: the above is a minimal representation of how applications in an enterprise would look. An integrated environment may have 50, 60, or even more applications working together seamlessly.

Testing applications developed using different technology stacks to ensure that they work seamlessly in a “non-production” environment is of paramount importance to achieve digital assurance for business continuity.

Consider the following:

  1. Can each application development team have a dedicated integrated test environment that has all the applications put together?
  2. Given the scale of active application development, can several users end up needing the same data at the same time?
  3. Is it important that the product engineering teams test with “production-like” data?
  4. Can application development teams generate data on dependent systems without any knowledge of the features of the dependent application?

Test Data Management

Test Data Management (TDM) is the process that creates, manages and delivers test data to application teams in order to carry out efficient development and testing.

A dedicated integrated test environment for each team could be the best option here, but the extremely high costs involved, along with the high manual effort needed to provision infrastructure and integrate the applications, make its practicality questionable (imagine having to build and maintain 50-plus integrated test environments!).

Test Data Reservation

Test Data Reservation is a capability in TDM that allows users to set aside some data for future use. This is important because users often need the same data points, which results in test data usage conflicts.

user N is the primary user of application N
users 1, 2 and 3 need data from application N to carry out their validations
item 1 is in high demand
the first user to secure item 1 will benefit; the other 3 users will need alternatives at the time of validation

What if users could reserve data prior to validation?
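To make the idea concrete, here is a minimal sketch of a reservation registry. It is an illustration only: the class name `ReservationRegistry`, its methods, and the 24-hour default expiry are assumptions made for this example, not features of any particular TDM tool. A real implementation would keep reservations in a shared store rather than in memory.

```python
from datetime import datetime, timedelta
from typing import Dict, Optional, Tuple


class ReservationRegistry:
    """Hypothetical in-memory record of who currently holds which test data item."""

    def __init__(self) -> None:
        # item id -> (user, expiry time)
        self._held: Dict[str, Tuple[str, datetime]] = {}

    def holder(self, item_id: str, now: datetime) -> Optional[str]:
        """Return the user holding item_id, or None if it is free or expired."""
        entry = self._held.get(item_id)
        if entry is None:
            return None
        user, expires_at = entry
        return user if expires_at > now else None

    def reserve(self, item_id: str, user: str, now: datetime,
                ttl: timedelta = timedelta(hours=24)) -> bool:
        """Reserve item_id for user; return False if another user holds it."""
        current = self.holder(item_id, now)
        if current is not None and current != user:
            return False
        self._held[item_id] = (user, now + ttl)
        return True

    def release(self, item_id: str, user: str) -> None:
        """Free item_id, but only if user is the one holding it."""
        entry = self._held.get(item_id)
        if entry is not None and entry[0] == user:
            del self._held[item_id]


registry = ReservationRegistry()
now = datetime(2023, 1, 31, 9, 0)
print(registry.reserve("item 1", "user 1", now))  # user 1 secures item 1
print(registry.reserve("item 1", "user 2", now))  # user 2 is turned away
```

The expiry is a design choice in this sketch: it stops a forgotten reservation from blocking high-demand data indefinitely.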

Test Data Refresh

Test Data Refresh is a capability in TDM that will allow existing data in a non-production environment to be replaced with real “sanitized” data from production.

user N is in need of data from production in order to carry out validations

What if users could refresh data on demand from production systems to test environments?
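The "sanitized" part of a refresh can be sketched as a masking step applied to production rows before they are loaded into a test environment. The field names, the `sanitize` helper, and the salted-hash approach below are assumptions chosen for illustration. Note that hashing is a form of pseudonymization and may not satisfy every compliance requirement on its own.

```python
import hashlib
from typing import Any, Dict, List

# Hypothetical list of columns that must never reach a test environment as-is.
SENSITIVE_FIELDS = {"name", "email"}


def mask_value(value: str, salt: str) -> str:
    """Replace a sensitive value with a deterministic, non-reversible token."""
    digest = hashlib.sha256((salt + value).encode("utf-8")).hexdigest()
    return "masked_" + digest[:12]


def sanitize(rows: List[Dict[str, Any]], salt: str) -> List[Dict[str, Any]]:
    """Return copies of rows with sensitive fields masked; other fields are kept."""
    return [
        {
            key: mask_value(str(value), salt) if key in SENSITIVE_FIELDS else value
            for key, value in row.items()
        }
        for row in rows
    ]


production_rows = [{"id": 1, "name": "Alice", "email": "alice@example.com", "qty": 3}]
for row in sanitize(production_rows, salt="per-refresh-secret"):
    print(row)
```

Because the masking is deterministic for a given salt, the same production value maps to the same token across tables, which keeps joins between refreshed tables consistent.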

Test Data Generation

Test Data Generation is a capability in TDM that will allow users to create test data on dependent systems in order to carry out validations on the application under test.

user 1 is unaware of the business flows on application N needed to create item 4
user N is unable to absorb the effort needed to create item 4

What if users could generate test data on demand on any dependent application?
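One way to picture on-demand generation is a shared registry in which each owning team publishes a generator for its application, so that other users can request data without knowing the underlying business flow. Everything in this sketch (the `register` decorator, the `generate` function, the `application_n` key, and the item fields) is hypothetical and only illustrates the shape of the idea.

```python
import itertools
from typing import Any, Callable, Dict

# Hypothetical registry: application name -> function that creates test data.
GENERATORS: Dict[str, Callable[..., Dict[str, Any]]] = {}


def register(application: str):
    """Decorator used by the owning team to publish its generator."""
    def decorator(func: Callable[..., Dict[str, Any]]):
        GENERATORS[application] = func
        return func
    return decorator


_item_ids = itertools.count(1)


@register("application_n")
def create_item(description: str = "test item") -> Dict[str, Any]:
    # The owning team would encode its real business flow here; this
    # placeholder simply returns a record in memory.
    return {
        "item_id": f"ITEM-{next(_item_ids)}",
        "description": description,
        "status": "ACTIVE",
    }


def generate(application: str, **params: Any) -> Dict[str, Any]:
    """Entry point for any user: create data on a dependent application."""
    if application not in GENERATORS:
        raise KeyError(f"no generator registered for {application!r}")
    return GENERATORS[application](**params)


# user 1 requests an item without knowing how application N creates it.
print(generate("application_n", description="item 4"))
```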

Implementing TDM

Implementation of TDM for an enterprise may involve the following steps.

Discovering the TDM requirements

  1. The test data needs of each team must be understood in order to justify adopting TDM practices within an enterprise.
  2. The most common techniques for collecting this information are questionnaires and one-on-one meetings to understand the challenges each team faces.
  3. Once the data is collected, it can be represented in the form of a TDM heatmap to understand where the enterprise stands.
  4. Based on the heatmap, the TDM capabilities to implement can be prioritized.

Example of a TDM heatmap

Prioritization could look like this:

P0 — Test Data Reservation
P1 — Test Data Refresh
P2 — Test Data Generation
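As a rough sketch of how a heatmap might be turned into such a ranking, the example below sums a pain score per capability across teams and sorts the totals. The team names, the 1 to 3 scoring scale, and the scores themselves are made up for illustration; they are not data from an actual survey.

```python
from typing import Dict, List


def prioritize(heatmap: Dict[str, Dict[str, int]]) -> List[str]:
    """Rank capabilities by total pain score across teams, highest first."""
    totals: Dict[str, int] = {}
    for scores in heatmap.values():
        for capability, score in scores.items():
            totals[capability] = totals.get(capability, 0) + score
    # Ties are broken alphabetically so the ranking is stable.
    return sorted(totals, key=lambda capability: (-totals[capability], capability))


# Hypothetical questionnaire results: 1 = low pain, 3 = high pain.
heatmap = {
    "team_a": {"Test Data Reservation": 3, "Test Data Refresh": 2, "Test Data Generation": 1},
    "team_b": {"Test Data Reservation": 3, "Test Data Refresh": 3, "Test Data Generation": 1},
    "team_c": {"Test Data Reservation": 2, "Test Data Refresh": 1, "Test Data Generation": 2},
}

for rank, capability in enumerate(prioritize(heatmap)):
    print(f"P{rank}: {capability}")
```

A plain sum treats every team equally; an enterprise could instead weight teams by size or release criticality.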

Tool fitment and Introduction

A wide variety of tools is readily available in the market to help with the implementation of TDM. The features provided by each tool can be analyzed to understand whether the identified test data needs can be met.

A sample tool analysis could look like this:

TDM tool capability matrix
TDM tool support/scalability matrix
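A capability matrix of this kind can be reduced to a simple gap check: for each candidate tool, list the required capabilities it does not provide. The tool names and feature sets below are placeholders invented for this sketch and do not describe any real product.

```python
from typing import Dict, Set


def missing_capabilities(required: Set[str],
                         tools: Dict[str, Set[str]]) -> Dict[str, Set[str]]:
    """Map each tool to the required capabilities it does not cover."""
    return {name: required - features for name, features in tools.items()}


# Capabilities prioritized from the heatmap (hypothetical labels).
required = {"reservation", "refresh", "generation"}

# Placeholder tools and features, for illustration only.
tools = {
    "tool_a": {"reservation", "refresh", "generation", "masking"},
    "tool_b": {"refresh", "masking"},
}

gaps = missing_capabilities(required, tools)
for name in sorted(gaps):
    if gaps[name]:
        print(f"{name} is missing: {sorted(gaps[name])}")
    else:
        print(f"{name} covers all identified needs")
```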

NOTE: Costing will have to be discussed as well. It is not included here due to the sensitivity of the data.

Another option that organizations may consider is building their own TDM tool. Either way, the ROI has to be thoroughly analyzed to weigh the costs against the benefits before purchasing a tool or building one in-house.

Benefits of using a TDM tool

Using a Test Data Management (TDM) tool can provide several benefits that can result in a positive return on investment (ROI). Some potential benefits of using a TDM tool include the following:

  • Improved testing efficiency: A TDM tool can help automate the creation and management of test data, which can reduce the time and effort needed to set up test environments. This can lead to faster test execution and shorter testing cycles, which can increase the overall efficiency of the testing process.
  • Better quality of testing: With a TDM tool, you can create and use more realistic and representative test data, which can lead to more accurate testing results and help identify defects earlier in the development process.
  • Increased flexibility: A TDM tool can make it easy to create and use test data for different environments, scenarios, and user profiles. This can increase the flexibility and adaptability of the testing efforts.
  • Enhanced security: Many TDM tools have built-in data masking and de-identification capabilities, which can help protect sensitive data and meet compliance requirements.
  • Cost savings: By automating the creation and management of test data, a TDM tool can help reduce the time and resources needed for these tasks. This can lead to cost savings and more efficient use of resources.
