Dynamic Data for Automated Testing

Luis Marchani
Published in Globant · Nov 17, 2020

Introduction

Data management for testing purposes is always a challenge: there are many things that can go wrong at test execution time.

Probably many of you have had the experience of getting different results when executing test cases in parallel: data gets changed by one test case and can no longer be used by another, the database fills up with random test data, and so on.

So, let's slow down first in order to speed up later. We need to be confident about the way we are adding test data to our test cases.

Solution

Usually the first strategy for dealing with data in lower environments is to replicate production data and work with it.

OK, that is the best-case scenario to start from.

Gather data, either by getting a replica from production or by working with data provided by the development team in earlier stages of the SDLC. This is the first step.

What do we do with this data? We should analyze the different scenarios we are about to automate and classify them by module, functionality, the data they touch, and so on.

This step is really important, since it allows us to identify the data we can use in a test case or group of test cases.
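One possible way to record that classification while you analyze the scenarios is a small inventory of what each test case touches. This is only a sketch; the names and fields are assumptions for illustration:

```python
from dataclasses import dataclass

@dataclass
class ScenarioInfo:
    name: str
    module: str
    data_touched: set
    mutates_data: bool

# Conflicts (the same data mutated by more than one scenario) become easy to spot.
inventory = [
    ScenarioInfo("block_section", "admin", {"section:reports"}, True),
    ScenarioInfo("unblock_section", "admin", {"section:reports"}, True),
    ScenarioInfo("view_section", "reports", {"section:reports"}, False),
]
```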

Let me give you an example. Let's say our application has some functionality that blocks a section of the application, and other functionality that unblocks it. What would happen if we ran both test cases against the same module in parallel? Right! There is a big chance they will fail, because we would be blocking and unblocking the same module at the same time.

And this applies not only when validating the results but also when setting up the prerequisites of the test. We could be blocking the module as a prerequisite for one scenario, so that unblocking it is the actual test flow, while the second scenario does the opposite; in that case we would probably get the failure at an even earlier step.

So this is something to consider: if we are going to work with actions that could affect the same data, let's make sure they don't touch related functionality; otherwise, we should group those test cases to run sequentially.
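As an illustration, here is a minimal sketch of that sequential grouping, assuming pytest with the pytest-xdist plugin and a hypothetical `api_client` fixture:

```python
import pytest

# Both tests mutate the blocked/unblocked state of the same "reports" section,
# so they share an xdist group. Running pytest with `-n auto --dist loadgroup`
# keeps tests in the same group on the same worker, one after the other,
# instead of in parallel.
@pytest.mark.xdist_group("reports-block-state")
def test_block_reports_section(api_client):
    api_client.block_section("reports")                      # hypothetical helper
    assert api_client.section_status("reports") == "blocked"

@pytest.mark.xdist_group("reports-block-state")
def test_unblock_reports_section(api_client):
    api_client.block_section("reports")                      # prerequisite: start blocked
    api_client.unblock_section("reports")
    assert api_client.section_status("reports") == "unblocked"
```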

What else could happen? We might have functionality to delete a user in our application, and maybe we have some scenario testing something related to that user. We have several options here: we can recreate the user as a prerequisite, we can create a new user every time, we can keep a dedicated user and make sure it is not used anywhere else so it never gets deleted, and so on.

What if we take the last approach? We keep a user and make sure it is not being used anywhere else. This could happen many times, and we would need several users for test cases with the same behavior; if we have thousands of test cases, this could become a nightmare. A good approach in this scenario is to isolate an account for deletion purposes (this account can be recreated when needed). The isolation approach is required whenever the account undergoes a transformation such as deletion, role or permission changes, etc. For the rest of the scenarios we can work under the same user account, when allowed.
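Here is a minimal sketch of that isolated, recreatable account, again assuming pytest and a hypothetical `api_client` that wraps services owned by the development team:

```python
import pytest

@pytest.fixture
def disposable_user(api_client):
    # A dedicated account isolated for deletion scenarios: it is (re)created
    # before each test that destroys it, so no other test case ever depends
    # on its existence.
    user = api_client.create_user(prefix="qa-deletion-")     # assumed dev-managed service
    yield user
    # If the test failed before deleting, clean up so data doesn't pile up.
    if api_client.user_exists(user["id"]):
        api_client.delete_user(user["id"])

def test_delete_user(api_client, disposable_user):
    api_client.delete_user(disposable_user["id"])
    assert not api_client.user_exists(disposable_user["id"])
```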

Now you can see how important it is to slow down and analyze this deeply to decide how to classify our test cases.

Once we have gathered some data, we need to figure out what we need from it.

Let's take a scenario where we need to add a photo to a user in our application. What do we need in terms of data to build that scenario?

Depending on the application's behavior, we might need an active user, for example.

OK, so we need two things: a user identifier, and the certainty that the user is active.

The user identifier is something we already have. Whether the user is active or not is something we can manage.

So here we have two steps: first, we filter the data looking for any user; second, we transform that data into what we need for our test.

That gives us the first part of our implementation, the filtering data structure. We need to create a structure in our automation project that is in charge of extracting the basic data for our scenarios. You can do that through APIs, databases, files, etc.
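A minimal sketch of such a filtering structure, assuming a hypothetical REST endpoint for querying users (the same idea applies to a database query or a flat file):

```python
import requests

class UserDataFilter:
    """Extracts basic test data by querying an assumed /users endpoint."""

    def __init__(self, base_url: str, token: str):
        self.base_url = base_url
        self.headers = {"Authorization": f"Bearer {token}"}

    def find_users(self, **criteria):
        # e.g. find_users(role="viewer") -> GET /users?role=viewer
        response = requests.get(
            f"{self.base_url}/users", params=criteria, headers=self.headers
        )
        response.raise_for_status()
        return response.json()["items"]
```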

We also need a transformation data structure, where we apply the actions or properties that are specific to our test case. The best approach here is to go through processes managed by the development team. For example, if we have a service for activating a user in our application, we should use that service to do it. That service is owned by the development team, so if something changes in the database, they will update the service to handle the change. Touching the database directly is another option if you have the expertise and the support from development, but it is not a good idea to change something in the database without knowing what else could be affected.
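And a sketch of the transformation side, again assuming a hypothetical dev-managed activation endpoint:

```python
class UserDataTransformer:
    """Applies test-specific properties through dev-managed services."""

    def __init__(self, base_url: str, session):
        self.base_url = base_url
        self.session = session          # e.g. a requests.Session with auth set up

    def ensure_active(self, user: dict) -> dict:
        # The test never touches the database directly: if the schema changes,
        # the activation service owned by the development team absorbs it.
        if not user.get("active"):
            self.session.post(f"{self.base_url}/users/{user['id']}/activate")
            user = self.session.get(f"{self.base_url}/users/{user['id']}").json()
        return user
```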

Finally, to speed things up we might need more than one test data object so we can run test cases concurrently. To achieve that, we can play with our filters, evaluating how many records we are retrieving, and create new ones based on the basic criteria needed for the instance under test, adding a step that creates data from scratch through processes managed by the development team, just as we did when applying actions or properties to our filtered data.
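Putting the pieces together, a sketch of that "top up" logic (the `data_filter` and `data_factory` objects are assumptions standing in for the structures described above):

```python
def reserve_users(data_filter, data_factory, needed: int, **criteria):
    # If the filter does not return enough records for the number of
    # concurrent test instances, create the missing ones from scratch
    # through the same dev-managed creation process.
    users = data_filter.find_users(**criteria)
    while len(users) < needed:
        users.append(data_factory.create_user(**criteria))   # assumed creation service
    return users[:needed]
```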

Conclusion

Filtering data from the initial data feed allows us to work with random data: we only specify the basic criteria for each test, so we can get any record back when filtering. This forces us to write robust test cases that depend on flows, not on specific data.

This approach will also help us deal with data discrepancies between environments. Never hard-code data; worry only about the test flow.

Keep in mind that we are following this approach under the assumption that the data under test is not sensitive. If it is, we should think about data obfuscation strategies such as encryption, tokenization, or data masking for lower environments, even if those environments are controlled.
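Just to make the idea concrete, here is one deliberately simple form of masking; real projects would normally rely on dedicated obfuscation or tokenization tooling rather than hand-rolled code like this:

```python
import hashlib

def mask_email(email: str) -> str:
    # Replace a real address with a deterministic, non-reversible placeholder
    # before the record reaches a lower environment.
    digest = hashlib.sha256(email.encode()).hexdigest()[:10]
    return f"user_{digest}@example.test"
```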

There are several data management strategies we may apply depending on the kind of testing we are doing: stubbing and mocking data, interception through proxies, test environment virtualization where we can create new data every time and refresh it before testing, etc. Think of the approach that suits you best, and happy testing!
