Advantages of Using Static Data Generators in Unit Tests

Random and static data both have a place, but it’s important to pick the right option for your situation

Ali Haugh
Slalom Build
7 min read · Jan 23, 2023

When unit-testing a service or application, the functions can take in complex objects that are tedious to define for each test case. To decrease code duplication and the time needed to write unit tests, developers can employ generators to easily create pre-defined objects, pass them into functions, and make assertions on the function’s output. In this article, I will walk you through two examples of generators and share my thoughts on when to use one over the other based on my experience with each.

Generators explained

First, let’s take a look at a simple example of a generator written using TypeScript for an application for selling cars. Let’s say we have a car service that has a function which calculates the price of a car based on the combination of properties that car has, and the car object is defined like this:

enum CarType {
  Pickup,
  SUV,
  Sedan,
}

enum FabricType {
  Cloth,
  Leather,
}

export interface Car {
  serialNumber: string;
  // type, passengerCapacity, interiorFabric, and year
  // are used to calculate price
  type: CarType;
  passengerCapacity: number;
  interiorFabric: FabricType;
  year: number;
}

A generator could be written for a generic pickup truck that looks like this:

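export const generateGenericPickup = (): Car => {
  // a new object is returned on every call, so tests can
  // safely change fields without affecting each other
  return {
    serialNumber: "111111111111AAAAA",
    type: CarType.Pickup,
    passengerCapacity: 2,
    interiorFabric: FabricType.Cloth,
    year: 2000,
  };
};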

The example generator above is a function that can be called from within a unit test to create a Car object, allowing the unit test to focus on passing a Car object into a function and making an assertion on the returned value. Without this generator, each unit test would define its own Car object, leading to code duplication.

Generators with static data

This static object generator is great for most cases in my experience. When testing new functionality, it aids in specifically testing the happy path(s) as well as specific edge cases, making each test very intentional and reproducible.

The example generator defined above creates an object with the same values each time: static data. By using this type of generator, we can be intentional about the case that we are testing in each unit test. We do this either by testing the object that is returned from the generator without altering any of the values, or by altering parts of the generic case to fit the case being tested. Below are examples of these types of tests using the car example:

it('should handle generic pickup', () => {
  const pickup = generateGenericPickup();
  const price = service.calculatePrice(pickup);
  // make assertion on expected price
  expect(price).toEqual(<calculated price>);
});

it('should handle new pickup', () => {
  const pickup = generateGenericPickup();
  pickup.year = 2022;
  const price = service.calculatePrice(pickup);
  // make assertion on expected price
  expect(price).toEqual(<calculated price>);
});

it('should handle invalid year', () => {
  const pickup = generateGenericPickup();
  // calculatePrice was implemented to throw an error
  // when it receives a year before 1886 since no cars
  // were built before then
  pickup.year = 1850;
  // make assertion on expected thrown error; the call is wrapped
  // in a function so that expect can catch the error itself
  expect(() => service.calculatePrice(pickup)).toThrow(<error message>);
});

In the first test, the object that gets returned from the generic pickup generator is sent into the calculatePrice() function without changing any of the values. In the second test, we can test the case of a brand new car by creating a Car object using the generator and setting the year to be a recent year before sending the Car object into the function. These first two tests are what I would consider happy path tests. The last test in the example above is a sad path test that covers an edge case. We are testing that when we set the year the pickup was built to an invalid value, the function will throw an error. Tests that rely on static objects are worth adding when you build new functionality or start a new service, because they exercise the specific cases that you already know to expect.
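One way to make "altering parts of the generic case" more compact is to let the generator accept overrides. The sketch below is a possible variation, not the generator used above: the `overrides` parameter is an assumption added for illustration, and the types simply mirror the Car model from earlier in the article.

```typescript
enum CarType { Pickup, SUV, Sedan }
enum FabricType { Cloth, Leather }

interface Car {
  serialNumber: string;
  type: CarType;
  passengerCapacity: number;
  interiorFabric: FabricType;
  year: number;
}

// `overrides` is a hypothetical addition: any field passed in
// replaces the generic default, the rest stay static.
const generateGenericPickup = (overrides: Partial<Car> = {}): Car => ({
  serialNumber: "111111111111AAAAA",
  type: CarType.Pickup,
  passengerCapacity: 2,
  interiorFabric: FabricType.Cloth,
  year: 2000,
  ...overrides,
});

// The test states only the detail it cares about.
const newPickup = generateGenericPickup({ year: 2022 });
const leatherPickup = generateGenericPickup({
  interiorFabric: FabricType.Leather,
});
```

With this shape, each test names just the field that defines its case, and the remaining values stay identical between runs.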

The main drawback that I have encountered with this type of generator is that it does not cover cases the developer had not considered. With unit tests that use a static object generator, the developer must be specific about the cases that are being tested and can occasionally miss edge cases that could cause the code being tested to fail out in the wild. This is where generators using random data are strongest, so let’s discuss those now.

Generators using random data

When an object generator creates random data, it looks similar to the example generator above, but instead of specific values being set in the object, random values are generated and returned. This type of generator has two general uses. The first is using a seed value so that the values that come back from the generator are the same each time. The second is allowing the generator to generate different data each time and running the tested code many times against many different inputs. I will go through examples of these below.

There are a variety of third-party packages that can be pulled in as a development dependency, and in this example, I am using Faker. The Faker node package is a common way to generate random data when using JavaScript/TypeScript. Faker makes it possible to easily generate a variety of fake but realistic data, such as names and addresses. Faker also allows you to set a seed value that will provide consistent values between test runs. When using Faker to create random data for our Car model, a generator can be written to build a pickup truck that looks like this:

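import { faker } from '@faker-js/faker';

// Faker calls below use the @faker-js/faker v8+ names; older releases
// expose equivalents under faker.random.* and faker.datatype.*
export const generateRandomPickup = (seedValue?: number): Car => {
  // compare against undefined so that a seed of 0 is still applied
  if (seedValue !== undefined) {
    faker.seed(seedValue);
  }
  return {
    serialNumber: faker.string.alphanumeric(17),
    type: CarType.Pickup,
    passengerCapacity: faker.helpers.arrayElement([2, 4, 5, 6]),
    // members are listed explicitly because Object.values() on a
    // numeric enum also returns the member names as strings
    interiorFabric: faker.helpers.arrayElement([
      FabricType.Cloth,
      FabricType.Leather,
    ]),
    year: faker.number.int({ min: 1886, max: new Date().getFullYear() }),
  };
};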

An important detail to note here is that each field must get a value of the correct type (the serialNumber must be a string, interiorFabric must be a FabricType, etc.), so I gave Faker some constraints around what the values could be.
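One constraint deserves a closer look. If the list of enum members is derived with Object.values rather than written out by hand, a numeric TypeScript enum such as FabricType yields both the member names and the numeric values, so a random pick could return a string. The sketch below shows the behavior and one possible filter; the variable names are illustrative only.

```typescript
enum FabricType {
  Cloth,
  Leather,
}

// Numeric enums compile to an object with a reverse mapping,
// so Object.values sees names and numbers: ["Cloth", "Leather", 0, 1].
const rawValues = Object.values(FabricType);

// Keep only the numeric members so a random pick is always a FabricType.
const fabricTypes: FabricType[] = rawValues.filter(
  (value): value is FabricType => typeof value === "number",
);
```

String enums do not have the reverse mapping, so this filter is only needed for numeric enums like the ones in the car example.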

This random data allows us to test a variety of different data inputs to the price function with a single generator function. Being able to easily test the results of different inputs helps show that the function is robust and can handle a wide range of cases that we might not have considered, like extra long strings and arrays or larger/smaller numbers. Below is an example of a unit test using the random pickup generator and sending a seed value:

it('should return calculated price', () => {
  const pickup = generateRandomPickup(10);
  const price = service.calculatePrice(pickup);
  // make assertion on expected calculated price
  expect(price).toEqual(<calculatedPrice>);
});

A couple of drawbacks come with the random generator in the above example. First, the developer will need to initially run the test to understand the data the seed value will generate, then add an assertion based on those values. Second, this test will only cover one set of inputs to the function being tested which, in this case, will not cover all the cases that should be tested. Adding additional tests to cover those other cases can be difficult because the developer will have to find the right seed value to produce the right set of data. In my experience, if you end up here, using a static generator instead is the best path forward.
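To see why the seed has to be inspected once before an assertion can be written, it may help to look at seeding in isolation. The sketch below does not use Faker: it swaps in mulberry32, a small seeded pseudo-random number generator, as a stand-in, and `generateSeededPickup`, `pick`, and `ALPHANUMERIC` are hypothetical names. The same seed always yields the same Car, but nothing in the seed itself tells you what that Car will be.

```typescript
enum CarType { Pickup, SUV, Sedan }
enum FabricType { Cloth, Leather }

interface Car {
  serialNumber: string;
  type: CarType;
  passengerCapacity: number;
  interiorFabric: FabricType;
  year: number;
}

// mulberry32: a small seeded PRNG returning numbers in [0, 1).
// It stands in for Faker's internal generator in this sketch.
const mulberry32 = (seed: number): (() => number) => {
  let state = seed >>> 0;
  return () => {
    state = (state + 0x6d2b79f5) >>> 0;
    let t = state;
    t = Math.imul(t ^ (t >>> 15), t | 1);
    t ^= t + Math.imul(t ^ (t >>> 7), t | 61);
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296;
  };
};

// Picks one element using the supplied random source.
function pick<T>(rng: () => number, items: readonly T[]): T {
  return items[Math.floor(rng() * items.length)];
}

const ALPHANUMERIC = "ABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789".split("");

const generateSeededPickup = (seed: number): Car => {
  const rng = mulberry32(seed);
  const minYear = 1886;
  const maxYear = new Date().getFullYear();
  return {
    serialNumber: Array.from({ length: 17 }, () =>
      pick(rng, ALPHANUMERIC),
    ).join(""),
    type: CarType.Pickup,
    passengerCapacity: pick(rng, [2, 4, 5, 6]),
    interiorFabric: pick(rng, [FabricType.Cloth, FabricType.Leather]),
    year: minYear + Math.floor(rng() * (maxYear - minYear + 1)),
  };
};

// Two calls with the same seed produce field-for-field identical cars.
const first = generateSeededPickup(10);
const second = generateSeededPickup(10);
```

The values inside `first` are only known after running the code, which is the extra step described above before an exact price can be asserted.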

Below is an example of a test that allows the random generator to generate different data each time:

it('should return calculated price', () => {
  for (let i = 0; i < 100; i++) {
    const pickup = generateRandomPickup();
    const price = service.calculatePrice(pickup);
    // make assertion on expected return type
    expect(typeof price).toBe('number');
  }
});

In the example above, the price function is tested 100 times with different input each time. This type of test allows the developer to check that the function is robust and can handle a wide variety of inputs, including combinations that the developer might not have considered.

There are a few drawbacks though. First, each time this test is run the inputs will vary, and any one of those inputs can cause the test to fail. For the developer to see which input caused the failure, each input would have to be logged before sending it into the code, which creates a lot of noise in the test output.

Next, the unit test is unable to make a simple assertion on the exact value of the price that is returned. So instead of making an assertion on the exact value, we make an assertion on the type. This test will fail if the type of the returned value is incorrect or if the input caused an error.

Finally, because the input varies, the code paths that are executed vary from one test run to the next. This produces varying coverage reports, which can cause seemingly random failures if the service enforces a coverage minimum and a particular run falls below it. Similarly, the tests can randomly fail if the code being tested was not able to handle the input on a certain run.
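One possible way to soften these drawbacks is to derive a seed per iteration, assert on properties of the result rather than its exact value, and report only the seeds that failed. The sketch below is a rough illustration under several assumptions: mulberry32 stands in for Faker's seeding, the Car model is trimmed, and `calculatePrice` is a made-up pricing rule, not the service from this article.

```typescript
interface Car {
  passengerCapacity: number;
  year: number;
}

// mulberry32: a small seeded PRNG, standing in for Faker's seeding.
const mulberry32 = (seed: number): (() => number) => {
  let state = seed >>> 0;
  return () => {
    state = (state + 0x6d2b79f5) >>> 0;
    let t = state;
    t = Math.imul(t ^ (t >>> 15), t | 1);
    t ^= t + Math.imul(t ^ (t >>> 7), t | 61);
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296;
  };
};

const generateRandomPickup = (seed: number): Car => {
  const rng = mulberry32(seed);
  const capacities = [2, 4, 5, 6];
  const minYear = 1886;
  const maxYear = new Date().getFullYear();
  return {
    passengerCapacity: capacities[Math.floor(rng() * capacities.length)],
    year: minYear + Math.floor(rng() * (maxYear - minYear + 1)),
  };
};

// Hypothetical pricing rule, invented for this sketch only.
const calculatePrice = (car: Car): number => {
  if (car.year < 1886) {
    throw new Error("Invalid year");
  }
  return 10000 + car.passengerCapacity * 1500 + (car.year - 1886) * 100;
};

// Runs `property` against one generated car per seed and collects
// the seeds whose car broke the property or caused a thrown error.
const findFailingSeeds = (
  runs: number,
  property: (car: Car) => boolean,
): number[] => {
  const failing: number[] = [];
  for (let seed = 0; seed < runs; seed++) {
    let ok: boolean;
    try {
      ok = property(generateRandomPickup(seed));
    } catch {
      ok = false;
    }
    if (!ok) {
      failing.push(seed);
    }
  }
  return failing;
};

// Property: the price is always a finite, positive number.
const failingSeeds = findFailingSeeds(100, (car) => {
  const price = calculatePrice(car);
  return Number.isFinite(price) && price > 0;
});
```

In a Jest test this could end with `expect(failingSeeds).toEqual([])`, so a failure prints the exact seeds to replay instead of a log of every input. The trade-off is that a fixed seed range exercises the same inputs on every run, which stabilizes coverage but gives up some of the variety that fully random data provides.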

In conclusion

Static generator pros:

  • Easily test happy path(s) and specific edge cases
  • Easily reproducible

Static generator cons:

  • Edge cases that were not considered are not tested

Random generator pros:

  • Easily test a variety of inputs, including some that might have not been considered

Random generator cons:

  • When using seeding, each test needs to be run first to know which input the test will get
  • When not using seeding and a test fails, it takes extra setup and inspection by the developer to understand which input caused the failure
  • When not using seeding, the code paths that are exercised (and therefore the coverage) can change from one test run to the next

Both types of generators discussed in this article have use cases where they can shine. Static object generators help to test specific happy and sad paths through the functionality, while random object generators can help to test a wide variety of unexpected cases. I would recommend first starting with static object generators when setting up new services or a new set of unit tests so the developer can focus on getting high code coverage every time the tests are run and cover the majority of the happy and sad paths through the code intentionally. Later, if you want to continue building out the testing, you can add in the random object generator so that more cases can be covered.
