Testing Phoenix and Elixir: Centralize Data Creation in the Factory.
Jeffrey Matthias, coauthor with Andrea Leopardi of Testing Elixir: Effective and Robust Testing for Elixir and its Ecosystem, recently gave a talk on building maintainable test factories that I had the pleasure of attending.
This article expands on a point originally discussed in my previous article: How to Build Maintainable Test Factories in Elixir and Phoenix.
What is a Factory?
Factories in Elixir are modules that handle inserting data into your database. They can also be responsible for generating any other test-related data. ExMachina is a popular library for creating factories, but you can also build your own from scratch.
This article assumes you have some knowledge of testing Phoenix and/or Elixir with a factory. If you want a brief overview of factories in Elixir, I wrote about factories and other methods of seeding data in Phoenix.
What is centralized data creation?
Centralized data creation means putting every function that generates data for your tests in your factory or factories, rather than scattering those functions across your codebase. All of your test data then comes from a single place.
Why centralize data creation?
By centralizing data creation in the factory, you provide a single, convenient interface where all test data is created. For example, if your tests need emails, names, addresses, or even booleans, each kind of value is generated in exactly one place in the factory rather than by several different functions spread across the codebase.
Your factory can expose public functions for generating data that can be used throughout the codebase. This way, if the rules for that data change, they only need to be updated in a single place.
Incidental Data vs. Intentional Data.
Intentional data is data whose value your tests rely on. You should explicitly pass intentional data into your factory function rather than relying on hidden values. Conversely, incidental data is data whose specific value does not matter, so long as it is valid. Consider randomizing incidental data so that your tests don’t rely on hidden assumptions and are more likely to catch unexpected side effects.
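As a rough sketch of the distinction, here is a hand-rolled helper in plain Elixir. The module `MyApp.TestData` and its functions are hypothetical names chosen for illustration, and `System.unique_integer/1` stands in for a real random data generator: it only guarantees that the value differs on every call.

```elixir
defmodule MyApp.TestData do
  # Hypothetical helper module, used here only for illustration.

  # Incidental data: any valid value will do, so produce a different one per call.
  def email, do: "user-#{System.unique_integer([:positive])}@example.com"
  def name, do: "User #{System.unique_integer([:positive])}"

  # Intentional data arrives through `overrides`, so the value a test
  # depends on is visible at the call site instead of hidden in a default.
  def build_user(overrides \\ %{}) do
    defaults = %{
      name: name(),
      email: email(),
      newsletter: Enum.random([true, false])
    }

    Map.merge(defaults, Map.new(overrides))
  end
end

# This test would assert on the email, so the email is passed in explicitly.
# The name and newsletter flag are incidental and left to the helper.
user = MyApp.TestData.build_user(email: "jane@example.com")
```

If a test starts failing only for some generated values, that is usually a sign that a value treated as incidental was in fact intentional and should be passed in explicitly.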
How do you centralize data creation?
If you are working on a Phoenix or Elixir project and using factories, you may have multiple places where you generate the same type of data differently.
Create a Public Function in the Factory.
Common data that most applications need includes emails, phone numbers, names, and addresses. You might have multiple ways in your application to generate these values.
For example, on my current project, we have multiple ways of creating emails in our factory.
Commonly, we use ExMachina’s sequence function to create a series of unique emails in order, each one numbered by an incrementing counter.
In the same project, we also sometimes use Faker for emails. Faker is a library that provides utility functions for generating fake data.
We also sometimes use static values for emails.
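To make the three approaches concrete, here is an illustrative sketch rather than our actual code. It assumes the ex_machina and faker packages are available as test dependencies, uses plain maps in place of our schemas, and the factory names (`user`, `admin`, `guest`) are made up for the example.

```elixir
defmodule MyApp.Factory do
  # Assumes ex_machina and faker are test dependencies.
  # Plain maps stand in for the project's real schemas.
  use ExMachina

  # 1. ExMachina's sequence: predictable emails numbered by a counter.
  def user_factory do
    %{name: "Jane Smith", email: sequence(:email, &"email-#{&1}@example.com")}
  end

  # 2. Faker: a random, realistic-looking email on every call.
  def admin_factory do
    %{name: "Ada Admin", email: Faker.Internet.email()}
  end

  # 3. A static value: every record built here shares the same email.
  def guest_factory do
    %{name: "Guest", email: "guest@example.com"}
  end
end
```

Each approach is valid on its own. The trouble is that three factories in the same module now disagree about what an email looks like.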
However, to centralize data generation, you ideally want a single function for generating each type of data, in this case an email.
To start, I added the following function to our factory. I chose Faker for emails instead of sequence because it provides more randomization.
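A minimal version of such a function might look like this, assuming a factory module named `MyApp.Factory` (a placeholder name) built on ExMachina, with faker installed as a test dependency:

```elixir
defmodule MyApp.Factory do
  # Assumes ex_machina and faker are test dependencies.
  use ExMachina

  # The single public entry point for generating an email in tests.
  # If the definition of a valid email changes, only this body changes.
  def email, do: Faker.Internet.email()
end
```

Because the function is public, tests can also call `MyApp.Factory.email()` directly when they need an email without building a whole record.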
Now, whenever we need to generate an email, we can use this function. Here’s the same user code from before, but using the new factory function.
Side note: I haven’t yet extracted the name value, only the email.
Now the way we create emails is consistent throughout the application. If our understanding of valid and invalid emails changes, then we only need to change the data generation function in a single place rather than many.
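Sketched under the same assumptions as before (ExMachina and Faker as test dependencies, a plain map in place of our user schema, and `MyApp.Factory` as a placeholder module name), the user factory could read:

```elixir
defmodule MyApp.Factory do
  # Assumes ex_machina and faker are test dependencies.
  use ExMachina

  # Central email generator shared by every factory in this module.
  def email, do: Faker.Internet.email()

  def user_factory do
    %{
      # Still a static value; only the email has been centralized so far.
      name: "Jane Smith",
      # Incidental by default; a test that cares can override it.
      email: email()
    }
  end
end
```

A test that depends on a particular email can still pass it in explicitly, for example `MyApp.Factory.build(:user, email: "known@example.com")`.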
Given this example with an email, you could do the same for any common data you use in your application.
If you have multiple factories in your application, you might consider extracting your data generation functions into a single factory template module so that your factories can share that code.
We are currently using a single factory in our application, but as we grow, we will split our large factory into smaller ones.
If you find yourself in that position, a factory template module that holds your common data generation functions gives each of the separate factory modules one shared source for them.
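One possible shape for such a template, sketched in plain Elixir with no library dependencies: all module and function names below are hypothetical, and the email body uses `System.unique_integer/1` as a stand-in where a real project might call Faker.

```elixir
defmodule MyApp.FactoryTemplate do
  # Hypothetical shared module holding the common data generators.

  # `use MyApp.FactoryTemplate` imports the generators into a factory module.
  defmacro __using__(_opts) do
    quote do
      import MyApp.FactoryTemplate
    end
  end

  # Stand-in generator; a real project might delegate to Faker here.
  def email, do: "user-#{System.unique_integer([:positive])}@example.com"
end

defmodule MyApp.AccountsFactory do
  use MyApp.FactoryTemplate

  def build_user(overrides \\ %{}) do
    Map.merge(%{email: email()}, Map.new(overrides))
  end
end

defmodule MyApp.BillingFactory do
  use MyApp.FactoryTemplate

  def build_invoice(overrides \\ %{}) do
    Map.merge(%{billing_email: email(), amount_cents: 1_000}, Map.new(overrides))
  end
end
```

Both factories now generate emails through the same function, so a change to the email rules still happens in one place even though the factories live in separate modules.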
I want to practice what I learned from Jeffrey’s talk, so I plan to start by centralizing data creation. From there, the next step is to randomize incidental data. We still have plenty of places where we use static data instead of random data.
I hope you learned something from my experience trying to centralize data generation in my current project! I also hope to share more of what I learned about writing maintainable factories for Elixir applications in the future.