Elegant and readable randomness using Faker

Alberte Mozo
Published in Docplanner Tech
3 min read · Mar 1, 2022

As Fran Iglesias pointed out in a recent post, using random examples in your test suite is a great way of getting additional confidence from it, lowering the risk of writing code that only works for a tailored set of data.

I would like to add that this randomness also helps us to explicitly express that the values we are using are not important for the final outcome, but that our logic should work for a diversity of inputs.

However, depending on our tech stack and the kind of random examples we are generating, mocking up realistic data could result in verbose, low-level code that pollutes our test methods with technical gibberish.

Let’s see how to mitigate this problem with the help of Faker, a library originally written in Perl that has since been ported to many programming languages. Here we will use the PHP implementation by François Zaninotto.

The problem

Imagine a directory where we are going to store simple contact details.

Let’s say that our $id will be a Version 4 UUID, our $name will store the contact’s full name, our $email will hold an email address, and we will not enforce any specific format for $phone numbers.

Implementing the Object Mother pattern for this simple class could result in something like the following.
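A minimal sketch of such an Object Mother, using only PHP built-ins, might be the following. The `Contact` and `ContactMother` names, the constructor shape, and the use of PHP 8.1 are assumptions made for illustration.

```php
<?php

declare(strict_types=1);

// Hypothetical value object for one directory entry (names are assumptions).
final class Contact
{
    public function __construct(
        public readonly string $id,
        public readonly string $name,
        public readonly string $email,
        public readonly string $phone,
    ) {
    }
}

final class ContactMother
{
    public static function random(): Contact
    {
        // Build a Version 4 UUID by hand: 16 random bytes with the
        // version and variant bits forced to the values RFC 4122 requires.
        $bytes = random_bytes(16);
        $bytes[6] = chr((ord($bytes[6]) & 0x0f) | 0x40);
        $bytes[8] = chr((ord($bytes[8]) & 0x3f) | 0x80);
        $id = vsprintf('%s%s-%s-%s-%s-%s%s%s', str_split(bin2hex($bytes), 4));

        // "First" and "last" name are just 16 random hex characters each.
        $name = bin2hex(random_bytes(8)) . ' ' . bin2hex(random_bytes(8));

        // Same trick for the local part and the domain of the address.
        $email = bin2hex(random_bytes(16)) . '@' . bin2hex(random_bytes(16)) . '.com';

        // A nine-digit number, since no phone format is enforced.
        $phone = (string) random_int(100_000_000, 999_999_999);

        return new Contact($id, $name, $email, $phone);
    }
}
```

A test would then simply call `ContactMother::random()` wherever the concrete values do not matter.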

This will generate a set of values like this:

$id    = '207367d3-cf28-68f3-6bfc-7fae10dc02f7';
$name  = 'd50e7437e0527a57 aa1aa31e800232ea';
$email = '62f51c2cf76bf2b(...)@62f51c2cf76bf2b(...).com';
$phone = '106400039';

The random value generation could be improved, of course, but the general point is clear: to get values that are somewhat realistic, we are forced to introduce this kind of odd code into our method. It diverts our attention and adds unnecessary cognitive complexity in exchange for a result that, well… is far from optimal.

Furthermore, this generation logic can be tricky in some cases: it consumes time and is hard to debug and maintain. So we may end up giving up and just generating plain, meaningless random values that are hard to read and only add noise to our test output.

Faker to the rescue

This handy library can help us to create our random data by providing generators for almost any use case you can come across.

Let’s see what our random() method could look like.
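A sketch of that method is shown here, assuming the library is installed through Composer and that a hypothetical `Contact` class holds the four fields (it is included so the snippet runs on its own). The formatters `uuid()`, `name()`, `email()` and `e164PhoneNumber()` come from Faker's default providers. The class names are illustrative.

```php
<?php

declare(strict_types=1);

// Assumes Faker was installed with Composer in this directory.
require __DIR__ . '/vendor/autoload.php';

use Faker\Factory;

// Hypothetical value object for one directory entry (names are assumptions).
final class Contact
{
    public function __construct(
        public readonly string $id,
        public readonly string $name,
        public readonly string $email,
        public readonly string $phone,
    ) {
    }
}

final class ContactMother
{
    public static function random(): Contact
    {
        // The static factory returns a generator with the default providers.
        $faker = Factory::create();

        // Each line names the kind of value we want, not how to build it.
        return new Contact(
            $faker->uuid(),
            $faker->name(),
            $faker->email(),
            $faker->e164PhoneNumber(),
        );
    }
}
```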

Our values would now be like the following:

$id    = '7e57d004-2b97-0e7a-b45f-5387367791cd';
$name  = 'Dr. Zane Stroman';
$email = 'tkshlerin@collins.com';
$phone = '+27113456789';

The benefits should be obvious at first glance, but let’s enumerate some of them:

  • Our generation logic is now readable, concise and explicit.
  • The quality of the data is much better than before.
  • Generated values will match typical real ones.
  • The readability of possible outputs will be improved.
  • We have saved ourselves a significant amount of coding time.

Of course, there is always a trade-off, especially since we are adding an external dependency to a class that should be clean and simple. However, if a static factory breaks your architectural rules, you can always push this dependency behind an interface of your own and inject it into your Object Mothers, or directly into your test cases if you are not using that pattern.
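One way this could look is sketched below. The `RandomContactData` interface and the `FakerContactData` adapter are hypothetical names invented for this example. The only Faker API used is the `Faker\Generator` type and the formatters `uuid()`, `name()`, `email()` and `e164PhoneNumber()`.

```php
<?php

declare(strict_types=1);

// Hypothetical port owned by the test suite: it says nothing about Faker.
interface RandomContactData
{
    public function id(): string;

    public function name(): string;

    public function email(): string;

    public function phone(): string;
}

// Adapter: the only class that knows about Faker. Composer autoloading is
// assumed wherever this class gets instantiated.
final class FakerContactData implements RandomContactData
{
    public function __construct(private readonly \Faker\Generator $faker)
    {
    }

    public function id(): string
    {
        return $this->faker->uuid();
    }

    public function name(): string
    {
        return $this->faker->name();
    }

    public function email(): string
    {
        return $this->faker->email();
    }

    public function phone(): string
    {
        return $this->faker->e164PhoneNumber();
    }
}
```

An Object Mother or a test case would then receive a `RandomContactData` through its constructor, so swapping Faker for another source of values only touches the adapter.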

Not only for testing

Helping us write clean tests is not the only application of Faker. In fact, the project description states:

Faker is a PHP library that generates fake data for you. Whether you need to bootstrap your database, create good-looking XML documents, fill-in your persistence to stress test it, or anonymize data taken from a production service, Faker is for you.

This library probably shines the most when creating data fixtures for non-production development databases. Faker provides adapters for several ORMs and ODMs out of the box.

Among its features is the option to seed the generator with a known value in order to get a predictable set of values on every execution. This also opens the door to tracking the seed used for every test execution, making a run reproducible if something goes wrong.
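A small sketch of that idea follows, assuming a Composer installation. The `FAKER_SEED` environment variable is a hypothetical convention for this example, not something the library reads by itself. `seed()` is a method of the Faker generator.

```php
<?php

declare(strict_types=1);

// Assumes Faker was installed with Composer in this directory.
require __DIR__ . '/vendor/autoload.php';

use Faker\Factory;

// Reuse a seed passed through the (hypothetical) FAKER_SEED variable,
// or pick a fresh one for this run.
$fromEnv = getenv('FAKER_SEED');
$seed = $fromEnv !== false ? (int) $fromEnv : random_int(1, 999_999);

$faker = Factory::create();
$faker->seed($seed);

// Print the seed so a failing run can be repeated with the same data.
echo "Faker seed: {$seed}\n";

$first = $faker->name();

// Seeding again with the same value restarts the sequence,
// so the next name should match the first one.
$faker->seed($seed);
$second = $faker->name();
```

Running the suite again with `FAKER_SEED` set to the printed value should then reproduce the same generated values, as long as the same formatters are called in the same order.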

Apart from this, the library has been designed with extension in mind, allowing us to write our own generators for domain-specific data or to use one of the many third-party extensions that create almost anything, from images to beverage names.
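As an illustration, a custom generator can be written as a provider class and registered with `addProvider()`. The `SpecialtyProvider` class and its list of values are hypothetical, invented for this sketch. `Faker\Provider\Base`, `randomElement()` and `addProvider()` are part of the library.

```php
<?php

declare(strict_types=1);

// Assumes Faker was installed with Composer in this directory.
require __DIR__ . '/vendor/autoload.php';

use Faker\Factory;
use Faker\Provider\Base;

// Hypothetical provider for a domain-specific value: a medical specialty.
final class SpecialtyProvider extends Base
{
    private const SPECIALTIES = ['Cardiology', 'Dermatology', 'Neurology', 'Pediatrics'];

    // Every public method of a provider becomes a formatter on the generator.
    public function specialty(): string
    {
        return static::randomElement(self::SPECIALTIES);
    }
}

$faker = Factory::create();
$faker->addProvider(new SpecialtyProvider($faker));

// The new formatter is called like any built-in one.
$specialty = $faker->specialty();
```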

Definitely, it is worth a shot.
