A Perl toolchain for building micro-services at scale

Published in Engineering@Semantics3, Jun 15, 2016

But why?!

This is the question that we get asked immediately when we tell someone that we use Perl. Our extensive use of Perl to build many of our internal services often comes as a surprise, and we can understand why. Perl is a dinosaur among mainstream programming languages. It lacks the glamour of other, relatively younger languages. There is also a common misconception in the programming world that modern software engineering practices cannot be followed with a language like Perl. In this post, we hope to debunk that myth. We want to give you a glimpse of the developer experience here at Semantics3, where we write a lot of Perl code but still manage to employ the latest engineering best practices. We are able to do so with the help of a toolchain written entirely in Perl.

The Burden of a Monolith

We must admit that our development process wasn’t always the way that it is today. We started off with a single code-base like most other startups. We used system Perl (the version that ships with the OS) and installed dependent modules globally. We were aware of the pitfalls of this approach but found the convenience that it offered, in helping us ship fast in the early days of the company, very valuable.

However, best practices are best practices for a reason. Two years in, in 2014, we had a few problems that we had brushed aside earlier but which were now staring us in the face:

Deploying code across our fleet of machines had become very slow. This was largely because of the tight coupling between our code and the environment in which it was running. We were also becoming very conservative about upgrading dependent modules because we didn’t want existing scripts and services to break. This was made worse by the fact that, for any given module (as it was installed globally), we didn’t know who its dependents were.
It was getting difficult to on-board new engineers quickly. To be fair, our monolithic codebase was fairly modular, and it was not that hard for people who were familiar with it (we were a 5-person engineering unit then) to build on or extend it. However, it was intimidating for new team members. It was hard to work on even a small part of the codebase, as it was not possible to run a script or service in isolation. New engineers were forced to use our shared development environment, which could take some time to get used to.
We were wary of software erosion: while our services and scripts were working at that point, we couldn’t rule out that a future change in the environment (and not the code!) could, directly or indirectly, bring down critical services, which we couldn’t afford. While tests let you catch breakage like this, having to work on a fix at an arbitrary point in time is a cost that you may not want to incur.

The whole system was becoming untenable. As a result, we often found ourselves fire-fighting and unable to add new features to our product at the rate we would have liked.

Moving to Micro-services

We were fully sold on the promise of a service-oriented architecture. Its core philosophy resonated with us, considering the challenges that we were facing then. We decided to make the switch, taking on a massive refactor of our codebase in the process.

The Developer Experience (DX)

We started by defining the developer experience that we wanted to achieve. We wanted a workflow that was tuned for micro-services and capable of scaling up with the team. This is what we came up with:

Each code artefact (a library, script or service) should be owned by one engineer. It should do one thing and one thing only (Unix philosophy).
Each code artefact must be checked in to its own VCS repository and must have tagged releases with semantic versioning.
The experience of authoring a code artefact must be both seamless for an individual developer and uniform across the team with tooling present to take care of scaffolding boilerplate code, identifying dependencies, packaging, testing and releasing. The tooling available should enable a single engineer to own multiple code artefacts easily.
Private libraries that we write must be installable from services in exactly the same way that third-party libraries are.
Each code artefact must declare its dependencies (code and environment) explicitly i.e. clean contracts. Ideally, any developer must be able to run any script/service on their development environment easily.
An engineer’s development environment must be as isolated as possible so that we don’t step on each other’s toes.

We knew that before we embarked on the refactor, we had to find the right tools to support this experience that we envisioned. We followed Larry Wall’s advice and looked for solutions in Perl.

In general, if you think something isn’t in Perl, try it out, because it usually is — Larry Wall

And right he was! We were pleasantly surprised to find tools in the Perl world that helped us solve each of our use-cases above. Below we cover each of these tools, restating the problem first and then explaining how the tool solves it.

Pinto

Problem: Private libraries that we write must be installable from services in exactly the same way that third-party libraries are.

Public Perl modules are housed on CPAN. Pinto is a package repository server for Perl that we have deployed locally. It lets us have our own private CPAN. Instead of starting off as a CPAN mirror though, it starts off empty and pulls the packages that it doesn’t have cached locally (when requested) from CPAN. It is compatible with cpanm. We fondly call our Pinto server the Darkpan. We push our private packages to the Darkpan using the pinto client.
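As a rough sketch of how a Darkpan like this might be used day to day, the wrappers below call the pinto and cpanm command-line clients. The repository URL and the function names are placeholders invented for illustration; they are not our actual setup.

```shell
# Placeholder address of the Pinto server (assumption; substitute your own).
DARKPAN_URL="http://darkpan.invalid:3111"

# Push a private release tarball to the Darkpan.
# usage: darkpan_add path/to/My-Module-0.01.tar.gz
darkpan_add() {
    pinto --root "$DARKPAN_URL" add --message "Add $(basename "$1")" "$1"
}

# Cache a public CPAN module in the Darkpan.
# usage: darkpan_pull Mojolicious
darkpan_pull() {
    pinto --root "$DARKPAN_URL" pull --message "Pull $1" "$1"
}

# Install modules using only the Darkpan as the package index.
# usage: darkpan_install My::Module
darkpan_install() {
    cpanm --mirror "$DARKPAN_URL" --mirror-only "$@"
}
```

Passing --mirror-only to cpanm is what keeps an install from silently falling back to the public CPAN index.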

Carton

Problem: Each code artefact must declare its dependencies (code and environment) explicitly i.e. clean contracts. Ideally, any developer must be able to run any script/service on their development environment with minimum setup overhead.

Similar to Ruby’s bundler, Python’s pip and Node’s npm, Carton is a dependency manager for Perl. It lets us include a dependency manifest called a cpanfile in each of our code artefacts that describes the libraries that the artefact depends on (both private and public). Dependencies are installed locally with carton install, and the execution helper carton exec runs services and scripts in the context of the installed libraries (it basically adds the path of the locally installed dependencies to Perl’s list of includes).

As Carton is powered by cpanm under the hood, it is straightforward to hook it up with our Darkpan. We just set the PERL_CARTON_MIRROR environment variable.
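To make this concrete, here is a minimal sketch of a cpanfile and the Carton commands that consume it. The module names, version constraints and mirror URL are assumptions chosen for illustration (Sem3::Example::Client is a made-up private module), so treat this as a template rather than a copy of our configuration.

```shell
# Throwaway project directory for the demonstration.
project_dir=$(mktemp -d)

# A cpanfile lists public and private dependencies side by side.
cat > "$project_dir/cpanfile" <<'EOF'
requires 'Mojolicious', '>= 6.0';    # public module from CPAN
requires 'Sem3::Example::Client';    # hypothetical private module on the Darkpan

on 'test' => sub {
    requires 'Test::More', '>= 0.98';
};
EOF

# Point Carton at the private repository (placeholder URL, an assumption).
export PERL_CARTON_MIRROR="http://darkpan.invalid:3111"

# Install into ./local and run a command against those libraries,
# but only if Carton is actually available on this machine.
if command -v carton >/dev/null 2>&1; then
    (cd "$project_dir" && carton install &&
        carton exec perl -MMojolicious -e 'print "dependencies loaded\n"')
else
    echo "carton not installed; wrote $project_dir/cpanfile only"
fi
```

With a real mirror and real module names, carton install also writes a cpanfile.snapshot, which can be committed so that every machine resolves the same versions.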

Minilla

Problem: The experience of authoring a code artefact must be seamless for a developer and uniform across the team with tooling present to take care of scaffolding boilerplate, identifying dependencies from code, packaging, testing & releasing. The tooling should enable a single engineer to own multiple code artefacts easily.

Minilla initialises a git repository and generates the base directory structure and files required for a new code artefact. It also allows for running tests, preparing a release and maintaining a change-log automatically.
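The corresponding Minilla commands can be sketched as thin shell wrappers around its minil client. The function names are hypothetical and exist only to label each step.

```shell
# Scaffold a new artefact: creates ./<name> with lib/, t/, Changes,
# cpanfile and minil.toml, and initialises a git repository in it.
# usage: scaffold_artefact My-Module
scaffold_artefact() {
    minil new "$1"
}

# Run the test suite of the artefact in the current directory.
test_artefact() {
    minil test
}

# Bump the version, update Changes, tag the release and upload it.
release_artefact() {
    minil release
}
```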

Perlbrew

Problem: The development environments of each of the engineers on the team must be as isolated as possible so that we don’t step on each other’s toes.

We removed our dependence on system Perl with perlbrew. Perlbrew installs Perl in userland and also allows switching between multiple Perl versions easily.

Although we still have a shared development environment, each of us has a separate perlbrew installation. Each of us can now install a module globally within that installation without inadvertently upgrading it for someone else who is already using it.

With perlbrew and carton, we have good-enough isolation for running our scripts and services in development. In a later post, we will talk about how Docker gives us the run-time isolation that we need in production.

Perlbrew also allows us to easily test a script or service against a newer version of Perl, giving us the option to upgrade should we need to.
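Trying a project against a newer Perl could look roughly like the sketch below. The version string and the function names are assumptions for illustration, and the test command assumes the project keeps its tests under t/.

```shell
# Example version only (assumption); pick whichever release you want to try.
CANDIDATE_PERL="perl-5.24.0"

# Build the candidate Perl inside the user's own perlbrew root,
# leaving system Perl untouched.
install_candidate_perl() {
    perlbrew install "$CANDIDATE_PERL"
}

# Run the project's test suite under the candidate Perl only.
# usage (from the project root): test_with_candidate_perl
test_with_candidate_perl() {
    perlbrew exec --with "$CANDIDATE_PERL" prove -l t
}
```

Because perlbrew exec scopes the command to the named installation, the Perl that the rest of the shell session uses is left unchanged.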

Github

Problem: Each code artefact must be checked in to its own VCS repository and must have tagged releases with semantic versioning.

We were already using Github at that time and were loving it. Each code artefact having its own repository, however, meant that we would have a ton of private repositories, which Github’s then pricing model didn’t support out-of-the-box: it was so expensive that it did not make sense. So, we spoke to them and switched to their per-seat pricing structure, where we pay per-user instead of per-repository (Github has since changed their pricing model). Once we had done that, we were free to create as many repositories as we wanted. As Ramanan Balakrishnan pointed out in an earlier post, this was and continues to be a power that our engineers love to wield.

Git makes tagging releases easy, and we sometimes use Github Releases to notify team-mates about major releases.
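For readers unfamiliar with release tags, the sketch below creates an annotated semantic-version tag in a throwaway repository. The repository, identity and version number are made up for the demonstration.

```shell
# Throwaway repository for the demonstration.
repo_dir=$(mktemp -d)
git init -q "$repo_dir"
git -C "$repo_dir" config user.name "Example Dev"
git -C "$repo_dir" config user.email "dev@example.com"
git -C "$repo_dir" commit -q --allow-empty -m "Initial commit"

# Mark the current commit as release 1.2.0 with an annotated tag.
release_tag="v1.2.0"
git -C "$repo_dir" tag -a "$release_tag" -m "Release 1.2.0"

# List the release tags known to the repository.
git -C "$repo_dir" tag -l 'v*'

# With a remote configured, the tag would be published with:
#   git push origin v1.2.0
```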

Mojolicious (Special Mention)

Mojolicious is an asynchronous, real-time web framework that is very pleasant to work with. It provides an idiomatic, easy-to-use API to build a light-weight web service quickly. We use it for most of our services. It also has a rich ecosystem of plugins.

sem3 build

While these tools fit our needs nicely, we felt it would be better to shield our engineers from having to work with them directly. We achieved this by building a command line utility called sem3 build that wraps around these tools.

This is what our development workflow looks like today with sem3 build:

Scaffold a new project:
$ sem3 build new MyModule|my-service|my-script
Install a new dependency:
$ sem3 build install YourModule
Try to identify dependencies from the code and save them to the project’s cpanfile:
$ sem3 build deps save
Commit and push intermediate changes repeatedly:
$ git commit -m "Something changed" && git push origin HEAD
Run tests:
$ sem3 build test
Bump version, tag and push a new release to both Github and Darkpan. The switches indicate how the version must be incremented. -x indicates a major version change, -y indicates a minor version change and -z indicates a patch version change:
$ sem3 build -x | -y | -z | -noversionbump
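To illustrate what the three switches mean under semantic versioning, here is a small self-contained sketch. The bump_version function is hypothetical and is not how sem3 build is implemented; it only shows the arithmetic that each switch implies.

```shell
# usage: bump_version <major.minor.patch> <-x|-y|-z>
bump_version() {
    version=$1
    flag=$2

    # Split "major.minor.patch" into its three numeric parts.
    major=${version%%.*}
    rest=${version#*.}
    minor=${rest%%.*}
    patch=${rest#*.}

    case $flag in
        -x) major=$((major + 1)); minor=0; patch=0 ;;  # major change
        -y) minor=$((minor + 1)); patch=0 ;;           # minor change
        -z) patch=$((patch + 1)) ;;                    # patch change
        *)  echo "unknown switch: $flag" >&2; return 1 ;;
    esac

    echo "$major.$minor.$patch"
}

bump_version 1.4.2 -x   # prints 2.0.0
bump_version 1.4.2 -y   # prints 1.5.0
bump_version 1.4.2 -z   # prints 1.4.3
```

Resetting the lower-order parts to zero on a major or minor bump is what keeps the resulting versions ordered in the way dependents expect.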

Summary

Pinto, Carton, Minilla, Perlbrew and Mojolicious together form a nifty Perl toolkit for a seamless micro-services development experience. Our engineering team has quadrupled in size over the last two years, and together we manage over a hundred micro-services in production, yet this toolchain continues to serve us well. We hope you will find it useful in your Perl adventures too!

The world has become a larger place. The universe has been expanding, and Perl’s been expanding along with the universe — Larry Wall