Mix-and-match websites using Rails engines.

Agnieszka Figiel · Vizzuality Blog · Apr 30, 2019

Climate Watch, the open data platform which brings together dozens of datasets, offers powerful insights from global data on climate change. However, what it fails to convey is progress on specific national commitments which are not captured in the global view.

In order to be able to demonstrate the full breadth of the activities they undertake, we set out to build national “Climate Watches” for two pilot countries—Indonesia and South Africa. The expectation was that those platforms would be visually and structurally very similar to the global platform, and we could reuse much of what had already been developed for Climate Watch to fast-track development. Take a chart from here, add a new one, copy an entire section and put it alongside new content — sounds like an exercise in designing your own modular furniture. Except Climate Watch was never built with this kind of mix-and-match approach in mind, so the first task that we had to tackle was to decide how to retrofit this ability.

We had ideas about extracting reusable components from the frontend code. This text deals with our approach for reusing functionality from the backend, which is a monolithic Rails application handling the API, admin panel and data importing logic.

Lombok, Indonesia. Photo by Atilla Taskiran on Unsplash.

Extracting existing code for reuse.

The first decision we made was that we wouldn’t keep a single codebase for the global platform and the national ones. It would not be possible or practical to maintain a single generic, configurable application, where a user could set up a new instance and adjust it by turning modules on and off. We were going to end up with several separate applications, but we wanted to avoid duplicating any common code. Therefore, in order to take advantage of the code already developed for the global platform, we had to find a way to extract it and make it available to the country platforms.

We analysed the architecture of our Rails application to see how much coupling was in place and how difficult it would be to extract the common parts. It soon turned out that there were some clear lines of separation between parts of the code focused on particular subsets of the data. For example, if you navigate to a particular country page, you can see tabs with historical emissions or NDCs (Nationally Determined Contributions) for that country, which are examples of the datasets from which Climate Watch data is composed. The same organisation by dataset is present throughout the code base of the Rails application and its data structures.

The following diagram shows how data flows through this system.

Data is ingested via an admin panel, which is a typical ActiveAdmin interface. It is divided into sections, each of which allows you to upload files for a particular dataset. The fields on the upload form and the respective validation logic are dataset-specific. Positively validated files land in an S3 bucket, organised in directories named after the datasets they belong to.

Next, data is read from S3 and processed by importers. This happens asynchronously and in the background, as some of the files take a long time to process. The importer classes are namespaced by dataset, for example HistoricalEmissions:: or NdcContent::. They take care of parsing files, validating data and storing it in the database.
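To give an idea of the shape of such an importer, here is a minimal, self-contained sketch. The class and column names are illustrative, not Climate Watch’s actual code, and the real importers persist ActiveRecord objects rather than collecting hashes:

```ruby
require "csv"

module HistoricalEmissions
  # Illustrative importer: parses a CSV of emissions records,
  # validates each row and collects the valid ones for persistence.
  class ImportHistoricalEmissions
    attr_reader :records, :errors

    def initialize(csv_content)
      @csv_content = csv_content
      @records = []
      @errors = []
    end

    def call
      # with_index(2): line 1 is the header, so data starts on line 2
      CSV.parse(@csv_content, headers: true).each.with_index(2) do |row, line|
        if row["iso_code3"].nil? || row["value"].nil?
          @errors << "Line #{line}: missing iso_code3 or value"
        else
          # The real application would build ActiveRecord objects here.
          @records << { iso_code3: row["iso_code3"], value: row["value"].to_f }
        end
      end
      self
    end
  end
end

importer = HistoricalEmissions::ImportHistoricalEmissions
  .new("iso_code3,value\nIDN,123.4\n,56.7\n")
  .call
# importer.records holds the valid row; importer.errors reports line 3
```

Keeping parsing and validation in a plain Ruby object like this is also what makes the importer easy to lift out of the host application later.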

Data access and persistence is managed by ActiveRecord models, also namespaced by dataset. The underlying database tables have their names prefixed by dataset name, for example historical_emissions or ndc_content (which works well with the namespaces on the models). There are no cross-references between tables across datasets, only within a dataset (with the exception of the locations table, which is a globally shared dictionary of countries).
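Rails supports this convention directly: a namespace module can declare a table_name_prefix, and every model inside the namespace then maps to a prefixed table automatically. A small sketch (module and table names illustrative):

```ruby
# Declaring a table_name_prefix on the namespace module means
# ActiveRecord derives prefixed table names for all models inside it,
# e.g. HistoricalEmissions::Record -> "historical_emissions_records".
module HistoricalEmissions
  def self.table_name_prefix
    "historical_emissions_"
  end
end

# Pure-Ruby demonstration of the derived name (ActiveRecord performs
# this prefix lookup internally when computing a model's table name):
table = HistoricalEmissions.table_name_prefix + "records"
# table == "historical_emissions_records"
```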

Finally, the web API reads data from the database and serves it to the frontend, formatted in a way to easily populate the various visualisations on the website. The routes to the endpoints are namespaced by dataset here as well.
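In config/routes.rb, that dataset namespacing looks roughly like the following (endpoint and resource names are illustrative, not the platform’s actual routes):

```ruby
# config/routes.rb (illustrative)
Rails.application.routes.draw do
  namespace :api do
    namespace :v1 do
      namespace :historical_emissions do
        resources :records, only: [:index]
      end
      namespace :ndc_content do
        resources :ndcs, only: [:index]
      end
    end
  end
end
```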

I used colours to show how we could easily slice through each layer of the application by dataset. It would be easy not only because everything is consistently namespaced, but also because the layers are naturally decoupled. For example, tables and ActiveRecord models are cross-linked only within a dataset, not across datasets (except for the special case of the locations table). Data importers are independent of each other, as each implements a specific dataset’s logic. Finally, the web API was designed to align with sections of the website, with separate endpoints for each of them.

Interestingly, when we introduced this organisation at the beginning of the project, the potential for splitting the code in this way was not our objective. Rather, we envisaged we would end up with lots of models, many of which would share a name but mean different things in different datasets (for example “sectors”), so we established namespacing at the database object level and applied it throughout.

Packaging code.

It looked like it would be possible and relatively easy to vertically slice the monolithic Rails application. If only there was a way to package such a slice into some kind of pluggable building block, that could be easily applied to other applications! That, in essence, is exactly what Rails engines do.

Rails engines are effectively gems, the standard for packaging code in Ruby. That means they can be shared easily across Rails applications, using the same mechanisms we use for managing other dependencies, such as bundler. But they are also miniature Rails applications in themselves, so they can contain code from all layers of a Rails application, which attaches itself almost seamlessly to the host application. A well-known example of a Rails engine is the authentication solution Devise. With a few lines of code it adds models, controllers and views to fully handle authentication, with numerous customisation options.
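A dataset slice packaged as an engine needs little more than an Engine class. A minimal skeleton (names illustrative) looks like this:

```ruby
# lib/historical_emissions/engine.rb (illustrative)
module HistoricalEmissions
  class Engine < ::Rails::Engine
    # Keep the engine's models, controllers and routes inside the
    # HistoricalEmissions namespace, so they don't clash with the host app.
    isolate_namespace HistoricalEmissions
  end
end
```

A host application then attaches the engine’s routes by mounting it, e.g. `mount HistoricalEmissions::Engine => "/api/v1/historical_emissions"` in its config/routes.rb.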

Extracting code into engines from Climate Watch turned out to be conceptually very easy, mostly because code was already namespaced throughout. Therefore, every extracted namespace could very easily become an engine. There were a few obstacles to tackle though.

First, the API controllers—like in many Rails applications—relied on functionality from the base controller, such as setting HTTP headers or providing customised error responses. There was not a lot of that shared functionality, but it was important to keep it consistent across all platforms without copying and pasting. Ideally, there would be only one version of the ApiController across all the applications and all the engines, so the first instinct was to create a separate engine containing the base controller, which would then be a dependency of all applications and engines. However, that didn’t feel right: inheritance comes with a lot of risks for future maintainability, and having the base class of an inheritance tree live outside the application would open the door to technical debt. Therefore, we decided against a single base class and instead packaged the shared bits of functionality into modules, which the base controllers need to include. Those modules ended up in a separate engine, alongside other utilities (such as logging or S3 access).

module Api
  module V1
    class ApiController < ActionController::API
      include ActionController::MimeResponds
      include ::ClimateWatchEngine::Cors
      include ::ClimateWatchEngine::ExceptionResponses
      include ::ClimateWatchEngine::Caching
    end
  end
end

Next, we needed to deal separately with the only shared database table / ActiveRecord model, which represents locations. Conceptually it is also a separate dataset, with its own dedicated admin panel section, importer and API endpoints. Therefore, it too was packaged as an engine. The difference from the other datasets is that this one is referenced not only from host applications, but also from all the other engines which depend on it. Things worked mostly as expected when running the applications, but all hell broke loose when running the automated tests. We use rspec with FactoryBot for tests. In general, there are a few surprising things to remember when writing specs for engines, such as having to explicitly point controller specs at the engine’s routes. Those are covered in the documentation. The bit we got stuck on was using factories from the Locations engine in other engines’ specs. We ended up loading them explicitly in the rails_helper.rb of the host engine:

# Add additional requires below this line. Rails is not loaded until this point!
Dir[Locations::Engine.root.join('spec', 'factories', '*.rb')].each { |f| require f }
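For the controller specs themselves, the extra step is telling rspec-rails which route set to use, via its `routes` helper. Roughly like this (the controller and action names here are illustrative):

```ruby
RSpec.describe Locations::Api::V1::LocationsController, type: :controller do
  # Without this, the spec looks up routes in the dummy/host app
  # and fails to find the engine's endpoints.
  routes { Locations::Engine.routes }

  describe "GET index" do
    it "responds successfully" do
      get :index
      expect(response).to have_http_status(:ok)
    end
  end
end
```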

Finally, we struggled with how to publish the engine gems. We didn’t want to publish them to RubyGems.org, so instead we used bundler’s ability to load gems directly from git repositories. The natural way to do that would be to keep each engine in its own git repository, but that was something we wanted to avoid: we originally thought we would have many such engines, and we didn’t want to end up with an unmanageable number of repositories. We went for a single repository with multiple .gemspec files, one per engine. In the end this turned out to be a false economy, as we didn’t end up with as many engines as we had envisaged, but we did end up with a lot of problems versioning the gems!
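With all the engines in one repository, a host application’s Gemfile can pull each gem from the same git URL and use bundler’s glob option to locate the right .gemspec inside it. A sketch (repository URL and directory names illustrative):

```ruby
# Gemfile of a host application (URL and paths illustrative)
gem "climate-watch-engine",
    git: "https://github.com/Vizzuality/cw-engines.git",
    glob: "climate-watch-engine/*.gemspec"

gem "locations",
    git: "https://github.com/Vizzuality/cw-engines.git",
    glob: "locations/*.gemspec"
```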

Not quite IKEA, but still a powerful addition to our toolbox.

As we got familiar with engines and progressed with our work on the platforms, we spotted an opportunity to cut down on code duplication in the admin panel. The ActiveAdmin controllers (which handle file uploads) were extremely repetitive, and we realised we could generate them instead of writing them by hand, allowing us to add new ones very easily as new datasets are integrated. The generator was built as a Rails engine and successfully applied to the global platform and the national platforms.
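The real generator isn’t shown here, but the idea can be sketched: instead of hand-writing one ActiveAdmin section per dataset, drive the registrations from a small declarative description. All names in this sketch are hypothetical:

```ruby
# Hypothetical sketch: generating repetitive ActiveAdmin upload sections
# from a declarative list of datasets, rather than hand-writing each one.
DATASETS = [
  { name: "Historical Emissions", files: ["emissions.csv"] },
  { name: "NDC Content", files: ["ndc_texts.csv"] }
]

DATASETS.each do |dataset|
  ActiveAdmin.register_page dataset[:name] do
    page_action :upload, method: :post do
      # dataset-specific validation, then upload to S3 ...
    end
  end
end
```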

Even though in the end we didn’t manage to extract as many reusable “slices” of backend code as we had hoped, Rails engines proved very useful for decomposing the original Rails application for reuse. Given this was the first project in which we applied engines, they proved relatively easy to work with, mostly because the documentation is excellent. The fact that in the course of the project we found another application for engines, building a configurable ActiveAdmin back office generator, proves how flexible the solution is.

The downside of extracting code into engines is having to manage versions of the engines and the gems they require as the host applications evolve. For example, upgrading core dependencies such as Ruby or Rails needs to be applied not only to the applications, but to the engines as well. Dependency management is, however, not a problem specific to engines, but one of maintaining Ruby gems in general, something we had no prior experience with.

The end result? Have a look at the country platforms for Indonesia and South Africa!

Agnieszka is a Senior Developer at Vizzuality with magic powers in Ruby on Rails and PostgreSQL. She takes an iterative approach to writing code: writing it, deleting it, then replacing it with better code until she’s satisfied it’s up to scratch. She’ll help you do it too if you ask.
