Open-Source in the City Modelling Lab

Why and how we consume and publish open-source software

Michael Fitzmaurice
Arup’s City Modelling Lab
Sep 21, 2022

In the City Modelling Lab, we’re big fans of open-source software.

MATSim, the tool at the core of our Agent-Based Models, is an open-source project released under the GNU GPL. We use a great many open-source frameworks and libraries for Python and Java, we run Ubuntu on our cloud virtual machines, and we often create our own Docker images by extending open-source images from Docker Hub.

It’s fair to say that open-source software is essential to our work.

Alpha Stock Images — http://alphastockimages.com

Why use open-source software?

We derive several practical benefits from using open-source software:

  • The software we depend on is extensively peer-reviewed and road-tested: MATSim has 300+ forks and almost 100 contributors, and pytest, for example, is used by well over half a million other public projects on GitHub
  • We can dig into the code to further our understanding of the problem space
  • We can make our own changes to the software
  • We save money on license fees (although this must be offset against dev costs)

Why publish open-source software?

We don’t just want to be consumers of open-source software; we also strive to contribute to the community wherever possible. These contributions take several forms. For example, if we find a bug in an open-source tool we’re using, we try to contribute a fix. But we also release our own software under open-source licenses wherever possible. We do this for both philosophical and practical reasons.

Philosophically, since we derive so much value from the open-source efforts of other people, we feel it’s only right that we try to repay that in some small way. Plus, we recognise that the solutions to many of the big problems of our age (climate change and inequality, to name two that our work touches) require collaboration across multiple domains.

We believe collaboration is most effective when it is open and transparent and that credibility, accountability and transparency are all vital as we use ever more complex and powerful models to make decisions in the public space. With more power comes more responsibility…

Practically, releasing our code publicly under an open-source license does more than give back to the community; we also gain some useful things:

  • Our clients appreciate the transparency and lack of potential vendor lock-in
  • We invite community peer review; more eyes on the code is a good thing
  • We are forced to think like users of our own software rather than just developers; this user-focused perspective encourages some good practices that can otherwise be neglected
  • We create the potential to recruit new collaborators, giving us access to talent, expertise and input from far and wide
  • We raise our profile in the transport modelling and city planning communities, helping us to build relationships with like-minded people doing similar work

Releasing our software

We want to give our users the best experience possible in using our software. We aim to minimise the time and effort required for a new user to understand, install and use our tools.

To this end, we follow a process based on an agile software lifecycle adapted from the UK Government Digital Service Agile Delivery Manual. As a software component moves through the lifecycle phases, from Discovery to Live, it must tick more boxes in our checklist. By the time we publicly release a repo, it adheres to all of our best practices.

Our software lifecycle/quality gates checklist

The upshot is that our GitHub repos all look very similar in some ways:

  • We believe in good documentation, so you will always find a comprehensive README.md file in the project’s root (see this example from PAM)
  • For libraries, we like executable tutorials that teach our users how to use the API, as you can see in PAM and GeNet
  • We make it clear how our users can contribute via a CONTRIBUTING.md file and a clearly defined code of conduct, as evident in GeNet
  • We make the functional correctness of our code verifiable through a robust suite of unit tests, plus we often include further automated tests such as these smoke tests for GeNet’s Jupyter notebook examples
  • We like the permissive open-source MIT license, which you will find front and centre in LICENSE.md, making all conditions and limitations of usage crystal clear

PAM’s MIT Licence

Automated smoke tests for PAM’s Jupyter notebooks
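
To give a flavour of what a notebook smoke test involves, here is a minimal sketch that uses only the Python standard library. It is not the test code from our repos: the function names `code_cells` and `smoke_test_notebook` are made up for illustration, and a real suite would more likely lean on a dedicated notebook execution tool. The sketch assumes notebooks contain plain Python, so a cell using IPython magics would simply be reported as a failure.

```python
import json
from pathlib import Path


def code_cells(notebook_path):
    """Return the source of every code cell in a .ipynb file, in order."""
    notebook = json.loads(Path(notebook_path).read_text(encoding="utf-8"))
    sources = []
    for cell in notebook.get("cells", []):
        if cell.get("cell_type") == "code":
            source = cell.get("source", "")
            # The notebook format stores source as a string or a list of lines.
            sources.append("".join(source) if isinstance(source, list) else source)
    return sources


def smoke_test_notebook(notebook_path):
    """Run all code cells in one shared namespace.

    Returns (True, "") if every cell ran without raising, otherwise
    (False, message) describing the first cell that failed.
    """
    namespace = {"__name__": "__main__"}
    for index, source in enumerate(code_cells(notebook_path)):
        try:
            compiled = compile(source, f"{notebook_path}[cell {index}]", "exec")
            exec(compiled, namespace)
        except Exception as error:
            return False, f"cell {index}: {type(error).__name__}: {error}"
    return True, ""
```

A smoke test like this only checks that the examples still run from top to bottom; it says nothing about whether their output is correct, which is what the unit tests are for.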

Our open-source portfolio

We tag all of our GitHub repos with the CML topic so you can find them all from the URL https://github.com/search?q=org:arup-group+topic:cml&type=repositories and go from there.
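
If you would rather list the repos from a script, the same query works against GitHub's public REST search endpoint. The sketch below is illustrative only: the helper names are hypothetical, `list_repos` needs network access, and unauthenticated requests are subject to GitHub's rate limits.

```python
import json
import urllib.parse
import urllib.request


def topic_query(org, topic):
    """Build the GitHub search query for repos in `org` tagged with `topic`."""
    return f"org:{org} topic:{topic}"


def web_search_url(org, topic):
    """URL of the GitHub search page for the query, as linked above."""
    params = {"q": topic_query(org, topic), "type": "repositories"}
    # Keep ':' readable in the query string; spaces are encoded as '+'.
    return "https://github.com/search?" + urllib.parse.urlencode(params, safe=":")


def list_repos(org, topic):
    """Fetch matching repo names from the REST search API (needs network)."""
    params = urllib.parse.urlencode({"q": topic_query(org, topic), "per_page": 100})
    url = "https://api.github.com/search/repositories?" + params
    request = urllib.request.Request(
        url, headers={"Accept": "application/vnd.github+json"}
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        payload = json.load(response)
    return [item["full_name"] for item in payload["items"]]
```

Calling `web_search_url("arup-group", "cml")` reproduces the link above, and `list_repos("arup-group", "cml")` should return the matching repo names.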

All of CML’s open-source projects on GitHub

Everything we have released so far is related to our work with Agent-Based Models, one way or another. A quick summary:

  • PAM: Tools for programmatically generating and modifying transport demand scenarios
  • GeNet: Tools for programmatically manipulating graph-based representations of public transit networks, such as the ones used as input to MATSim
  • OSMOX: A command line utility for extracting facility locations and features from OpenStreetMap (OSM) data
  • Elara: A command line utility for analysing large, semi-structured MATSim output data, and calibrating/validating models
  • MC: A command line utility to make MATSim configuration less painful
  • Londinium: A semi-synthetic MATSim dataset for creating models that are realistic, but still just about small enough to run in reasonable time on a typical developer laptop

Coming soon…

We’re certainly a bunch of ABM-heads, but not everything we build is ABM-related.

An example of a generically useful piece of software we created is an AWS Lambda application called AEMon that turns events in an AWS account into notifications in Slack, our internal messaging and collaboration system of choice. We’ve been using AEMon to good effect across a number of Arup teams for a couple of years, and it already ticks all the necessary boxes in our quality checklist. We will be releasing it publicly soon.
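
To make the idea concrete, here is a rough sketch of the general pattern: a Lambda handler that receives an event and posts a summary to a Slack incoming webhook. This is not AEMon's code. The function names, the `SLACK_WEBHOOK_URL` environment variable and the message layout are all assumptions made for illustration, and the sketch assumes the event arrives in the EventBridge envelope format.

```python
import json
import os
import urllib.request


def format_slack_message(event):
    """Turn an EventBridge-style event into a Slack incoming-webhook payload."""
    summary = (
        f"*{event.get('detail-type', 'Unknown event')}* "
        f"from `{event.get('source', 'unknown')}` "
        f"in account {event.get('account', '?')} ({event.get('region', '?')})"
    )
    return {"text": summary}


def post_to_slack(webhook_url, payload):
    """POST the payload as JSON to the webhook; return the HTTP status code."""
    request = urllib.request.Request(
        webhook_url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        return response.status


def handler(event, context):
    """Lambda entry point; the webhook URL env var name is hypothetical."""
    payload = format_slack_message(event)
    status = post_to_slack(os.environ["SLACK_WEBHOOK_URL"], payload)
    return {"statusCode": status}
```

Keeping the formatting step separate from the HTTP call means the message layout can be unit tested without touching AWS or Slack.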

Although we cannot publicly release everything we build, we’re definitely not done making open-source contributions just yet, so… stay tuned!

Michael Fitzmaurice
Arup’s City Modelling Lab

I'm a software engineer in Arup's City Modelling Lab, where we use agent-based models to help improve transport and cities.