Sandman: a cloud-based platform for facilitating scalable, agile agent-based modelling

Run models at scale to better develop, validate, and explore them.

“The cost and speed of running [agent-based models] has been revolutionized. There has been a revolution, every bit as significant, in the availability of data to calibrate these models.” (Andy Haldane, Chief Economist, Bank of England)

Introduction

There is growing interest in developing agent-based models (ABMs) to support decision-making and policy evaluation. This is in part due to the increasing availability of data both to feed models and to empirically validate them against. However, these models are increasingly complex, requiring substantial computation to develop and validate. To meet these demands, public cloud providers (such as Amazon Web Services (AWS) and Google Cloud Platform) offer good solutions for elastic compute and storage, but there are a number of barriers to their adoption. Sandtable have developed a cloud-based platform to remove these barriers, so that agent-based modellers can focus on modelling, not managing cloud infrastructure.

Agent-based modelling in the cloud

Typically, ABMs are run on commodity workstations, such as desktop machines or laptops with, for example, 16 logical cores and 16GB RAM; or, where available, on High Performance Computing (HPC) resources such as Linux clusters. HPC resources, however, are generally only available to those working at universities, are often shared with long job queue times, and are expensive to set up and use.

As more data is made available and more complex models (e.g. more agents, more behaviours) are developed, the computational requirements for model development and validation far exceed what desktop machines can provide. To meet these increasing demands, public clouds offer viable solutions: low-cost, on-demand, elastic compute and storage. For compute, services such as AWS EC2 offer a range of instance types, from 1 to 128 logical cores and up to terabytes of RAM, for only a few dollars per hour. For storage, services like AWS S3 provide highly scalable, available and reliable storage at low cost.

The benefits of using cloud services include:

  • No capital expenditure;
  • No long-term contracts or up-front commitments;
  • Very cost effective: pay for what you use;
  • Vertical and horizontal scaling: wide range of instance types, small to very large;
  • No need to purchase, maintain, and upgrade desktop machines; and
  • No need to use shared, expensive HPC resources with long job queues.

Barriers to cloud adoption

While cloud services are available for anyone to set up and use, there are still a number of barriers to doing scalable, agile agent-based modelling on them. Most of these relate to the skills and experience needed to set up and manage cloud services, and to use them effectively to run models at scale.

Provisioning and managing cloud resources

The first barrier is actually setting up resources and managing them effectively. There are a number of challenging tasks to do here, each with its own difficulties and pitfalls:

  • Provisioning resources. That is, working out how much compute resource is going to be needed, and provisioning instances with the relevant number of cores, memory and storage. This includes working out how to bring resources up and down when needed in order to manage costs, and scaling resources up and down in accordance with demand (a sketch of this kind of low-level provisioning follows this list).
  • Configuration management. This means setting up the resources you have obtained so they are fit for purpose for the task at hand. A related challenge is in figuring out which parts of the environment should be shared across multiple users and which dedicated to individual users, tasks or projects.
  • Networking. Making sure the resources can communicate with one another and that they are securely accessible to you and your partners.
  • Security. Figuring out how to secure your data and protecting your infrastructure from intrusion.
  • Maintenance. This includes having visibility of what’s going on across your resources, upgrading where required, and early identification of any issues.
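
To give a sense of the low-level work involved, here is a minimal sketch of provisioning a single instance with boto3, the AWS SDK for Python. The AMI ID, instance type and key pair name are placeholders you would have to supply, and a real setup would also handle networking, security groups and failures.

    # Minimal sketch: provisioning one EC2 instance with boto3.
    # The AMI ID, instance type and key pair below are placeholders.
    import boto3

    ec2 = boto3.resource("ec2", region_name="eu-west-1")

    # Launch an instance sized for the model's compute and memory needs.
    instances = ec2.create_instances(
        ImageId="ami-xxxxxxxxxxxxxxxxx",   # placeholder AMI
        InstanceType="m5.4xlarge",         # 16 vCPUs, 64 GB RAM
        MinCount=1,
        MaxCount=1,
        KeyName="my-key-pair",             # placeholder key pair
    )
    instance = instances[0]
    instance.wait_until_running()
    instance.load()
    print("Running:", instance.id, instance.public_dns_name)

    # Terminate the instance when the job is done, otherwise it keeps costing money.
    instance.terminate()
    instance.wait_until_terminated()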

Managing compute environments

This means ensuring that the compute environments on your resources have all the libraries and dependencies in place to run your models and associated software.
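
As a minimal illustration of what this involves on a single machine, the sketch below creates an isolated Python environment and installs a pinned set of dependencies (the package list and versions are placeholders); doing this consistently across a fleet of cloud instances is considerably harder.

    # Minimal sketch: an isolated environment with pinned dependencies
    # for a model run. The package list and versions are placeholders.
    # Assumes a POSIX layout (bin/pip).
    import subprocess
    import venv

    ENV_DIR = "model-env"
    REQUIREMENTS = ["numpy==1.26.4", "scipy==1.13.0", "pandas==2.2.2"]

    # Create a self-contained virtual environment with pip available.
    venv.EnvBuilder(with_pip=True).create(ENV_DIR)

    # Install the exact versions the model was developed and tested against.
    subprocess.check_call([f"{ENV_DIR}/bin/pip", "install", *REQUIREMENTS])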

Running large-scale ABM workflows

Successful agent-based modelling relies on a number of workflows, each of which needs to be provisioned and supported. These workflows need to be implemented in a scalable way and their performance monitored (a minimal local sketch follows the list). Workflows include:

  • Monte Carlo simulation;
  • Parameter sweeps & sensitivity analysis;
  • Parameter calibration & estimation;
  • Model comparison & selection;
  • Scenario exploration & policy analysis.
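
As a rough, local illustration of the first two of these, assume a toy function run_model(beta, seed) standing in for a real ABM. A parameter sweep with repeated Monte Carlo replications might then look like the sketch below; at scale, each task would be dispatched to a cluster rather than a local process pool.

    # Minimal local sketch: parameter sweep with Monte Carlo replications.
    # run_model is a stand-in for a real ABM; beta and seed are illustrative.
    import random
    from itertools import product
    from multiprocessing import Pool
    from statistics import mean

    def run_model(beta, seed):
        # Placeholder "model": one noisy outcome per (parameter, seed) pair.
        rng = random.Random(seed)
        return beta + rng.gauss(0, 0.1)

    def run_task(args):
        beta, seed = args
        return beta, run_model(beta, seed)

    if __name__ == "__main__":
        betas = [0.1, 0.2, 0.3, 0.4]   # parameter values to sweep
        seeds = range(50)              # Monte Carlo replications per value
        tasks = list(product(betas, seeds))

        with Pool() as pool:
            results = pool.map(run_task, tasks)

        # Average outcome per parameter value across replications.
        for beta in betas:
            outcomes = [y for b, y in results if b == beta]
            print(beta, mean(outcomes))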

Data management

ABMs consume and produce large quantities of data, and that data needs to be managed effectively. Your cloud setup will need to take care of managing model input and output data, and support auditable data processing pipelines.
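
As a minimal sketch of the storage side alone, assuming an existing S3 bucket (the bucket name, file paths and run identifier below are placeholders), a run's outputs might be uploaded under a key that identifies the run, together with a small manifest so the run can be audited later. Keeping such records complete and consistent across many runs and users is the harder part.

    # Minimal sketch: storing one run's outputs in S3 under a run-specific key.
    # The bucket name, file paths and run identifier are placeholders.
    import json
    import boto3

    s3 = boto3.client("s3")
    BUCKET = "my-abm-results"       # placeholder bucket
    run_id = "calibration-run-042"  # illustrative run identifier

    # Upload the model's output file under a key that identifies the run.
    s3.upload_file("outputs/results.csv", BUCKET, f"runs/{run_id}/results.csv")

    # Store a small manifest alongside it for later auditing.
    manifest = {
        "run_id": run_id,
        "inputs": ["inputs/population.csv"],
        "outputs": [f"runs/{run_id}/results.csv"],
    }
    s3.put_object(
        Bucket=BUCKET,
        Key=f"runs/{run_id}/manifest.json",
        Body=json.dumps(manifest).encode(),
    )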

Model deployment for exploration

One goal of modelling for decision-support is to develop models that can be deployed ‘into the wild’ and integrated for use with other applications. To do this, your cloud setup will need to make provision for the deployment of models into production, with proper checks and balances to ensure they are ready to be deployed, and proper monitoring of their behaviour and performance.
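
As a very small sketch of what deploying a model can mean in practice, the snippet below wraps a placeholder model function behind an HTTP endpoint using Flask, so that other applications can request runs. The route, port and parameters are illustrative; a production deployment adds validation, authentication, monitoring and scaling on top of this.

    # Minimal sketch: exposing a placeholder model behind an HTTP endpoint.
    # run_model stands in for a real ABM; the route and port are illustrative.
    from flask import Flask, jsonify, request

    app = Flask(__name__)

    def run_model(params):
        # Placeholder "model": echo the parameters with a dummy outcome.
        return {"params": params, "outcome": sum(params.values())}

    @app.route("/run", methods=["POST"])
    def run():
        params = request.get_json(force=True)
        return jsonify(run_model(params))

    if __name__ == "__main__":
        app.run(host="0.0.0.0", port=8080)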

Can we remove these barriers?

Sandman

Sandtable is a leading commercial agent-based modelling company based in London. We have been developing ABMs for over a decade, for the public and private sectors and across a number of verticals.

Over the last six years, with the help of Innovate UK, Sandtable have developed a cloud-based platform, Sandman, that removes the aforementioned barriers. Sandman facilitates the scalable, agile development and deployment of data-driven agent-based models. Now in its third generation, Sandman has been used to run millions of hours of simulation on clusters of thousands of cores with terabytes of RAM. The goal of Sandman is to enable agent-based modellers to build better models faster.

We’ve been using Sandman for our own modelling and we believe others would benefit from it too. We plan to offer it as a managed service so that others can focus on modelling, not managing cloud infrastructure.

Design principles

The following principles guide the design and development of Sandman:

  • Ease of use
  • Flexibility
  • Reproducibility
  • Collaboration
  • Scalability
  • Performance
  • Security

Services

Sandman offers a number of services to remove the aforementioned barriers.

Provisioning and managing cloud resources

The compute and storage resources required for running models and ABM workflows can vary substantially. We have written a service that allows users to start and stop compute resources as and when required. Users can choose the type and number of instances based on their compute and memory requirements, and scale vertically (instance size) and horizontally (number of instances) on demand. We manage the underlying cloud infrastructure.

Managing compute environments

As well as different compute requirements, models have different library and dependency requirements. We have written a service that allows users to configure and manage their compute environments. We include several default environments to start with, and users can reuse environments and share them with other users.

Running large-scale ABM workflows

ABM development involves the execution of a number of workflows, such as parameter calibration, sensitivity analysis and model exploration. We have developed a cluster framework for reliably running large-scale workflows across cloud resources without requiring any parallel or distributed programming. Workflows are described by computational task graphs, where a task in the graph could be, for example, a single run of an ABM or a data processing step. Graphs can be developed and tested locally, and then run on the cloud.

We provide a lightweight Python SDK for building computational graphs. We use Python because it is a mature general-purpose language that is both easy to learn and highly productive. Python is the emerging de facto language for data science; it has a very active community and an extensive open-source ecosystem, including libraries such as numpy, scipy, pandas, scikit-learn, matplotlib and jupyter.
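
As a minimal, self-contained illustration of the idea (plain Python only, not Sandman's SDK), a workflow can be expressed as a graph of named tasks with dependencies and executed in dependency order. Locally everything runs in-process; on the platform, each task would instead be dispatched to the cluster.

    # Minimal sketch of a computational task graph in plain Python
    # (not Sandman's SDK). Each task names the tasks it depends on.
    from graphlib import TopologicalSorter

    def prepare_data():
        return {"population": 1000}

    def run_model(data):
        # Placeholder for a single ABM run.
        return {"infected": data["population"] * 0.1}

    def summarise(result):
        print("Summary:", result)

    # Task graph: task name -> (function, names of upstream tasks).
    graph = {
        "prepare": (prepare_data, []),
        "simulate": (run_model, ["prepare"]),
        "report": (summarise, ["simulate"]),
    }

    # Execute tasks in dependency order, feeding each task its inputs.
    order = TopologicalSorter({k: set(deps) for k, (_, deps) in graph.items()})
    outputs = {}
    for name in order.static_order():
        func, deps = graph[name]
        outputs[name] = func(*[outputs[d] for d in deps])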

While workflows are written in Python, the framework is agnostic to the ABM framework used. We have integrated several open-source ABM frameworks; you can use these, integrate your own, or start from scratch.

Data management

We have developed a service that facilitates the management of data, including the storage and sharing of data between users. Additionally, it is possible to define data processing pipelines for the preparation of data for input to models.

Model deployment for exploration

We have included a service that facilitates the packaging of models for deployment and integration with 3rd-party applications. This includes the management of the infrastructure for deployment and exploration of models.
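
For illustration only, assuming a model has been deployed behind an HTTP endpoint like the earlier sketch (the URL and parameter names here are placeholders), a third-party application could request a run like this:

    # Illustrative only: calling a deployed model over HTTP.
    # The URL and parameter names are placeholders.
    import requests

    response = requests.post(
        "https://models.example.com/run",        # placeholder endpoint
        json={"beta": 0.3, "population": 1000},  # illustrative parameters
        timeout=60,
    )
    response.raise_for_status()
    print(response.json())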

User Interface

We have developed a web interface for the easy management of compute resources. We will extend the interface to other parts of the API in due course.

Observability

We have also included best-in-class services to aid monitoring and observability, covering:

  • Metrics
  • Logging
  • Tracing
  • Alerts

Summary

There is growing interest in ABMs for supporting decision-making and policy evaluation. Increasingly, data is available to build richer, empirically validated models. Model development, in particular model calibration, is compute- and data-intensive. Cloud computing is a good option for supporting scalable, agile model development, but there are a number of barriers to using the cloud effectively. Over the last few years, Sandtable have built a cloud-based platform, Sandman, to overcome these barriers, so that agent-based modellers can focus on building better models faster, not on managing cloud infrastructure.

We’re releasing Sandman as a closed beta. To register your interest, visit here.