Model Parallelization in Various Languages & Approaches to Concurrent Agent Scheduling

I often talk about various applications of agent-based modeling (ABM), but rarely about the computational approaches behind them. At George Mason, a group of students and professors is working on a comparison of languages and frameworks for the parallelization of a simple agent-based model. At the same time, another student is looking at how concurrent agent scheduling might work.

It should be noted that this post is by no means a replacement for the nuance and detail of the full publications when they are released. These are rough summaries of research in progress.

A Comparison of Languages and Frameworks for the Parallelization of a Simple Agent Model (Dale Brearcliffe, Peter Froncek, Marta Hansen, Vince Kane, Stefan McCabe, Davoud Taghawi-Nejad, and Rob Axtell)

Most ABM frameworks are single-threaded, but there is a growing desire to create parallel models. To explore this area, Brearcliffe et al. are conducting a bake-off among a variety of languages to determine which perform best when parallelizing the same model. Normally, parallelization is done using D-MASON or Repast HPC, but advances in computing are providing opportunities to partition agents into subpopulations.

To test the languages, a supply-and-demand model of zero-intelligence traders was created. The agents were initialized, partitioned into subpopulations of buyers and sellers, each 'submarket' was run to completion, and statistics were computed. The model was programmed in different languages by different people, who may or may not have been experts in the language they were using, and each implementation was run independently on the same multi-core machine. The languages evaluated were Clojure, Erlang, Go, Haskell, Scala, C, C with OpenMP, and Java. Not all of the implementations made it through to the end, because of differences in programming approaches.
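To make that pipeline concrete, here is a minimal sketch in Python (which was not one of the benchmarked languages). The agent representation, the 0-100 price range, and the random pairing rule are illustrative assumptions, not the study's actual code:

```python
import random
import statistics

def make_agents(n):
    # Each agent is a dict with a role and a random reservation price;
    # the dict representation and price range are assumptions.
    return [{"role": random.choice(["buyer", "seller"]),
             "price": random.uniform(0, 100)} for _ in range(n)]

def run_submarket(buyers, sellers):
    # Zero-intelligence trading: pair buyers and sellers at random and
    # trade whenever the bid meets or exceeds the ask.
    random.shuffle(buyers)
    random.shuffle(sellers)
    return [(b["price"] + s["price"]) / 2
            for b, s in zip(buyers, sellers)
            if b["price"] >= s["price"]]

# The four steps described above, in serial form: initialize, partition
# into buyer/seller subpopulations, run to completion, compute statistics.
agents = make_agents(10_000)
buyers = [a for a in agents if a["role"] == "buyer"]
sellers = [a for a in agents if a["role"] == "seller"]
prices = run_submarket(buyers, sellers)
print(len(prices), "trades; mean price:",
      statistics.mean(prices) if prices else "n/a")
```

The parallel versions in the study partition agents into subpopulations so that each submarket can run independently; the serial form above is just the baseline being parallelized.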

There is no true "winner" yet. Further testing with larger agent populations (10⁹ or 10¹⁰) needs to be run before explicit conclusions can be drawn, because at the population sizes tested, some of the implementations were able to run on a single core.

Communicating sequential agents: An analysis of concurrent agent scheduling (Stefan McCabe)

While Brearcliffe et al. are looking at how the programming language and parallel processing can affect the performance of an ABM, one of the researchers, McCabe, is digging into the impact of concurrent agent scheduling.

McCabe divides the ways agents are activated along several dimensions (two of them are sketched in code after this list):

- Selection order: how are agents selected?
- Uniformity: is each agent guaranteed one activation per model turn?
- Updating regime: does the model change immediately, or do all agents act on information from the start of a turn?
- Reproducibility: can a given model run be repeated consistently?
- Endogeneity: if the selection criterion for activation is based on some agent characteristic, can that characteristic change over the course of a model run?
- Parallelization: can the agents be activated in parallel?
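Here is a hedged Python sketch contrasting two of these dimensions: the updating regime (synchronous versus asynchronous) and the selection order (random versus fixed). The toy update rule and the list-of-floats state representation are assumptions for illustration, not McCabe's taxonomy code:

```python
import random

def step(state, population):
    # Toy update rule (an assumption): drift toward the population mean.
    return state + 0.1 * (sum(population) / len(population) - state)

def synchronous_turn(states):
    # Synchronous updating regime: every agent acts on a frozen copy of
    # the start-of-turn state, as in classic cellular automata.
    frozen = list(states)
    return [step(s, frozen) for s in frozen]

def asynchronous_turn(states, order="random"):
    # Asynchronous updating regime: each activation immediately sees the
    # effects of earlier activations within the same turn.
    indices = list(range(len(states)))
    if order == "random":
        random.shuffle(indices)  # random selection order
    for i in indices:            # fixed selection order if not shuffled
        states[i] = step(states[i], states)
    return states

states = [random.random() for _ in range(100)]
print(synchronous_turn(states)[:3])
print(asynchronous_turn(states, order="random")[:3])
```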

To evaluate approaches to concurrent agent scheduling, McCabe surveyed projects hosted on OpenABM. And the results? McCabe said, "Most agent-based models update asynchronously. This is the key innovation over cellular automata."

[Figure: Makeup of ABM scheduling practices from OpenABM.]

He also noted that the most widely adopted means of documentation was the ODD protocol (~40% of ABMs), while the most used framework was NetLogo (~60%), which does not normally follow the ODD protocol. He also found that the vast majority of projects used random selection order, followed by fixed order.

Ultimately, McCabe is asking whether a model can be moved from a serial to a parallel implementation while maintaining the integrity of the model and improving its performance. He examined thread-and-pool and fork-and-join techniques. In thread-and-pool, the population lives on one thread, which selects two agents at a time, while another thread evaluates the trade and the resulting states. In fork-and-join, the agents are divided into ten sub-populations, and at each turn agents can move to another thread. (A sketch of the thread-and-pool structure follows.)
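As a rough illustration of the thread-and-pool structure just described, here is a Python sketch. Python, the function names, and the bid/ask details are all illustrative assumptions rather than McCabe's implementation, and note that CPython's GIL means this shows the scheduling structure, not a real speedup:

```python
import random
from concurrent.futures import ThreadPoolExecutor

def evaluate_trade(pair):
    # Worker thread: decide whether the selected bid/ask pair trades.
    bid, ask = pair
    return (bid + ask) / 2 if bid >= ask else None

def thread_and_pool(bids, asks, turns=1000, workers=4):
    # Main thread owns the population and selects agent pairs; a pool
    # of worker threads evaluates the trades.
    pairs = [(random.choice(bids), random.choice(asks))
             for _ in range(turns)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return [p for p in pool.map(evaluate_trade, pairs)
                if p is not None]

bids = [random.uniform(0, 100) for _ in range(50)]
asks = [random.uniform(0, 100) for _ in range(50)]
print(len(thread_and_pool(bids, asks)), "trades")
```

A fork-and-join version would instead split the population into sub-populations up front and merge results at each turn, which is where the biases mentioned below can creep in.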

In the end, McCabe found widespread homogeneity of practices and little parallelization among the models on OpenABM. He also found that parallelization is tied to agent scheduling at the implementation level, and that most (possibly all) ABMs can seemingly be parallelized using thread-and-pool or fork-and-join methods. However, he suggests that fork-and-join be avoided, because it may introduce biases.

--

Jacqueline Kazil
Notes from a Computational Social Scientist
