a paper about simpegEM

Published in

simpeg

9 min readAug 16, 2017

Recently, the paper A framework for simulation and inversion in electromagnetics by myself, Rowan Cockett, SEOGI KANG, Gudni K. Rosenkjaer, and Douglas W. Oldenburg, was published. It was a long time coming (200+ days in review…)! In it we summarize our thinking around much of the organization of the implementation of electromagnetics in SimPEG, which extends to general thoughts about forward simulations in SimPEG, and demonstrate a few examples. Here, I want to take the chance to tell a bit more of the story.

Built in SimPEG

Context is important: we are building within and upon the SimPEG framework, which we attempt to describe in a few paragraphs and a picture… but there is a whole other paper on SimPEG that you can check out!

This matters because?

The physics of EM is already complicated and often unintuitive. Having a black box that solves Maxwell’s equations only compounds the challenge: Do you understand the results? Are you sure? Is it a bug in the software? a bug in your understanding? …??? If you don’t have hooks to the various pieces (e.g. looking at fields and fluxes) and the code is tied up in an executable, then there really isn’t much hope in addressing those questions. Having code be openly licensed is a first step, but it must go beyond that, the code must communicate the ideas; it is perhaps the most rigorous documentation you have on an idea. Investing in the organization and clarity of the code is equivalent, at least in my mind, to investing in the clarity and explanation of an idea.

Being able to tinker and to play is important, especially as a scientist. When ideas are presented in scientific literature, they are often laid out linearly in a logical, step-by-step manner from question to solution. It is not that this is untrue, the story laid out in a publication is often the synthesis, the concise description that gets you to one solution. It doesn’t, however, capture the full discovery process that includes false starts, rabbit holes, and failed attempts. Each of those “mis”-steps contributes to your understanding of the problem. Such exploration requires flexibility to play within the code, interrogate each piece, assemble it in whatever way suits your needs, and eventually extend it to achieve things it previously could not.

And what did we actually do?

Mostly, we organized some things. Maxwell’s equations have been discretized and solved before. That is not new. Our goal is to organize the pieces and in doing so allow people to plug-and-play with the building blocks for EM simulations and inversions. To do so require the machinery to:

work in time domain or frequency domain
simulate with variable electrical conductivity and magnetic permeability (and perhaps include anisotropy in either of these)
use a voxel-based description of the model or a parametric one
have loop sources, grounded sources, natural sources (eg. as in the Magnetotelluric problem)
work with different formulations of Maxwell’s equations (eg. solving for electric field, current density, magnetic field or magnetic flux density)
multiple mesh types (cylindrical, tensor, curvi, octree …)

The point of much of the paper is to explain the two diagrams below. We break down the forward simulation into two overarching pieces: the problem, which contains the machinery to work with and solve the governing equations, and the survey, which contains the geometry and type of sources and receivers. Those are then broken down a bit further. Each piece then has a derivative and the derivatives are stitched together to compute the sensitivity times a vector and the adjoint times a vector (these are needed for implementing gradient based optimization techniques).

Didn’t make the paper

Though we don’t discuss these in this paper, there are a few other, perhaps subtle ideas that we have spent a significant amount of time on. One of my favourites is summed up in the figure below. Other than the addition of math in the figure on the left, there are 4 things of significance that are different between the two images.

There are two differences in how the forward simulation framework is explained and two differences that fundamentally changed the structure of the code.

The most visible is the change from the model and predicted data from boxes to circles. This is a clarification in how we are explaining each of the pieces. The model and the predicted data are simply vectors, while the boxed items are classes, which have methods and properties within SimPEG EM. Changing how they are represented clarifies how we explain the concepts. Though important for communicating the framework, these change did not require anything to be changed in the code base. There are two other, much more subtle differences that required significant refactoring and investment…

Do you see it? no? Look at the length the arrow from the model. In the figure on the left the arrow goes from the model to the physical properties. The model must be a physical property — electrical conductivity or magnetic permeability. What if we want to invert for flight height in an airborne survey? This was a fundamental change that took months of work (all to move an arrow on a diagram up a half a centimeter!). Previously, we had hard-coded the assumption that derivatives are all with respect to electrical conductivity. In order to have the flexibility to choose any reasonable parameter to invert for (what is your model?), we need derivatives on everything and we need wires in place to keep track of which derivatives are associated with each parameter considered in an inversion.

In a similar vein, but not requiring quite the same level of effort, was changing the length of the arrow from sources to the physics. In order to calculate fields, the fields need access to the sources, and it was much cleaner in the code to update things so that the fields have access to the sources directly.

I particularly like the example of moving the arrow up for the model. From the outside, it is an unnoticeable change in a diagram, too nuanced to dive into in the paper, but fundamentally, it changed a lot of our thinking about how derivatives are organized in SimPEG.

Did make the paper — spark matrices.

You may have heard of sparklines or seen Galileo’s sketches of Saturn embedded with his text if you have looked through books by Edward Tufte. They are inline data graphics. Rowan Cockett suggested we do something similar for explaining the structure of the matrices that we are working with to solve the EM system in frequency or time. They were published :)

Examples

In the original submission, we had included two synthetic examples. Two of the three reviewers asked for a field example to be included. Initially, I was a bit reluctant because of the extra length (it was already a 30+ page paper!), however, we chatted about it and decided that including a field example would increase the relevance of the paper for practicing geoscientists. So, now there are 3, and all are included in the SimPEG docs.

Example 1: 1D Inversions

The first example is rather simple. Mainly it is an example to show that we can invert some things and that the code is not overly complicated.

In fact, the code to generate synthetic data and run inversions in both the time and frequency domains is ~100 lines of code and the code to create the figure for publication is ~80 lines.

First example: 1D Inversions of synthetic frequency and time domain data. Available at: https://figshare.com/articles/Heagy_et_al_2017_-_1D_FDEM_and_TDEM_Inversions/5035175

Example 2: Bookpurnong

The second example was added after the first round of reviews. The reviewers asked for a field example, and these data have been inverted and published in a few different places (by Andrea Viezolli, and by Dikun Yang). We demonstrate an inversion of a both a TDEM and an FDEM sounding and then proceed to do a stitched 1D inversion of the FDEM data. Thanks to CSIRO, we were given permission to distribute the data, so this example can also be reproduced.

Second example in the paper. All code is available on figshare: https://figshare.com/articles/Heagy_et_al_2017_Inversion_of_RESOLVE_and_SkyTEM_data_at_Bookpurnong/5107711

Example 3: Casing

This was a fun example. The point here is to demonstrate the modularity of the code. Working with steel cased wells provides a number of end-member cases that have shaped some of the design choices and structure of simpegEM. In this example, we need to include both variable electrical conductivity and magnetic permeability, multiple meshes (a cylindrical mesh to handle the casing and a 3D mesh to handle the 3D geology), a source that is another problem (we use a primary-secondary approach, simulating the casing + the cylindrically symmetric part as the primary and then interpolating that to the 3D mesh to then solve the full 3D problem), and use a parametric description of the physical property model to (1) perform the simulation and calculate fields and (2) compute the sensitivity.

Final example. Code available at: https://figshare.com/articles/Heagy_et_al_2017_-_EM_sensitivity_calculation_for_a_parametric_model_including_steel_casing_and_using_a_primary_secondary_approach/5036123

Interesting notes…

#breaksbackwardscompatibility

Between our initial submission and the publication, a number of developments have been made in SimPEG. Perhaps most notably, we replaced the PropMap in SimPEG. Its purpose was to enable to enable developers to work with whichever physical makes the most sense for the problem at hand (eg. conductivity or resistivity) while allowing the user to provide whatever they would prefer to work with. For example, I would like to state values as resistivity in Ohm-m, but if working with E-B formulation of Maxwell’s equations, the natural physical property to work with is electrical conductivities in S/m… The PropMap takes the reciprocal and keeps track of the relevant derivatives in the inversion. The idea serves the right purpose, but our first implementation had some fairly rigid assumptions baked in which made it challenging to do things like over-write defaults or extend the inversion implementation to invert for parameters such as source height. Since then, we have refactored the idea using properties so that the behaviour is more explicit.

Additionally, the namespace for problems has since changed. We originally were working only with 3D problems, but during the development of materials for the DISC course, we needed to implement 2.5D DC resistivity which uses spatial Fourier transforms to compute the fields and as such needs its own type of Problem and Fields objects… so we needed a namespace that includes dimensionality.

Why discuss the things that are wrong with the paper? Well, wrong is not quite the correct descriptor. These ideas had not (and maybe have not) reached their maturity. These are examples of a step in the right direction, but ideas that require iteration. It is only once you try to extend and build upon an idea that you see where it is rigid, limited, and in need of refactoring. Breakthroughs require that you break some things.

Let us know what you think!

We would love to hear your thoughts, and if you are interested in using or contributing to SimPEG, join the discussions!

*Images and screen shots of the paper are CC-BY-4.0 and originally from:

Lindsey J. Heagy, Rowan Cockett, Seogi Kang, Gudni K. Rosenkjaer, Douglas W. Oldenburg, A framework for simulation and inversion in electromagnetics, Computers & Geosciences, Volume 107, 2017, Pages 1–19, ISSN 0098–3004, http://dx.doi.org/10.1016/j.cageo.2017.06.018.