EvilPlot: A combinator-based plotting library for Scala

By David Hahn

As a data scientist at CiBO Technologies, I do a lot of data visualization (truthfully, everyone at CiBO does a lot of data visualization). Proper data visualization is key to the work we do, but our software stack is mostly Scala, and the well-known plotting libraries live in the R and Python ecosystems. Since many of us knew R, we satisfied our plotting needs for a while with ggplot2. What was unsatisfying was serializing typed, structured data from Scala through CSV into R just to make a simple plot. Maintaining a large R codebase that consisted mostly of plots started to smell a bit. There had to be a better way.

Existing plotting libraries for the JVM are largely modeled after the APIs of the existing R and Python ones. We felt we could make something more powerful and extensible. As Scala programmers, the OO side of our brains valued the principle of being open for extension but closed for modification. The FP side of our brains told us that the combinator pattern was an excellent way of starting with well-defined building blocks and then combining them into higher- and higher-order capabilities.

Thus EvilPlot was born. EvilPlot is an open-source, combinator-based, pure Scala plotting library that runs on the JVM and in the browser via Scala.js, rendering either to Java2D or HTML5 Canvas.

We need to do data visualization, but we have better things to do than constantly implement some new minor variation on a plot. EvilPlot’s design rests on our belief that a handful of primitive structures is enough to define 2D graphics. Pair those primitives with a concept of a plot that shares information across axes, labels and plot elements, and we can express every plot we want by combining them into higher- and higher-order plotting capabilities. That keeps the library open to the future without constant changes to the API.
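To make the idea of "a few primitives plus combinators" concrete, here is a minimal toy sketch. It is not EvilPlot's code: the names Shape, Rect, Circle, Beside and Above are hypothetical and chosen only for illustration. Every combinator takes shapes and returns a shape, so the results can be combined again without limit.

```scala
// Toy model of combinator-style 2D graphics. These types are hypothetical
// illustrations, NOT EvilPlot's actual API.
sealed trait Shape {
  def width: Double
  def height: Double
  // Combinators: each one takes shapes and returns a new, larger shape.
  def beside(that: Shape): Shape = Beside(this, that)
  def above(that: Shape): Shape = Above(this, that)
}

// Primitives.
final case class Rect(width: Double, height: Double) extends Shape
final case class Circle(radius: Double) extends Shape {
  val width: Double = 2 * radius
  val height: Double = 2 * radius
}

// Compositions are shapes too, so they can be combined further.
final case class Beside(left: Shape, right: Shape) extends Shape {
  val width: Double = left.width + right.width
  val height: Double = math.max(left.height, right.height)
}
final case class Above(top: Shape, bottom: Shape) extends Shape {
  val width: Double = math.max(top.width, bottom.width)
  val height: Double = top.height + bottom.height
}

object ShapeDemo {
  // A legend entry built from two primitives...
  val swatch: Shape = Circle(5).beside(Rect(40, 10))
  // ...and a legend built from three entries.
  val legend: Shape = swatch.above(swatch).above(swatch)
}
```

Nothing about the legend required a new primitive or a change to Shape; it falls out of combining what is already there.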

Getting Started

To get going with EvilPlot, you’ll need to add it to your build.

resolvers += Resolver.bintrayRepo("cibotech", "public")
libraryDependencies += "com.cibo" %% "evilplot" % "0.2.0" // Use %%% instead of %% if you're using Scala.js

Our first plot

EvilPlot is all about building larger graphical capabilities out of smaller ones. What this means for you is that making simple plots is easy, and we don’t have to spend a lot of time going over all the features just to get started. So let’s make our first plot, a simple scatter plot with sequential x-values and random y-values.

import java.io.File
import com.cibo.evilplot.plot._
import com.cibo.evilplot.plot.aesthetics.DefaultTheme._
import com.cibo.evilplot.numeric.Point

// 100 points with sequential x-values and random y-values
val data = Seq.tabulate(100) { i =>
  Point(i.toDouble, scala.util.Random.nextDouble())
}
ScatterPlot(data).render().write(new File("/tmp/plot.png"))

To break it down:

  • EvilPlot uses an implicit theming system, which lets you control the appearance of plots wholesale. In this case, we’re just using the built-in one, so we import it.
  • ScatterPlot returns a Plot object, which is a description of how data should be plotted (plot these points as little circles on the screen) and with what components (axes, a background and so on). In this case, we’ve used no components. All we get is points on the screen!
  • Finally, render() on Plot returns a Drawable object. Think of a Drawable as a fully specified description of a scene. render itself does not perform any side effects; it simply constructs a scene given a plot and a size.
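The split between describing a plot, rendering it to a scene, and writing it out can be sketched with a toy model. The types below (Extent, Pt, Scene, ToyScatter) are hypothetical stand-ins rather than EvilPlot's Plot and Drawable, and the scaling logic is only an assumption about what a renderer might do.

```scala
// Toy sketch: a plot is plain data, and rendering is a pure function.
// Hypothetical stand-ins, NOT EvilPlot's Plot/Drawable.
final case class Extent(width: Double, height: Double)
final case class Pt(x: Double, y: Double)

// A "scene": fully specified, with every point in pixel coordinates.
final case class Scene(extent: Extent, circles: Seq[Pt])

final case class ToyScatter(data: Seq[Pt]) {
  // Pure: maps data coordinates into the given extent and performs no I/O.
  def render(extent: Extent = Extent(800, 600)): Scene =
    if (data.isEmpty) Scene(extent, Nil)
    else {
      val xs = data.map(_.x)
      val ys = data.map(_.y)
      // Linearly map v from [lo, hi] onto [0, size]; center degenerate ranges.
      def scale(v: Double, lo: Double, hi: Double, size: Double): Double =
        if (hi == lo) size / 2 else (v - lo) / (hi - lo) * size
      val placed = data.map { p =>
        Pt(
          scale(p.x, xs.min, xs.max, extent.width),
          // Screen y grows downward, so flip the vertical axis.
          extent.height - scale(p.y, ys.min, ys.max, extent.height)
        )
      }
      Scene(extent, placed)
    }
}
```

Because render only builds a value, the same plot can be rendered at several sizes, compared, or tested without touching the file system. Writing the scene to a file is a separate, final step.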

This plot is not very interesting, of course. We should probably add some axes, so we know the range of the data, and some labels, so our audience knows what we’re talking about. That’s easy as well:

import com.cibo.evilplot.plot._ 
import com.cibo.evilplot.plot.aesthetics.DefaultTheme._
import com.cibo.evilplot.numeric.Point

val data = Seq.tabulate(100) { i =>
  Point(i.toDouble, scala.util.Random.nextDouble())
}
ScatterPlot(data)
  .xAxis()
  .yAxis()
  .frame()
  .xLabel("x")
  .yLabel("y")
  .render()

Adding these things is simply a matter of calling an additional method on your plot that specifies exactly what type of component you want to add.
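One way to picture why this chaining works: each method returns a new plot description with one more component attached and leaves the original untouched. The sketch below is a toy model under that assumption. ToyPlot and Component are hypothetical names, not EvilPlot's internals.

```scala
// Toy model of "components as Plot => Plot". Hypothetical types,
// NOT EvilPlot's implementation.
sealed trait Component
case object XAxis extends Component
case object YAxis extends Component
case object Frame extends Component
final case class XLabel(text: String) extends Component
final case class YLabel(text: String) extends Component

final case class ToyPlot(
    points: Seq[(Double, Double)],
    components: Vector[Component] = Vector.empty
) {
  // Every method returns a new ToyPlot; the receiver is never mutated.
  private def add(c: Component): ToyPlot = copy(components = components :+ c)
  def xAxis(): ToyPlot = add(XAxis)
  def yAxis(): ToyPlot = add(YAxis)
  def frame(): ToyPlot = add(Frame)
  def xLabel(text: String): ToyPlot = add(XLabel(text))
  def yLabel(text: String): ToyPlot = add(YLabel(text))
}

object ComponentDemo {
  // Combinators are ordinary functions, so they compose with andThen.
  val withAxes: ToyPlot => ToyPlot = _.xAxis().yAxis()
  val framed: ToyPlot => ToyPlot = _.frame()
  val standard: ToyPlot => ToyPlot = withAxes.andThen(framed)
}
```

A team could define something like standard once and apply it to every plot, which is the kind of reuse the combinator style is meant to encourage.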

Jump ahead a bit

Let’s take a look at a more complex example that demonstrates EvilPlot’s power. In this case, we’ll plot the distribution of the phi and psi dihedral angles in alanine dipeptide calculated from molecular dynamics simulations. We have six simulations in total, run at three different temperatures with two different parameter sets.

You can look at the example and pick out several building blocks: we’re using a ContourPlot to create the contours of the data from each simulation. We also make histograms of each angle (phi and psi) as individual plots, and EvilPlot lets us add plots on the border easily. We even mark the initial point as its own plot, and use Overlay to compose multiple layers into a more complex plot. Finally, the whole thing is wrapped in Facets to give us a grid of plots with shared axes that we can understand.

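import com.cibo.evilplot.colors.HTMLNamedColors.{crimson, dodgerBlue}
import com.cibo.evilplot.plot._
import com.cibo.evilplot.plot.aesthetics.DefaultTheme._
import com.cibo.evilplot.plot.renderers.{PointRenderer, SurfaceRenderer}

// AlanineData (our simulation results) is defined elsewhere.
Facets(
  AlanineData.allDihedrals.map(
    _.map(ps =>
      ContourPlot(ps,
        surfaceRenderer = Some(SurfaceRenderer.contours(
          Some(dodgerBlue))))
        // Mark the initial point as its own plot; ScatterPlot takes a Seq of points.
        .overlay(ScatterPlot(Seq(ps.head),
          pointRenderer = Some(PointRenderer.default(
            Some(crimson)))))
        // Marginal histograms of phi (top) and psi (right)
        .topPlot(Histogram(ps.map(_.x)))
        .rightPlot(Histogram(ps.map(_.y)))
        .frame()
    )
  )
)
  .topLabels(AlanineData.temps.map(k => s"$k K"))
  .rightLabels(Seq("params1", "params2"))
  .xbounds(-180, 180)
  .ybounds(-180, 180)
  .xAxis(tickCount = Some(6))
  .yAxis(tickCount = Some(6))
  .xLabel("phi")
  .yLabel("psi")
  .render()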

Hopefully, this example convinces you that it’s easy to start with a simple visual and compose more and more complexity on top of it using plot combinators. We first saw that the base plot can be customized using PlotRenderers. After that we saw how simple Plot => Plot combinators can add additional features piece-by-piece. Next, we saw that “multilayer plots” are expressed in EvilPlot simply by applying Overlay, a function from Seq[Plot] => Plot. Finally, we saw how we can build a multivariate, faceted display out of other plots easily, regardless of how complicated the individual plots are.
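To see why functions shaped like Seq[Plot] => Plot compose so well, consider a toy model in which a plot is nothing more than its data bounds and a list of layers. Everything here (Bounds, Layered, ToyOverlay, ToyFacets) is hypothetical, and the choice to widen bounds to cover every input is an assumption made for the sketch, not a statement about how EvilPlot does it.

```scala
// Toy model of overlaying and faceting. Hypothetical types,
// NOT EvilPlot's Overlay or Facets.
final case class Bounds(min: Double, max: Double) {
  def union(that: Bounds): Bounds =
    Bounds(math.min(min, that.min), math.max(max, that.max))
}

final case class Layered(xBounds: Bounds, yBounds: Bounds, layers: Vector[String])

object ToyOverlay {
  // Seq[Layered] => Layered: stack the layers in order and widen the
  // bounds so that every input still fits.
  def apply(plots: Seq[Layered]): Layered = {
    require(plots.nonEmpty, "need at least one plot to overlay")
    plots.reduce { (a, b) =>
      Layered(a.xBounds.union(b.xBounds), a.yBounds.union(b.yBounds), a.layers ++ b.layers)
    }
  }
}

object ToyFacets {
  // Give every cell of a non-empty grid the same bounds so axes can be shared.
  def apply(grid: Seq[Seq[Layered]]): Seq[Seq[Layered]] = {
    val all = grid.flatten
    require(all.nonEmpty, "need at least one plot to facet")
    val xb = all.map(_.xBounds).reduce(_.union(_))
    val yb = all.map(_.yBounds).reduce(_.union(_))
    grid.map(_.map(_.copy(xBounds = xb, yBounds = yb)))
  }
}
```

The result of an overlay is the same type as its inputs, so it can itself be overlaid, decorated, or placed in a grid, no matter how complicated it already is.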

This is just a bit to get you started. Take a look at the docs or code to continue your journey.

David Hahn is a data scientist at CiBO Technologies where he builds software to automate the repetitive parts of data science and make rigorous scientific work easy.