What makes ScalaFiddle so fast

Otto Chrons
Oct 23, 2017


ScalaFiddle is an online service for creating, sharing, compiling and running Scala code snippets. It provides a web based editor for writing Scala code and a backend service for compiling it into JavaScript, which is then run in your browser. And it works really fast. Faster (typically) than your local Scala development environment. But how?

The local alternative

For regular Scala development, developers typically use the sbt build tool. People new to Scala might start by using sbt in non-interactive mode, running it from the shell with the sbt compile and sbt run commands. This is highly inefficient: sbt has to be initialized over and over again, the JVM has no chance to warm up and nothing is cached in memory. For more efficient usage, it’s highly recommended to start an interactive sbt session and run the commands within that session.

To simulate a “ScalaFiddle”-like experience, let’s set up a simple one-file Scala project with just Main.scala containing something like:

object Main extends App {
  val r = 0 to 14
  println(s"Hello world: $r")
}

Then we start sbt with sbt -Dsbt.task.timings=true, a timing option that shows how much time is spent and where. We also want to use the same Scala version as ScalaFiddle does, so we execute the ++2.12.3 command to switch to that version.

When running our small test app with run, we get a total time of 3765ms, which is more or less expected for the first run with a cold JVM and nothing cached. So let’s make a small change to the code to force sbt to recompile, and run it again. This time we get a result in 701ms, a great improvement! After a dozen more iterations the time eventually goes down to 400ms, of which about 325ms is spent in compileIncremental. (Note: all the timings in this article come from single measurements and are not meant to be super accurate, but they give you a rough idea.)

Let’s create a fiddle with the same code and see how it fares.

On the first run it spent 401ms sending the code to the server and waiting for the response, and another 55ms downloading the compiled JavaScript code (measured with the Chrome network inspector). A few iterations get that down to 340ms, which is already faster than what we got locally after several iterations! At this point it’s good to note that ScalaFiddle does more than sbt, since it uses Scala.js to compile the code to JavaScript instead of JVM bytecode. Also, that 340ms is what the user experiences; it’s not the true compilation time, since there are network transfers and such to consider. If we peek at the ScalaFiddle logs, we can see that the full compilation took only 108ms, of which only 36ms was spent in actual Scala compilation and the rest in the different Scala.js phases. What makes this possible?

Efficiency and performance

ScalaFiddle’s speed is a result of both efficiency (not doing things you don’t need to do) and performance (doing things that you have to do, but faster).

DRY — Don’t Repeat Yourself (but I just did!)

As ScalaFiddle is primarily designed for embedding fiddles into web pages (like we did in the previous section), by far the most common use case is that users compile and run exactly the same code multiple times. To simulate this we can leave the code in the editor unchanged and click Run again (after disabling browser caching). We get a result back in just 76ms (of which 48ms is spent transferring the JavaScript result) using my 4G connection.

Looking at HTTP response headers we can see that it hit the cache in CloudFlare and is valid until Jan 21st, 2018. What this means is that the request never even went to the ScalaFiddle server, but was instead served by CloudFlare’s edge location in Helsinki. If it hadn’t hit the cache in CloudFlare, it would’ve hit a cache in ScalaFiddle’s router service, avoiding the unnecessary recompilation in either case.

The fastest compilation is the one that doesn’t happen.
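To make the idea concrete, here is a minimal sketch of a result cache keyed by a hash of the source, roughly the kind of thing a router-level cache could do. This is not ScalaFiddle’s actual code: the names ResultCache, compile and compileCached are made up for illustration, and the "compiler" is just a stand-in that tags its input.

```scala
import java.nio.charset.StandardCharsets.UTF_8
import java.security.MessageDigest
import java.util.concurrent.ConcurrentHashMap
import java.util.concurrent.atomic.AtomicInteger

// Hypothetical sketch: none of these names come from ScalaFiddle itself.
object ResultCache {
  private val cache = new ConcurrentHashMap[String, String]()

  // Counts how many times the (expensive) compiler was really invoked.
  val compilations = new AtomicInteger(0)

  // Stable cache key: hex-encoded SHA-256 of the source text.
  def key(source: String): String =
    MessageDigest.getInstance("SHA-256")
      .digest(source.getBytes(UTF_8))
      .map(b => f"${b & 0xff}%02x")
      .mkString

  // Stand-in for the real compiler; here it only tags the input.
  private def compile(source: String): String = {
    compilations.incrementAndGet()
    s"/* compiled */ $source"
  }

  // Compiles only when this exact source has not been seen before.
  def compileCached(source: String): String =
    cache.computeIfAbsent(key(source), _ => compile(source))
}
```

Calling compileCached twice with the same source invokes the stand-in compiler only once; the second call is answered straight from the map.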

But there is a complication. The CloudFlare cache works only for GET requests, not POST, so how do we send the source code to the server? Well, we compress and base64 encode it, and then add it as a query parameter, of course!

source=H4sIAAEAAAAA_02Ov2vDMBCF9_srDuPBhSLXHjIUAlkcQsnm0F2xVJC56ox0OIGQ_93yj8HTx3t8vDv3P3AQ_HPGkFXnBZ-4Ug3BeSEPmxQ7TbqPqo8Ap0TtPYsWx179tM1zdm48XO1oqcjaWV53sg_ge287wV2JL0AsS8zX2IoOkppREwY84hcKY3VIzfZEEbOLJWJ8cCDzjXlIs_uFxht4w5yXK782xPQZ1qqqYQI3v75f5gAAAA%3D%3D

This also explains why very large fiddles don’t work: they hit the limit of the query string size.
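The encoding step can be sketched in a few lines of Scala. The sample value above starts with H4sI, which is how the gzip magic bytes look in base64, and it contains an underscore, which points at the URL-safe base64 alphabet. So this sketch assumes gzip plus URL-safe base64; the exact settings ScalaFiddle uses may differ, and the SourceParam object and its method names are hypothetical.

```scala
import java.io.{ByteArrayInputStream, ByteArrayOutputStream}
import java.net.URLEncoder
import java.nio.charset.StandardCharsets.UTF_8
import java.util.Base64
import java.util.zip.{GZIPInputStream, GZIPOutputStream}

// Hypothetical sketch, assuming gzip + URL-safe base64.
object SourceParam {
  // gzip the UTF-8 bytes, then encode with the URL-safe base64 alphabet.
  def encode(source: String): String = {
    val bytes = new ByteArrayOutputStream()
    val gzip = new GZIPOutputStream(bytes)
    gzip.write(source.getBytes(UTF_8))
    gzip.close() // closing writes the gzip trailer
    Base64.getUrlEncoder.encodeToString(bytes.toByteArray)
  }

  // Reverse of encode: base64-decode, then gunzip back to text.
  def decode(param: String): String = {
    val in = new GZIPInputStream(
      new ByteArrayInputStream(Base64.getUrlDecoder.decode(param)))
    val out = new ByteArrayOutputStream()
    val buf = new Array[Byte](4096)
    var n = in.read(buf)
    while (n != -1) { out.write(buf, 0, n); n = in.read(buf) }
    new String(out.toByteArray, UTF_8)
  }

  // Percent-encode the value so the '=' padding is safe in a query string.
  def toQuery(source: String): String =
    "source=" + URLEncoder.encode(encode(source), "UTF-8")
}
```

Since the whole program travels in the URL, the compressed and encoded size is what counts against the query string limit, not the size of the original source.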

A nice benefit of using a CDN like CloudFlare is that you get global coverage for free. While the ScalaFiddle server residing in Central Europe might work pretty fast for Europeans and North Americans, it’s actually quite far away for people in Japan, Australia, Chile or China. By caching the results (and using CloudFlare’s network in general) we greatly improve the user experience for these distant users.

Cache everything

After the (assumed) most common use case of compiling exactly the same code, the second most common case is compiling almost the same code. Users tend to iterate on their code again and again. Creating a compiler instance is costly, so it makes sense to reuse instances when the environment (basically the list of dependencies) stays the same.

Internally, the ScalaFiddle service is split into a Router and several Compilers. The Router receives all compilation requests and selects the optimal available compiler based on Scala version, library dependencies and which compiler was most recently active. When multiple users are using ScalaFiddle concurrently, requests go to different compilers, but typically always to the same compiler for any single user. This way the user gets access to a cached compiler that is ready to go without loading any additional libraries or doing other preparation steps. The net result is the 36ms compilation time we saw earlier. And they said Scala was slow to compile!
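A selection step like this could look roughly as follows. It is only a sketch under assumptions: the CompilerInfo and Request types, the busy flag and the scoring weights are all invented for illustration, and "most recently active" is modelled here as simple per-user affinity. The real Router may well weigh things differently.

```scala
// Hypothetical model of the selection step; names and scoring are assumptions.
final case class CompilerInfo(
    id: String,
    scalaVersion: String,
    loadedLibs: Set[String],    // libraries the cached compiler already has
    lastUsedBy: Option[String], // user who most recently used this compiler
    busy: Boolean
)

final case class Request(user: String, scalaVersion: String, libs: Set[String])

object Router {
  // Higher is better: an exact library match first, then same-user affinity.
  private def score(c: CompilerInfo, r: Request): Int = {
    val libsMatch = if (c.loadedLibs == r.libs) 2 else 0
    val sameUser = if (c.lastUsedBy.contains(r.user)) 1 else 0
    libsMatch + sameUser
  }

  // Picks the best idle compiler for the requested Scala version, if any.
  def select(compilers: Seq[CompilerInfo], r: Request): Option[CompilerInfo] = {
    val candidates =
      compilers.filter(c => !c.busy && c.scalaVersion == r.scalaVersion)
    if (candidates.isEmpty) None else Some(candidates.maxBy(score(_, r)))
  }
}
```

Weighting the library match above user affinity reflects the point made above: a compiler that already has the right dependencies loaded needs no preparation before it can compile.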

Caching is also applied to the Scala.js linker, allowing it to perform incremental linking at high speeds.

JARs are great — for storing strawberry jam

So what happens when there isn’t a suitable cached compiler instance available due to a conflict in required libraries? Normally the Scala compiler and Scala.js linker would load the required class and sjsir files from library JARs, but that’s quite slow, so we want to improve its performance. If you were to use an sbt-based service like Scastie, it would take a small eternity (10–20 seconds) to load the dependencies into the project before even doing any compilation.

In ScalaFiddle we limit the libraries you can use, so we know all the possible libraries beforehand. When a compiler instance is started, it first checks whether any new libraries have been added to the selection and downloads those to a local Coursier cache. It then extracts all .class and .sjsir files from these JARs, compresses them with Snappy and appends them to a custom FlatFileSystem. Basically this is just a single large file (1.4GB currently) containing all the class and sjsir files from all the libraries supported by all the Scala versions in ScalaFiddle.

ScalaFiddle then memory maps this file so that multiple compiler processes can share the same resource without using an excessive amount of heap. It then builds an internal tree of AbstractFlatFiles, which is provided to the compiler (and the Scala.js linker). When the compiler/linker requests a file, it’s uncompressed directly from the memory mapped file using Snappy (quite often decompressing with Snappy is faster than copying the original uncompressed bytes!). The OS will automatically handle the caching of the contents of this file.
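The flat file idea can be sketched with nothing but the JDK. Note the assumptions: this is not the real FlatFileSystem, the FlatFile object and its Entry index are hypothetical, and since Snappy is not part of the JDK the sketch swaps in java.util.zip’s Deflater/Inflater, which is slower but keeps the example self-contained.

```scala
import java.io.{ByteArrayOutputStream, RandomAccessFile}
import java.nio.MappedByteBuffer
import java.nio.channels.FileChannel
import java.nio.file.{Path, StandardOpenOption}
import java.util.zip.{Deflater, Inflater}

// Hypothetical sketch; the real FlatFileSystem uses Snappy and its own layout.
object FlatFile {
  // Where one entry lives inside the big file, plus its uncompressed size.
  final case class Entry(offset: Long, compressedSize: Int, originalSize: Int)

  private def compress(data: Array[Byte]): Array[Byte] = {
    val deflater = new Deflater()
    deflater.setInput(data)
    deflater.finish()
    val out = new ByteArrayOutputStream()
    val buf = new Array[Byte](4096)
    while (!deflater.finished()) out.write(buf, 0, deflater.deflate(buf))
    deflater.end()
    out.toByteArray
  }

  // Appends every file to one flat file and returns a name -> Entry index.
  def build(target: Path, files: Seq[(String, Array[Byte])]): Map[String, Entry] = {
    val raf = new RandomAccessFile(target.toFile, "rw")
    try {
      files.map { case (name, data) =>
        val packed = compress(data)
        val offset = raf.length()
        raf.seek(offset) // always write at the current end of the file
        raf.write(packed)
        name -> Entry(offset, packed.length, data.length)
      }.toMap
    } finally raf.close()
  }

  // Maps the whole file read-only; the OS page cache backs the mapping.
  def map(target: Path): MappedByteBuffer = {
    val channel = FileChannel.open(target, StandardOpenOption.READ)
    try channel.map(FileChannel.MapMode.READ_ONLY, 0, channel.size())
    finally channel.close() // the mapping stays valid after the channel closes
  }

  // Decompresses one entry straight out of the mapped region.
  def read(mapped: MappedByteBuffer, e: Entry): Array[Byte] = {
    val packed = new Array[Byte](e.compressedSize)
    val view = mapped.duplicate() // own position, shared content
    view.position(e.offset.toInt) // a single mapping is limited to 2GB
    view.get(packed)
    val inflater = new Inflater()
    inflater.setInput(packed)
    val out = new Array[Byte](e.originalSize)
    var n = 0
    while (n < out.length && !inflater.finished())
      n += inflater.inflate(out, n, out.length - n)
    inflater.end()
    out
  }
}
```

Because the mapping is read-only and backed by the page cache, several processes that map the same file share the same physical memory, and only the small index has to live on each process’s heap.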

For example, adding some random libraries to our test fiddle results in about 850ms compilation time for the first iteration (200ms spent in the Scala compiler and 650ms in Scala.js, as it cannot perform incremental linking). A second iteration gets us back to the 250ms range, mainly thanks to incremental linking kicking in.

On average, ScalaFiddle compilation times are between 100 and 300ms.

It’s not the size of your hardware, it’s how you use it

All ScalaFiddle services, including the Editor, the Router, six Compiler servers and a Postgres database, run on a single bare-metal Linux server in a data center in Central Europe. The server has a single Intel Xeon E3-1271V3 processor (4 cores, 8 hyperthreads running at 3.6–4.0GHz), 32GB of memory and 2x 2TB hard drives (no SSD), so it’s nothing special really. It costs about 32 euros per month.

The speed of ScalaFiddle is not a result of throwing money at fast hardware, but of identifying the most typical use cases and smartly optimizing specifically for those.

Conclusion

ScalaFiddle provides a great way to experiment with Scala code and common libraries. If you choose to embed some fiddles into your library documentation, blog posts or tweets, you can be sure that they will perform very well no matter how many thousands of people access the content. ScalaFiddle is web-scale 😏
