Rust & Docker in production @ Coursera
By Brennan Saeta
Building a platform for quality education at scale is much more challenging than it initially appears. One of the most sophisticated components of Coursera’s learning platform is our programming assignments infrastructure. We efficiently, reliably, and securely grade assignment submissions inside hardened Docker containers. We offload cluster scheduling to Amazon EC2 Container Service (ECS), but orchestrating all of the moving pieces takes a number of additional programs working in concert, some of them written in Rust. To understand why we chose Rust for them, we first need to understand what happens when a submission is uploaded for grading.
After storing the submission in Amazon S3, we invoke Amazon ECS to schedule the instructor-uploaded grading container corresponding to the assignment on our cluster. Before executing the grading script within the grading container, we must wait until a co-process downloads the submission from Amazon S3 and maps it into the container’s filesystem.
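The waiting step can be sketched as a simple poll with a deadline. This is a minimal illustration, not our actual implementation: it assumes the co-process signals completion by creating a marker file, and the function name `wait_for_marker` and its parameters are hypothetical.

```rust
use std::path::Path;
use std::thread::sleep;
use std::time::{Duration, Instant};

/// Polls until `marker` exists or `timeout` elapses.
/// Returns true if the marker appeared in time, false otherwise.
/// Assumption: the downloading co-process creates `marker` once the
/// submission is fully written into the container's filesystem.
fn wait_for_marker(marker: &Path, timeout: Duration, poll_interval: Duration) -> bool {
    let deadline = Instant::now() + timeout;
    loop {
        if marker.exists() {
            return true;
        }
        if Instant::now() >= deadline {
            return false;
        }
        sleep(poll_interval);
    }
}
```

A caller would invoke this with a marker path agreed upon with the co-process and refuse to start grading if it returns false. Polling uses only the standard library; an event-based mechanism such as inotify would avoid the polling delay at the cost of more code.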
To do this, we override the container image’s ENTRYPOINT with our own custom binary, which invokes the original entrypoint once the submission is available. This binary also performs a number of low-level security functions. Because it executes within the (untrusted, arbitrary) grading container, it may depend only on the kernel’s Application Binary Interface (ABI). We therefore cannot use a high-level language such as Scala or Python, as we cannot assume that a JVM or Python interpreter is available and free of vulnerabilities. For more background on some attacks and defenses, see http://betacs.pro/blog/2016/07/07/docker-and-rust/.
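The hand-off to the original entrypoint can be sketched as follows. This is an illustration under stated assumptions rather than our production code: it assumes the original entrypoint and its arguments are passed to the wrapper on its own command line, and the helper names `split_entrypoint` and `exec_original` are hypothetical. The security functions mentioned above are omitted.

```rust
use std::ffi::OsString;
use std::io;
use std::os::unix::process::CommandExt;
use std::process::Command;

/// Splits the wrapper's argv into the original entrypoint and its arguments.
/// Assumption: argv looks like `wrapper <original-entrypoint> [args...]`.
/// Returns None if no entrypoint was supplied.
fn split_entrypoint(
    mut args: impl Iterator<Item = OsString>,
) -> Option<(OsString, Vec<OsString>)> {
    args.next()?; // skip argv[0], the wrapper itself
    let program = args.next()?;
    Some((program, args.collect()))
}

/// Replaces the current process image with the original entrypoint.
/// `exec` only returns if the replacement failed, so the return value
/// is always the error describing why.
fn exec_original(program: OsString, args: Vec<OsString>) -> io::Error {
    Command::new(program).args(args).exec()
}
```

In a `main` function, the wrapper would call `split_entrypoint(std::env::args_os())`, wait for the submission, and then call `exec_original`. Using `exec` rather than spawning a child means the grading script keeps the wrapper's process ID and receives signals directly.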
Although C is the default low-level, full-control programming language, these binaries have strict security and correctness requirements, so we chose Rust instead, a modern native language from Mozilla. One of Rust’s common selling points is immunity to certain classes of security vulnerabilities thanks to its powerful type system, which makes it an excellent choice for security-critical functions.
In addition to trivial interop with C libraries and APIs, Mozilla has invested in the whole ecosystem. Cargo, the build tool, makes it very easy to consume open source libraries, as well as to build, test, and release binaries. By default, Rust binaries are dynamically linked against the platform’s glibc, which we cannot assume the grading container provides. Fortunately, rustup.rs makes it trivial to cross-compile to the x86_64-unknown-linux-musl target triple, which statically links the resulting programs.
Together, the powerful type system, Cargo, and rustup.rs make the Rust ecosystem one of the fastest and most maintainable ways to build the utilities we needed. The JVM and the Java/Scala platform remain one of the best ways to write applications, but there are some contexts where they are inappropriate. Although Scala remains our exclusive language for online serving, Rust is here to stay, powering small but critical components of our programming assignments infrastructure.
Originally published at building.coursera.org on July 7, 2016.