GSoC 2017: Introduction

I’ve been selected for Google Summer of Code 2017 to work with the Performance Co-Pilot (PCP) organisation. My project involves building from scratch a Memory Mapped Values (MMV) instrumentation library for Rust. I will be mentored by Josh Stone, Suyash Verma, and Ryan Doyle. This blog will document my progress each week for the next four months.

What is MMV instrumentation?

An important step in instrumenting a process is reporting insightful metrics about its execution. These metrics (from potentially several processes) can then be collected, archived, and analysed by a separate process (usually a daemon). PCP provides a whole suite of tools to do just that.

If we were designing a method for reporting metrics to a daemon, we might initially consider a message-passing form of inter-process communication, such as pipes or Unix domain sockets, to pass data between the two processes. However, that would involve a lot of overhead: every call to transfer data means a system call, and therefore context switches between user space and kernel space. That would negatively impact the performance of the process being instrumented, and might mean certain metrics are already stale by the time they are reported.

The method PCP uses to report metrics involves much less overhead. It requires the instrumented process to share some of its virtual memory address space with the daemon process using a memory-mapped file. This shared address space contains the relevant metrics stored in a structured binary format known as MMV, whose formal spec can be found here. When the instrumented process wishes to update some of its metrics, it simply writes that data to the shared address space, and the daemon process reads from it whenever it wishes. No explicit communication or synchronisation is involved. The actual “reporting” literally just comes down to a bunch of MOV instructions in assembly.
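To make the shared-memory idea above concrete, here is a minimal sketch in Rust. It is not the MMV format and not the API of the library I'll be building: the `MappedCounter` type, the `read_counter` helper, and the single 8-byte layout are all hypothetical, invented for illustration. It also assumes 64-bit Linux and Rust 1.82 or newer (for `unsafe extern`), and binds `mmap`/`munmap` by hand so that it needs nothing beyond the standard library.

```rust
use std::fs::{File, OpenOptions};
use std::io::{self, Read};
use std::os::raw::{c_int, c_void};
use std::os::unix::io::AsRawFd;
use std::path::Path;
use std::ptr;

// Hand-written bindings to the C library's mmap/munmap.
// Assumption: 64-bit Linux, where off_t is a 64-bit integer.
unsafe extern "C" {
    fn mmap(addr: *mut c_void, len: usize, prot: c_int, flags: c_int,
            fd: c_int, offset: i64) -> *mut c_void;
    fn munmap(addr: *mut c_void, len: usize) -> c_int;
}

// Linux values of the relevant constants.
const PROT_READ: c_int = 1;
const PROT_WRITE: c_int = 2;
const MAP_SHARED: c_int = 1;

// Hypothetical layout: the whole file is one native-endian u64 metric.
const LEN: usize = 8;

/// A single metric value living in a memory-mapped file (illustrative only).
pub struct MappedCounter {
    ptr: *mut u64,
}

impl MappedCounter {
    /// Create (or truncate) the file at `path` and map it into our address space.
    pub fn create(path: &Path) -> io::Result<MappedCounter> {
        let file = OpenOptions::new()
            .read(true)
            .write(true)
            .create(true)
            .truncate(true)
            .open(path)?;
        file.set_len(LEN as u64)?;
        let addr = unsafe {
            mmap(ptr::null_mut(), LEN, PROT_READ | PROT_WRITE, MAP_SHARED,
                 file.as_raw_fd(), 0)
        };
        // mmap signals failure by returning MAP_FAILED, i.e. (void *) -1.
        if addr as isize == -1 {
            return Err(io::Error::last_os_error());
        }
        // The mapping remains valid after `file` is closed at the end of this scope.
        Ok(MappedCounter { ptr: addr as *mut u64 })
    }

    /// "Report" a new value: an ordinary store into the shared mapping.
    pub fn set(&mut self, value: u64) {
        // The mapping is page-aligned, so offset 0 is suitably aligned for a u64.
        unsafe { ptr::write_volatile(self.ptr, value) }
    }
}

impl Drop for MappedCounter {
    fn drop(&mut self) {
        unsafe { munmap(self.ptr as *mut c_void, LEN); }
    }
}

/// Stand-in for the daemon side: read the current value back from the file.
/// A real reader would more likely map the file as well.
pub fn read_counter(path: &Path) -> io::Result<u64> {
    let mut raw = [0u8; LEN];
    File::open(path)?.read_exact(&mut raw)?;
    Ok(u64::from_ne_bytes(raw))
}
```

The point to notice is that `set` makes no system call at all. Only `create` talks to the kernel, once, and after that every update is a plain memory write that a reader of the same file can observe.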

Why Rust?

Currently, libraries for MMV instrumentation exist in C, Python, Java, and Go. Rust is about the same age as Go, and about as popular too. I feel it gets a lot more love simply because of what it promises and delivers: memory safety without a garbage collector, zero-cost abstractions (think functional), data-race detection at compile time, and C-like performance. Couple that with great tooling and a great, great community, and it’s probably safe to bet on its success. Plus, because it’s still new, there’s lots of opportunity to make something the “standard” in the ecosystem.

Final Note

Completing this project will benefit PCP by extending its extensive framework of performance analysis tools to the Rust ecosystem, and will give the Rust ecosystem an efficient library for instrumenting applications with production-ready tools. It’ll also provide me with a great summer filled with tons of coding and learning (and some sweet cash too :P).