Google Summer of Code 2016
This summer, I was selected for the prestigious Google Summer of Code program with the Performance Co-Pilot (PCP) organization. Performance Co-Pilot is an open source, system-level performance analysis and inference toolkit. It has an agent-based architecture in which different agents, called Performance Metric Domain Agents (PMDAs), are responsible for collecting performance metrics. Each agent reports the metrics it collects to a daemon called the Performance Metric Collection Daemon (PMCD). Clients that want to access metrics send a request with the appropriate parameters to a PMCD. PCP ships a set of client tools, such as pmval, pmlogger, pminfo, pmdumptext and pminfer, that help users better understand the metrics reported by the collector daemon. PCP also has an open API for building agents for any piece of software you want, and some agents ship with the toolkit itself, such as linux, influxdb, mysql and elasticsearch.
Performance Co-Pilot also supports instrumentation, i.e. user applications reporting metrics at runtime, through the Memory Mapped Values (MMV) PMDA. The PMDA monitors memory mapped files and reports the metrics it reads from them to a PMCD, so we can run PCP client tools on an instrumented application just as we would run them on a giant monolithic database application like mysql. The instrumentation API currently has an implementation in C with bindings for Perl and Python, and a separate project, Parfait, implements the Java API. My project was to implement an instrumentation library for PCP in golang.
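To make the memory mapped idea concrete, here is a minimal sketch in Go of one process writing a value through a memory mapped file and another piece of code reading it back from the file. It assumes a Unix-like system where `syscall.Mmap` is available, and the names `writeMapped`, `readValue` and `roundTrip` are made up for this illustration. The single 8-byte value is not the MMV file format, which has its own headers and tables.

```go
package main

import (
	"encoding/binary"
	"fmt"
	"os"
	"path/filepath"
	"syscall"
)

// writeMapped maps the file at path into memory and stores value as a
// little-endian uint64 at the start of the mapping. (Hypothetical helper,
// not part of any PCP library.)
func writeMapped(path string, value uint64) error {
	f, err := os.OpenFile(path, os.O_RDWR|os.O_CREATE, 0o644)
	if err != nil {
		return err
	}
	defer f.Close()

	const size = 8
	// A mapping cannot usefully extend past the end of the file,
	// so the file is sized before it is mapped.
	if err := f.Truncate(size); err != nil {
		return err
	}

	data, err := syscall.Mmap(int(f.Fd()), 0, size,
		syscall.PROT_READ|syscall.PROT_WRITE, syscall.MAP_SHARED)
	if err != nil {
		return err
	}

	// Writing to the slice writes to the shared mapping of the file.
	binary.LittleEndian.PutUint64(data, value)
	return syscall.Munmap(data)
}

// readValue plays the role of the monitoring side: it reads the file
// and decodes the value stored at the start.
func readValue(path string) (uint64, error) {
	b, err := os.ReadFile(path)
	if err != nil {
		return 0, err
	}
	if len(b) < 8 {
		return 0, fmt.Errorf("file %s is too short: %d bytes", path, len(b))
	}
	return binary.LittleEndian.Uint64(b), nil
}

// roundTrip writes value through a mapping of a temporary file and
// reads it back.
func roundTrip(value uint64) (uint64, error) {
	dir, err := os.MkdirTemp("", "mmv-sketch")
	if err != nil {
		return 0, err
	}
	defer os.RemoveAll(dir)

	path := filepath.Join(dir, "counter")
	if err := writeMapped(path, value); err != nil {
		return 0, err
	}
	return readValue(path)
}

func main() {
	v, err := roundTrip(42)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("value read back:", v)
}
```

The point of the approach is that the instrumented application only writes to memory, while whoever monitors the file picks up the values without the application having to serve requests itself.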
Go is a new and unique programming language built at Google. Its most appealing features are a tiny language spec and easy concurrency. The language has become immensely popular for its tooling and productivity-oriented design. It has recently seen widespread adoption, and a lot of popular open source projects have been implemented in it.
This post summarizes my four-month program period in the summer of 2016, spent hacking on golang and instrumentation.
Before the official program period begins, Google gives students one month to get familiar with the organization they will contribute to: its programming practices and source code, and to get any doubts cleared. PCP is a Red Hat project, and its progress is tracked on a public wekan board at http://tasks.pcp.io:3000/b/cbEf5fxGPp8BrbGYS/roadmap. At the beginning, my mentors created a separate board for me to add tickets and track my progress. We also used the #pcp-go IRC channel on freenode to talk about progress. Finally, I was expected to send weekly reviews of the progress achieved that week and what I expected to achieve in the following one. Needless to say, this post is basically a concatenation of those reviews.
- cloned the PCP source code repository and built it locally, while being constantly in touch with my mentors on the steps I was following. A simplified result is https://gist.github.com/suyash/0def9b33890d4a99ca9dd96724e1ac84
- set up the test suite to run locally, reported failures using screenshots and logs.
- cloned parfait, the Java project implementing instrumentation for PCP, and got it to build locally
- set up a parfait-examples repo to track different APIs in parfait and their usage.
- read up on instrumentation in Java, especially the instrument package and the premain method, and the corresponding implementation in parfait with parfait-agent
- also tried to find examples of similar functionality in golang, and came across expvar in the standard library with its `/debug/vars` export, as well as davecheney/gmx and prometheus.
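For reference, this is roughly what the expvar approach looks like. The sketch below publishes a counter and reads it back through the package's own HTTP handler, which is the same JSON that expvar serves at `/debug/vars` on the default mux once a server is running. The variable name `sketch_requests` and the helper `fetchVar` are made up for this example.

```go
package main

import (
	"encoding/json"
	"expvar"
	"fmt"
	"net/http/httptest"
)

// requests is a hypothetical counter. expvar.NewInt creates it and
// publishes it under the given name in one step.
var requests = expvar.NewInt("sketch_requests")

// fetchVar serves one in-memory request through expvar's handler and
// returns the named numeric variable from the JSON response.
func fetchVar(name string) (float64, error) {
	rec := httptest.NewRecorder()
	req := httptest.NewRequest("GET", "/debug/vars", nil)
	expvar.Handler().ServeHTTP(rec, req)

	var vars map[string]interface{}
	if err := json.Unmarshal(rec.Body.Bytes(), &vars); err != nil {
		return 0, err
	}

	// JSON numbers decode to float64 when the target is interface{}.
	v, ok := vars[name].(float64)
	if !ok {
		return 0, fmt.Errorf("variable %q not found or not numeric", name)
	}
	return v, nil
}

func main() {
	requests.Add(3)

	v, err := fetchVar("sketch_requests")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("sketch_requests =", v)
}
```

Unlike the memory mapped approach, this one makes the application itself answer HTTP requests for its metrics.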
The official program period began on May 23.
- set up the main repo as suyash/pcp-go, which would later go on to become performancecopilot/speed
- implemented the config reading functionality that reads variables from the local pcp installation on initialization
- I was moving this week so couldn’t work much on the project
- started work on the metrics implementation for the package
- around this time, the organization moved from IRC to slack for all communication, and a separate slack channel was created for my project
- the project was renamed to speed
- finished initial implementation of a singleton PCPMetric
- defined the InstanceDomain interface and PCPInstanceDomain type
- defined the Registry interface and the PCPRegistry type
- defined the Writer interface
- defined the PCPWriter type to write metric data to mmv files
- defined a bytebuffer type that replicated behavior similar to Java ByteBuffer for a fixed size byte array
- added a couple of basic examples on using the API
- my mentor, Nathan Scott, contributed an icon for the project
- defined MemoryMappedBuffer to write data to memory mapped files
- did a full rewrite of the metrics API to separate singleton metrics from instance metrics
- finished implementation of `AddMetricByString` to add metrics using strings like in Parfait
- added MIT License to the project
- added string type support so clients can register metrics with string values
- renamed writer to client, since all the writing was being done by MemoryMappedBuffer
- added 'Must'-style methods to the writer and metrics that panic instead of returning an error
- added vendoring for the project using govendor and added logging through sirupsen/logrus
- added automated testing using Travis CI
- added automatic coverage using coveralls
- tagged an alpha release for the library for people to try out
- implemented a go port of the mmvdump utility from PCP core to parse mmv files, so the client can be tested against the parsed output
- improved client tests using the mmvdump implementation
- around this time, PCP implemented mmv2, a new format for mmv files that allows metric and instance names longer than the previous limit of 64 characters, so I added support in speed for writing the new format, while falling back to the old one when necessary
- rewrote bytebuffer as bytewriter to support concurrent writing at multiple offsets instead of a single one
- rewrote the client to write all mmv data concurrently. The previous version was based almost entirely on the Java version and did everything serially
- implemented custom metric types on top of the core PCP Singleton and Instance Metric types, defined Counter, Gauge, Timer, CounterVector and GaugeVector
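Two of the ideas in the list above, writing at explicit offsets so that writes can run concurrently and 'Must' methods that panic instead of returning an error, can be sketched as follows. This is not speed's actual code: `byteWriter`, `writeUint32`, `mustWriteUint32` and `writeAll` are hypothetical names, and the layout of one 4-byte slot per value is an assumption made for the example.

```go
package main

import (
	"encoding/binary"
	"fmt"
	"sync"
)

// byteWriter is a hypothetical fixed-size buffer. Every write takes an
// explicit offset instead of advancing a shared cursor, so writes to
// disjoint ranges do not depend on each other.
type byteWriter struct {
	buf []byte
}

func newByteWriter(n int) *byteWriter {
	return &byteWriter{buf: make([]byte, n)}
}

// writeUint32 stores v in little-endian order at off and returns the
// offset just past the written bytes.
func (w *byteWriter) writeUint32(v uint32, off int) (int, error) {
	if off < 0 || off+4 > len(w.buf) {
		return -1, fmt.Errorf("writing 4 bytes at offset %d exceeds buffer of %d bytes", off, len(w.buf))
	}
	binary.LittleEndian.PutUint32(w.buf[off:], v)
	return off + 4, nil
}

// mustWriteUint32 is the 'Must' variant: it panics instead of returning
// the error, which is convenient when a failure means a programming bug.
func (w *byteWriter) mustWriteUint32(v uint32, off int) int {
	n, err := w.writeUint32(v, off)
	if err != nil {
		panic(err)
	}
	return n
}

// writeAll writes each value into its own 4-byte slot, one goroutine per
// value. The slots do not overlap, so no locking is needed.
func writeAll(w *byteWriter, values []uint32) error {
	var wg sync.WaitGroup
	errs := make([]error, len(values))

	for i, v := range values {
		wg.Add(1)
		go func(i int, v uint32) {
			defer wg.Done()
			_, errs[i] = w.writeUint32(v, i*4)
		}(i, v)
	}
	wg.Wait()

	// Report the first error, if any write failed.
	for _, err := range errs {
		if err != nil {
			return err
		}
	}
	return nil
}

func main() {
	w := newByteWriter(12)
	if err := writeAll(w, []uint32{1, 2, 3}); err != nil {
		fmt.Println("error:", err)
		return
	}
	next := w.mustWriteUint32(4, 8) // overwrite the last slot
	fmt.Println(w.buf, next)
}
```

With a single cursor, as in a Java-style ByteBuffer, every write has to wait for the previous one to know where to start. Passing offsets in moves that bookkeeping to the caller, which is what lets independent sections of a file be written at the same time.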
The current repository for the project is hosted at https://github.com/performancecopilot/speed. Installation and usage instructions are in the README, while there are also some basic examples in the `examples` subfolder. The current list of things being worked on can be seen by looking at the issues.