Visualizing Commit Activity to Help Understand our Workload

Chris Whong
Published in NYC Planning Tech
4 min read · Dec 3, 2018

Our small team here at NYC Planning Labs has worked on a wide array of projects since our launch in 2017, leaving us with a lot of code to maintain across dozens of code repositories. We are agile and tend to focus our energies in a given week on one project (or one main effort with a lot of support and maintenance tasks on other projects).

Since all of our code is on GitHub, we have access to some key data via its API that is useful for understanding and visualizing the cadence at which we work. Specifically, I’m referring to the weekly commit logs that accompany each GitHub repository.

If you go to the “Insights” tab on any GitHub repository, you can see a simple bar chart showing a count of weekly commits for the past year.

GitHub shows a simple bar chart of weekly commits for the past year under each repo’s “Insights” tab

Dig a little deeper, and you’ll see an AJAX call for the data that powers this chart: an array of objects, one per week, each with commit counts for every day of that week, a total, and a Unix timestamp for the start of the week.
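To make that shape concrete, here is a minimal sketch of one weekly entry and a small helper that reads it. The field names `days`, `total`, and `week` follow GitHub's documented commit activity response, which lists the days starting on Sunday; the numbers and the `describeWeek` helper are made up for illustration.

```javascript
// One element of the weekly commit-activity array (illustrative values only).
const sampleWeek = {
  days: [0, 4, 7, 2, 9, 3, 0], // commits per day, Sunday first
  total: 25,                   // sum of `days`
  week: 1543708800,            // Unix timestamp in seconds for the start of the week
};

// Hypothetical helper: turn an entry into something easier to read.
function describeWeek(entry) {
  // The timestamp is in seconds; JavaScript dates expect milliseconds.
  const weekStart = new Date(entry.week * 1000).toISOString().slice(0, 10);
  const summed = entry.days.reduce((sum, n) => sum + n, 0);
  return {
    weekStart,
    total: entry.total,
    totalMatchesDays: summed === entry.total,
  };
}

console.log(describeWeek(sampleWeek));
```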

Over the summer, we manually grabbed this data for our “major projects” and created a stacked bar chart of our first year’s activity.

This time around we wanted to go a step further and get this data quickly (programmatically) for all of our repositories, so we can use it in reporting.

Building an API

We put together a simple Express API endpoint that 1) gets all repositories in the nycplanning organization tagged “labs”, and 2) iterates over each one and grabs the “commit-activity-stats” data from GitHub. It bundles them all together into a clean response, showing the name of each repo and its associated weekly commit counts.

We use Express.js to create a simple JSON endpoint that serves clean data about our repositories

You can see this new feed of data here: https://home-api.planninglabs.nyc/github/repo-activity. Naturally, the source code is open and you can see how we get the data from GitHub and transform it.
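The two steps described above can be sketched roughly as follows. This is not our actual source, just an illustration under assumptions: `buildRepoActivity`, `listRepoNames`, and `fetchCommitActivity` are hypothetical names, the output shape is assumed, and the two fetchers are passed in so the bundling logic can be tried without touching the network.

```javascript
// Hypothetical sketch: bundle weekly commit totals for a list of repos.
// `listRepoNames` resolves to an array of repo names (e.g. those tagged "labs").
// `fetchCommitActivity` resolves to the weekly array for one repo
// (objects with `days`, `total`, and `week`).
async function buildRepoActivity(listRepoNames, fetchCommitActivity) {
  const names = await listRepoNames();
  return Promise.all(
    names.map(async (name) => {
      const weeks = await fetchCommitActivity(name);
      return {
        name,
        // Keep only what a chart needs: the week start and the weekly total.
        weeklyCommits: weeks.map((w) => ({ week: w.week, total: w.total })),
      };
    })
  );
}

// Stand-in fetchers with made-up data, so the sketch runs offline.
const fakeList = async () => ['repo-a', 'repo-b'];
const fakeActivity = async () => [
  { days: [1, 0, 0, 0, 0, 0, 0], total: 1, week: 1543708800 },
];

buildRepoActivity(fakeList, fakeActivity).then((data) => {
  console.log(JSON.stringify(data, null, 2));
});
```

In an Express app, a route handler could call something like this and pass the result to `res.json`. A real fetcher would presumably wrap GitHub's `GET /repos/{owner}/{repo}/stats/commit_activity` endpoint; GitHub's documentation notes that its statistics endpoints can answer with a 202 while the numbers are still being computed, so a retry may be needed.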

From there, we played around with visualizing it. The first pass was a line chart that layers them all on top of each other, based on this multiline d3 example.

This first-pass visual was interesting, but hard to decipher unless you can hover over each line to highlight it.

It’s a bit of a mess, but it does get the point across that even when we are sprinting on a main effort there’s always lots of other work happening simultaneously.

We thought sparklines might be an interesting way to visualize these lines to give a very quick glimpse of a trend without all the overhead of labels and axes. Here’s what they look like in a simple 5-column layout. The y-axes are normalized to reflect the highest weekly commit count, and all x-axes represent 52 weeks:

Sparklines (simple line charts without axes) give us a quick visual of commit activity over the past year.

You can see the code used for these sparklines in this gist.
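As a rough illustration of the normalization described above (this is not the code from the gist), the sketch below turns a repo's weekly totals into an SVG path string. It assumes the y-scale is shared across all repos, so the busiest week anywhere reaches the top of every sparkline; `sparklinePath`, `sharedMax`, and the default 100 by 20 size are hypothetical choices.

```javascript
// Hypothetical sketch: build the `d` attribute of an SVG <path> for a sparkline.
// `maxCount` is the value that should touch the top of the chart.
function sparklinePath(weeklyTotals, maxCount, width = 100, height = 20) {
  const n = weeklyTotals.length;
  const xStep = n > 1 ? width / (n - 1) : 0;
  return weeklyTotals
    .map((count, i) => {
      const x = i * xStep;
      // SVG y grows downward, so larger counts get smaller y values.
      const y = maxCount > 0 ? height - (count / maxCount) * height : height;
      return `${i === 0 ? 'M' : 'L'}${x.toFixed(1)},${y.toFixed(1)}`;
    })
    .join(' ');
}

// Highest weekly total across every repo, used as the shared y-scale (assumption).
function sharedMax(repos) {
  return Math.max(0, ...repos.flatMap((r) => r.weeklyTotals));
}

// Made-up data: two repos with three weeks each.
const repos = [
  { name: 'repo-a', weeklyTotals: [0, 5, 10] },
  { name: 'repo-b', weeklyTotals: [2, 2, 1] },
];
const max = sharedMax(repos);
for (const repo of repos) {
  console.log(repo.name, sparklinePath(repo.weeklyTotals, max));
}
```

With a full year of data, each `weeklyTotals` array would hold 52 values, and each path would be dropped into its own small `<svg>` in the grid.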

We’re using these simple lines as an indicator of “time elapsed since we had large amounts of activity on a repo”, which isn’t actionable on its own, but contributes to the decision-making process for prioritizing future work.
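We read this indicator off the charts by eye, but if one wanted a number for it, a minimal sketch might look like the following. The `weeksSinceLastBusyWeek` name and the threshold for what counts as a "large amount" of activity are assumptions, not something we have settled on.

```javascript
// Hypothetical sketch: how many weeks ago did a repo last have a "busy" week?
// `weeklyTotals` is ordered oldest to newest; `threshold` is an arbitrary cutoff.
// Returns 0 if the most recent week was busy, or null if no week qualifies.
function weeksSinceLastBusyWeek(weeklyTotals, threshold) {
  for (let i = weeklyTotals.length - 1; i >= 0; i--) {
    if (weeklyTotals[i] >= threshold) {
      return weeklyTotals.length - 1 - i;
    }
  }
  return null;
}

// Made-up data: the last week with 10 or more commits was three weeks ago.
console.log(weeksSinceLastBusyWeek([0, 12, 3, 0, 0], 10));
```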

It’s also interesting to see the patterns that emerge. Some projects we’ve done work on and never touched again (zap-search), some have pretty regular and consistent amounts of small work (zola), and some appear to have no major bumps (like some of our simpler backend services that were stood up quickly and did not need a lot of changes and experimentation in their development).

This was a one-day exercise to create a slide for a reporting presentation, but we took the extra time to create an API endpoint to support our future aspiration for a proper dashboard. The vision is to show issue counts, bugs, analytics and usage trends, etc. These commit history sparklines can then be worked into a richer set of information that helps us stay on top of our ever-growing portfolio of work.

