I’m Tiffany, a Product Engineer at Mixpanel, and I have been an engineer here for a year and a half. I’ve been tracking various time and Github data for fun during my time here, and I thought I’d share my results to start a conversation and be completely transparent about how engineers spend their time and how some aspects of their code change over time. For the methodology and other graphs, view the original version of this piece.
Along the way, I wanted to ensure I was on the right track by logging my progress. It’s not an easy thing to quantify, but it’s something I think is necessary to concretely see improvement over time. I could gather time spent data and statistics from Github pull requests (PRs), and with that hopefully discover a personal “north star” metric that I would aim for every day towards engineering betterment.
Disclaimer: this is a purely data-driven perspective on progress. Aside from code quality and process, there are many factors that contribute to the success of an engineer, such as the way she solves problems, works with other engineers and project stakeholders, and takes feedback, and I wish I knew how to track or quantify those aspects. Not letting perfect be the enemy of good, I chose to track time spent and Github statistics because they are quantitative and not something that has been documented before.
I understand that these are surface level metrics, but I hope the data on my own progress sparks a deeper discussion on what a “better” engineer means and what sort of data we can use to prove that and make it mainstream.
To become a better engineer
There are many facets to progress and being a better engineer, and at Mixpanel, we highlight technical ability, technical leadership, and teamwork, which are integrated into our leveling guide. For the sake of quantification (and since I’m still early in my career), I focus this analysis loosely on technical ability.
In terms of technical ability and code quality, the goal is to get an LGTM (“looks good to me”) on a significant PR and an approval to ship from a colleague, or even better, a senior engineer. This single phrase shows that they approve of your process and code.
In an attempt to quantify my personal progress towards this goal, we’ll look at four of these points in the data I collected:
- Number of comments: How many comments do I get on average?
- Number of PRs and their cadence: How often do I deploy code to our codebase?
- Pre-review and post-review time: “Pre-review” is the time I spend on initial research and coding before asking others to review, while “post-review” is the time I spend incorporating comments and feedback from reviewers before shipping code to production. Deploy time is not added to the post-review time unless I needed to revise something that came up during staging. How much time do I spend in these areas?
- Unique lines of code: How much code do I contribute? (This is by no means the way we evaluate or measure engineers at Mixpanel, but it was a piece of easy-to-get quantitative data that might be interesting to look at.)
- Average working time: How long do I spend coding? (You can check out the results in the extended version of this piece.)
Ultimately, I’m looking to determine whether I can use correlations between my time spent at Mixpanel and these statistics to increase my chances of getting an LGTM more quickly.
All the information displayed below comes from pull requests, and the time spent is recorded in Google Calendar. I then piped the data into Mixpanel. You can read more about my method in College Productivity Analysis or in the extended version of this piece, which uses slightly different data.
What I found
In this section, I pin a graph with a finding. The following data comes from 275 workdays (14 months), 182 PRs, 8,305 lines of code added, and 1,271 comments. For those who care about the 10,000 hour rule, up to this point I have 3,961 hours of engineering practice.
Please note that while I took basic statistics in high school and in college, I do not have a background in data science. Thus, my analysis is limited to the lens of linear correlations and timescales, as well as some handy graphs Mixpanel has provided. For another lens to view the data (there are correlations, more graphs, and time groupings with vacation filtered out), check out the extended version of this piece.
For each graph, the three speech bubbles above the dates (annotations) each represent the start of a new project. The legend displays what each line represents from January 1, 2018 to March 31, 2019. Note that no data is filtered out, so some of the dips in the data are from PTO or vacation time.
Max vs. Average Number of Comments
Here, we see the average and maximum number of comments per PR by month decrease over time. In general, each new project brings an increase in comments, followed by a decrease.
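For readers who want to build a similar view outside Mixpanel, here is a minimal Python sketch of the grouping behind this chart. The charts in this piece were made in Mixpanel, so this is only an illustration: the sample data and the function name `comments_by_month` are made up, and it assumes each PR is represented as a (month, comment count) pair.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical sample: (month the PR was opened, number of review comments).
prs = [
    ("2018-01", 12),
    ("2018-01", 4),
    ("2018-02", 9),
    ("2018-02", 3),
    ("2018-02", 6),
]

def comments_by_month(prs):
    """Return {month: (average comments per PR, max comments on one PR)}."""
    buckets = defaultdict(list)
    for month, comment_count in prs:
        buckets[month].append(comment_count)
    return {
        month: (mean(counts), max(counts))
        for month, counts in sorted(buckets.items())
    }

for month, (avg, most) in comments_by_month(prs).items():
    print(f"{month}: avg={avg:.1f} max={most}")
```

Plotting both values side by side is useful because a single heavily discussed PR can raise the maximum without moving the average much.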
Total PRs Started, Reviewed, Deployed, and Closed
Here, we see the total number of PRs started, reviewed, deployed, and closed (not deployed) by month. Over time, more PRs are pushed through, and they are going through faster. Luckily, there aren’t too many PRs that were closed without being pushed to production. As a general trend, it looks like only a few PRs were pushed at the beginning of a project before the frequency increased.
Pre-review vs. Post-review ratio
As a note, sometimes pre-review time also includes time spent redoing a PR when a reviewer lets me know that I either created the wrong feature or have to change the overall approach to account for an extra detail. In this way, the wording of this section is a little misleading. My primary goal for this ratio is to determine how much time it takes to refactor or address the comments of others on coding style or existing behavior, and if I started on the wrong foot or did not include something, I don’t feel that time belongs in that measure. However, I do admit that moving some PRs’ post-review time into the pre-review time fudges the numbers a bit, and thus I advise taking this graph with a grain of salt.
Here, we see the “pre-review” time as a percent of the total time spent on the PR. In general, this number increases over time. There’s a slight dip after the beginning of each project, but then it increases again.
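As a rough sketch of the ratio described above, assuming the total time on a PR is simply pre-review hours plus post-review hours (the function name and sample numbers are hypothetical):

```python
def pre_review_share(pre_review_hours, post_review_hours):
    """Fraction of a PR's tracked time spent before asking for review."""
    total = pre_review_hours + post_review_hours
    if total == 0:
        return None  # no time was tracked for this PR
    return pre_review_hours / total

# Hypothetical PR: 6 hours before requesting review, 2 hours addressing feedback.
print(f"{pre_review_share(6, 2):.0%}")
```

A share close to 100% means almost no time was needed after review, which is the direction an LGTM on the first pass would push this number.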
Average pre-review, post-review, and total time spent per PR
Here, we see a graph of the median time by week, on a rolling 5-week window, for pre-review, post-review, and total time spent. Aside from the deep dips in data associated with vacations, in general more time is spent on pre-review than post-review, and both numbers are gradually decreasing over time. There’s a drastic jump in time spent after each project, but it generally declines from there.
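The rolling median can be approximated with a few lines of Python. This sketch assumes a trailing window (each week plus up to four preceding weeks); how Mixpanel defines its rolling window may differ, and the weekly values below are invented for illustration.

```python
from statistics import median

def rolling_median(weekly_values, window=5):
    """Median of each week together with up to window-1 preceding weeks."""
    result = []
    for i in range(len(weekly_values)):
        start = max(0, i - window + 1)
        result.append(median(weekly_values[start:i + 1]))
    return result

# Hypothetical hours per PR, one value per week; the 0 is a vacation week.
weekly_hours = [10, 8, 9, 0, 7, 6, 12, 5]
print(rolling_median(weekly_hours))
```

A median is less sensitive than a mean to one unusually long PR, but a zero-hour vacation week still sits inside the window for five weeks, which is why unfiltered time off shows up as dips.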
Total time comparing first and second projects
Here is a previous time comparison of the total hours spent per PR between the first project (the dotted line) and second project (the solid line). The first project started off slow before hitting a peak, then decreased. The second project was much more consistent over time.
Average lines of code per PR
Note: lines of code are not the way we measure or evaluate our engineers, nor are they by any means a good measure of progress, but it is fun to look at and see changes over time.
Here, we see a graph of the average number of unique lines of code added per PR. On average, the number of lines decreases, and in general the start of a project leads to an increase in lines of code before a decrease. At Mixpanel, we encourage engineers to create smaller PRs and ship with higher cadence, hence the gradual increase in PRs above and a gradual decrease in lines of code. This way, we can test and deploy within 30 minutes. Having a consistent stream of small wins that make things better for customers is a great motivator.
Total time spent per total unique lines of code
Here, we see a graph of the total number of hours per month per PR divided by the total unique lines of code added per month per PR over time. It looks like in general this ratio is decreasing, which suggests that less time is spent per line of code over time. This could suggest that I have become more efficient by gaining deeper context on the project, but it could also suggest that I’ve added more tests, documentation, or fixtures, which increase the lines of code without necessarily requiring more time for review.
Total comments / lines of code and Total hours / lines of code
While Mixpanel doesn’t have scatterplots, here’s a close approximation of the correlation between comments and total hours, comparing the ratio of comments to unique lines of code with the ratio of total hours to lines of code by month. Both ratios are decreasing over time, which suggests fewer comments and hours over time. But this also suggests that the number of total hours is tightly coupled with comments, which makes sense given my inclusion of the post-review time, which is essentially driven by the comments I get per PR.
Did I improve? Probably. It’s clear that there are fewer comments, more PRs deployed, and a higher ratio of pre-review to post-review in general and within projects, which does appear to be closer to the LGTM state. In addition, the difficulty of tasks most likely has increased given my level increase. There were some fluctuations in accordance with the first half of a project, which suggests…
Starting a project will at first look less productive than usual. When I start a project, within that month I average more comments per PR, less time spent, a lower pre-review ratio, and less code added to production. Perhaps this is because I’m researching or still figuring things out, which leads to PRs that have more comments and more non-PR work. However, over time when I’m in the zone or know what’s going on, these numbers all trend towards the positive.
“Simple PRs” don’t necessarily mean “small PRs”. When we look at the raw numbers rather than the monthly averages above, the correlations between the number of comments, lines of code, and hours spent per PR are weak to moderate at best, which suggests that it’s hard to define quantitatively what it means for a PR to be “simple”. However, there’s a higher chance of simplicity if we break PRs up into smaller chunks, which allows reviewers to understand a larger whole piece by piece and catch things that might otherwise have been missed on a larger PR. This is a common practice at Mixpanel and something that we encourage all of our engineers to do.
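For anyone who wants to check linear correlations on their own per-PR numbers, here is a minimal sketch of the Pearson correlation coefficient in plain Python. It is an illustration, not the exact calculation behind this piece, and the sample values are hypothetical rather than taken from the real data set.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / sqrt(var_x * var_y)

# Hypothetical per-PR values, not the real data set.
lines_added = [20, 150, 45, 300, 80, 10]
comments = [5, 7, 2, 9, 12, 1]
print(f"r = {pearson(lines_added, comments):.2f}")
```

Values near 1 or -1 indicate a strong linear relationship, while values near 0 indicate a weak one. A weak coefficient between lines added and comments would be consistent with the point that a small PR is not automatically a simple one.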
I’ve learned a lot about software engineering over the past year, and there’s always more to learn. I definitely have a long way to go before I can create a significant PR that gets an LGTM, but for now I’m pleased with the fact that there is progress and the numbers back it up.
One of my goals for this project was to find a personal “north star metric”, but even after all this data I wasn’t able to find a single one, as there are so many factors that contribute to progress. For now, I will continue to track these numbers and distill them in a way that can hopefully lead to more insights. In addition, I plan to track time estimates and difficulty, and to break down my original pre-review statistic so that I can have more accurate data (more information about this in my original post). I also hope to leverage Github statistics to draw useful insights across all of our engineers.
In the meantime, my action items to increase PR quality will be to:
- Continue to break up larger features into smaller PRs to decrease the complexity of the overall whole
- Ask for feedback early and often to ensure that the methodology is correct
Like reading stuff like this? Check out another piece by technical lead manager Aniruddha who analyzed our engineering Github commit history. Feel free to leave a comment or connect on LinkedIn if you have any questions or comments or advice on what to track. We’re always hiring engineers and support engineers (if you aren’t ready for engineering quite yet)! I hope this article has inspired someone out there to track their data and draw more conclusions that can help the engineering community.
Originally published at https://engineering.mixpanel.com on July 10, 2019.