Software Development Metrics

Part of The PIRATE Way — Stories about scaling up engineering teams.

Ivan Peralta
The PIRATE Way

--

Background

My exploration into engineering metrics began when Isra (Israel Saeta) introduced pull reminders (Pull Panda) at TravelPerk back in 2019. Although Pull Panda ceased its independent service following its acquisition by GitHub, it left an indelible mark: beyond reminding teams about open PRs, it pioneered a set of engineering metrics with behavioral analytics and rankings.

Throughout this article and future ones, it’s worth noting that Abi Noda, Pull Panda’s CEO and founder, now at DX, is a significant figure in the realm of developer productivity, and following his insights could be invaluable.

While we were enthusiastic about these metrics, our approach was prudent. We set a foundational guideline:

Engineering metrics should never be the sole measure of team or individual performance. They are tools to stimulate discussion and highlight potential areas of focus, but they must be used judiciously.

We strived to avoid the over-justification effect, emphasizing that our primary aim was learning from these metrics, refining or discarding them as necessary based on observed behaviors.

We gained valuable insights into direct contribution patterns and collaboration by analyzing PR history from GitHub’s API. We then made this data transparent to all team members.
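As a sketch of how such data can be gathered: the GitHub REST API lists a repository’s pull requests, from which the fields relevant to lifecycle metrics can be kept. The owner, repository, and token below are placeholders, and note that `additions`/`deletions` are only returned by the single-PR endpoint, not the list endpoint.

```python
import json
from urllib.request import Request, urlopen

def extract_pr_record(pr: dict) -> dict:
    """Keep only the fields used for lifecycle metrics."""
    return {
        "number": pr["number"],
        "author": pr["user"]["login"],
        "created_at": pr["created_at"],
        "merged_at": pr.get("merged_at"),  # None if not merged
        # additions/deletions are only present on the single-PR endpoint
        "lines_changed": pr.get("additions", 0) + pr.get("deletions", 0),
    }

def fetch_prs(owner: str, repo: str, token: str) -> list[dict]:
    """Fetch one page of PRs (open and closed) for a repository."""
    url = f"https://api.github.com/repos/{owner}/{repo}/pulls?state=all&per_page=100"
    req = Request(url, headers={"Authorization": f"Bearer {token}"})
    with urlopen(req) as resp:
        return [extract_pr_record(pr) for pr in json.load(resp)]
```

Pagination and rate limits are omitted for brevity; a real collector would follow the `Link` response header across pages.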

The Lifecycle of a Pull Request

A Pull Request (PR) or Merge Request (MR) signifies more than just a code change in software development. It represents a journey that starts with a developer’s task and ends with the integration of new facets into the existing codebase. This journey, with its multiple phases — from initial proposal and code reviews to iterations and eventual merging — offers unique opportunities to extract metrics that can provide deep insights.

These metrics, which emerge at different stages of a PR’s lifecycle, give a holistic view of the development process’s efficiency, collaboration dynamics, individual inputs, and potential pain points. They become invaluable tools that empower stakeholders to derive actionable strategies for enhanced productivity and collaboration.

Below is an illustrative representation of the PR (or MR) lifecycle.

Pull Request Lifecycle Overview

The subsequent sections categorize and detail the various metrics that can be evaluated.

Engagement Metrics

Time to response

Measuring the gap between a PR’s readiness for review and the receipt of initial feedback can provide insights into team dynamics and potential areas of improved collaboration.

Ownership Metrics

Time to merge

Once approved, a PR should be merged promptly, albeit with some exceptions, like avoiding merges just before the weekend or at the end of the workday. I always advise against deploying something if you don’t plan to be monitoring and available to intervene for the next two to three hours. A typical benchmark is a 24-hour window on workdays.

Collaboration Metrics

Delivered reviews

Tracking the number of PR review comments and change requests can illuminate an individual’s engagement level in the Software Development Lifecycle (SDLC).

PR Size

Promoting concise PRs can streamline reviews, reduce merge conflicts, and expedite integration.
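One way to operationalize this is to bucket PRs by total lines changed; the thresholds below are illustrative, not a standard:

```python
def pr_size(additions: int, deletions: int) -> str:
    """Bucket a PR by total lines changed (thresholds are illustrative)."""
    total = additions + deletions
    if total <= 100:
        return "small"
    if total <= 500:
        return "medium"
    return "large"
```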

Maturity Metrics

Time to approval

Pushing for rapid approvals can jeopardize code quality, so I wouldn’t advise setting an SLA around this metric. However, consistently extended approval times (higher than the team’s usual standards) might signal issues with the PR’s quality or potential internal team disagreements.

PR comments and change requests received

A high frequency of comments or change requests received (again, higher than the team’s usual standards) can indicate either a PR’s immaturity or a divergence in team members’ perspectives.

Throughput Metrics

Quantity-driven metrics like volume of PRs or Lines of Code (LOC) are often superficial. Their true value lies in evaluating team dynamics, patterns, or architectural bottlenecks. Focusing solely on PR numbers or LOC doesn’t yield genuine throughput insights without understanding task complexity.

Impact Metrics

For aspiring IC leaders, metrics highlighting their contributions across different repositories, technologies, or PR reviews from other teams can be enlightening. Active participation in PR reviews, especially outside one’s immediate team, can spotlight individuals emerging as technical references.

Bad-practices Metrics

Self-approved PRs

Continual monitoring can ensure no misuse of administrative rights.
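A simple check, assuming you can list a merged PR’s author and approvers (names here are hypothetical): a PR counts as self-approved when it was merged with no approval from anyone other than its author.

```python
def is_self_approved(author: str, approvers: list[str]) -> bool:
    """True if a merged PR had no approval from anyone but its author.

    An empty approver list also counts: the PR was merged with no
    review at all, which is the same misuse of merge rights.
    """
    return set(approvers) <= {author}
```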

Big PRs

Consistently bulky PRs call for discussions to unearth the reasons.

Working outside working hours

As important as it is to provide flexibility, especially when your team spans multiple time zones, it is crucial to uphold work-life balance.
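A naive flag for out-of-hours activity might look like the sketch below; the working window is illustrative, and a real check would need each author’s local time zone.

```python
from datetime import datetime

def out_of_hours(ts: datetime, start_hour: int = 9, end_hour: int = 19) -> bool:
    """True if activity falls on a weekend or outside the working window."""
    is_weekend = ts.weekday() >= 5  # 5 = Saturday, 6 = Sunday
    return is_weekend or not (start_hour <= ts.hour < end_hour)
```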

Metrics Segmentations

You can derive nuanced insights by segmenting metrics based on factors like Team, Team Level, Discipline, Role, or Tenure within the company/team.
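With per-PR records in hand, segmentation is a small group-by; the sketch below segments by any field (e.g. a `team` key) and uses the median to damp outliers. Field names are hypothetical.

```python
from collections import defaultdict
from statistics import median

def segment_metric(rows: list[dict], key: str, value: str) -> dict:
    """Median of `value` per `key` segment (e.g. hours-to-review per team)."""
    buckets: dict = defaultdict(list)
    for row in rows:
        buckets[row[key]].append(row[value])
    return {segment: median(values) for segment, values in buckets.items()}
```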

Remember, while these metrics are instrumental, they’re best seen as conversation starters. The intricate nuances of software development often elude strict quantification, emphasizing human judgment’s pivotal role.


Findings

Software Development Lifecycle (SDLC) SLA

From the plethora of metrics available, we chose to establish Service Level Agreements (SLAs) around:

  • Time to Review: % of PRs reviewed within 24 hours
  • Time to Merge: % of approved PRs merged within 24 hours
  • PR Size: % of PRs with fewer than 500 lines of code
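The three SLAs can be computed from per-PR timestamps. The record layout below is an assumption rather than the format of any particular tool, and the 24-hour window ignores the workday nuance for brevity:

```python
from datetime import datetime, timedelta

def pct(flags: list[bool]) -> float:
    """Share of True values, as a percentage (0.0 for an empty list)."""
    return round(100 * sum(flags) / len(flags), 1) if flags else 0.0

def sla_report(prs: list[dict], sla: timedelta = timedelta(hours=24),
               max_lines: int = 500) -> dict:
    """Each PR is a dict with datetime fields ready_at, first_review_at,
    approved_at, merged_at (None when absent) and an int lines_changed."""
    reviewed = [pr["first_review_at"] - pr["ready_at"] <= sla
                for pr in prs if pr["first_review_at"]]
    merged = [pr["merged_at"] - pr["approved_at"] <= sla
              for pr in prs if pr["merged_at"] and pr["approved_at"]]
    small = [pr["lines_changed"] < max_lines for pr in prs]
    return {"time_to_review_pct": pct(reviewed),
            "time_to_merge_pct": pct(merged),
            "small_pr_pct": pct(small)}
```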

Introducing SLAs around these metrics spurred the desired behaviors and allowed us to pinpoint teams requiring specific support, be it resource allocation, domain expertise, or other factors.

Spotting Architectural Limitations

Metrics, when assessed with care, can spotlight architectural challenges. As highlighted in Clean Architecture (by Robert C. Martin), an optimized architecture minimizes cognitive overhead, enabling smoother contributions.

Tools for Growth and Development

Metrics can cater to different roles:

  • Engineering Managers can leverage them for career discussions, supporting newcomers, and spotting potential burnout signs.
  • Team Leads can harness these metrics to foster team alignment, ensure SLA adherence, enhance collaboration, and incorporate them into retrospective dialogues.
  • Individual Contributors can use metrics for introspection, identifying growth opportunities, and framing discussions about their input.

Conclusion

Judiciously used metrics can foster positive behavioral shifts in software development. However, it’s vital to recognize that they offer a mere snapshot of the intricate interplay of team dynamics, individual contributions, and project challenges. They should serve as a foundation for reflection, growth, and evolution, with human insight always taking precedence over raw data.

Remember: This is a blog post from the series “The PIRATE Way”.

--

CTO | Engineering Leader transforming ready-to-grow businesses into scalable organizations. For more information please visit https://iperalta.com/