Beyond Metrics: Should You Use GitHub Copilot Metrics to Gauge Developer Productivity?

Tajinder Singh
7 min read · Jun 16, 2024


AI tools have boomed in the last few months, and more and more organizations are adopting them to help their employees with day-to-day tasks, boost productivity, and stay ahead of the competition. One of the most impactful AI tools of the past year has been GitHub Copilot. Almost every developer I know is using it and loving it. Hundreds of organizations have adopted GitHub Copilot, and the overall feedback has been positive.

GitHub has recently released a metrics API that provides aggregated usage metrics for Copilot completions and Copilot Chat in the IDE for all users. You can find out acceptance rates, the number of lines of code accepted, chat acceptance rates, and other information from this REST endpoint.

Overview of Important Metrics from GitHub Copilot Metrics

The GitHub Copilot metrics schema provides comprehensive data on how developers use Copilot. Here are the three key metrics:

  1. Acceptance Rate:

  • Total Suggestions Count (total_suggestions_count): The total number of Copilot code completion suggestions shown to users. Helps gauge overall usage and engagement with Copilot.
  • Total Acceptances Count (total_acceptances_count): The total number of Copilot code completion suggestions accepted by users. Indicates how many suggestions were deemed useful and accepted by developers.

acceptance_rate = (total_acceptances_count / total_suggestions_count) * 100

  2. Number of Lines of Code:

  • Total Lines Suggested (total_lines_suggested): The total number of lines of code completions suggested by Copilot. Provides insight into how much code Copilot is generating for users.
  • Total Lines Accepted (total_lines_accepted): The total number of lines of code completions accepted by users. Highlights the volume of Copilot-generated code that is incorporated into actual development projects. Helps in understanding the efficiency and adoption rate of AI-generated code.

lines_of_code_utilization_rate = (total_lines_accepted / total_lines_suggested) * 100

  3. Total Active Users: The total number of users who were shown Copilot code completion suggestions during the day specified. Provides an idea of Copilot’s reach and usage within the development team.
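As a rough illustration, the two formulas above can be applied to a few days of aggregated data. This is only a sketch: the records are invented, the `day` and `total_active_users` field names are assumptions about the response shape, and `percentage` and `summarize` are hypothetical helpers rather than anything GitHub provides.

```python
# Invented daily records for illustration. The count fields use the names
# listed above; "day" and "total_active_users" are assumed field names.
daily_usage = [
    {"day": "2024-06-10", "total_suggestions_count": 1000, "total_acceptances_count": 300,
     "total_lines_suggested": 1800, "total_lines_accepted": 450, "total_active_users": 12},
    {"day": "2024-06-11", "total_suggestions_count": 1000, "total_acceptances_count": 200,
     "total_lines_suggested": 2200, "total_lines_accepted": 550, "total_active_users": 15},
]


def percentage(part, whole):
    """Return part / whole as a percentage, or 0.0 when whole is zero."""
    return (part / whole) * 100 if whole else 0.0


def summarize(records):
    """Sum the daily counts first, then compute the rates over the whole period."""
    suggestions = sum(r["total_suggestions_count"] for r in records)
    acceptances = sum(r["total_acceptances_count"] for r in records)
    lines_suggested = sum(r["total_lines_suggested"] for r in records)
    lines_accepted = sum(r["total_lines_accepted"] for r in records)
    return {
        "acceptance_rate": percentage(acceptances, suggestions),
        "lines_of_code_utilization_rate": percentage(lines_accepted, lines_suggested),
        "peak_active_users": max((r["total_active_users"] for r in records), default=0),
    }


summary = summarize(daily_usage)
print(summary)
```

Summing the counts before dividing weights each day by its volume, so a quiet day with a handful of suggestions does not skew the overall rate the way averaging daily percentages would.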

The Limitations of These Metrics in Measuring Developer Productivity

Metrics like acceptance rates and lines of code can provide some insights but often fail to capture the full picture of a developer’s contribution and effectiveness. Let’s explore why these metrics fall short and why understanding developer productivity requires a deeper, more nuanced approach.

Acceptance rate:

This metric represents the ratio of accepted suggestions to the total suggestions shown by GitHub Copilot. This data point is good for understanding the tool’s usefulness, but different developers work in different ways, which can significantly skew the data.

  • Developer X likes to accept everything from Copilot and, once he has everything in the editor, he refactors. His acceptance rate will be 100%.
  • Developer Y is quite selective and doesn’t accept everything Copilot suggests. His acceptance rate will be quite low.
  • Developer Z already has a solution but still prompts Copilot out of curiosity to see what the AI suggests. He discards the suggestion, resulting in a low acceptance rate.

Clearly, these acceptance rates reflect individual working styles rather than true productivity. To measure real developer productivity, we need to look beyond acceptance rates and consider the broader context of their work and contributions.
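To make the three working styles concrete, here is a toy sketch with made-up numbers. The per-developer counts and the `tasks_done` field are purely hypothetical (the API described above returns aggregated data); the point is only that identical output can sit behind very different acceptance rates.

```python
def acceptance_rate(accepted, shown):
    """Accepted suggestions as a percentage of suggestions shown."""
    return (accepted / shown) * 100 if shown else 0.0


# Invented numbers: every developer finishes the same amount of work.
developers = {
    "X": {"style": "accepts everything, refactors later", "shown": 50, "accepted": 50, "tasks_done": 3},
    "Y": {"style": "selective", "shown": 50, "accepted": 10, "tasks_done": 3},
    "Z": {"style": "already has a solution, just curious", "shown": 50, "accepted": 2, "tasks_done": 3},
}

rates = {name: acceptance_rate(d["accepted"], d["shown"]) for name, d in developers.items()}

for name, d in developers.items():
    print(f"Developer {name} ({d['style']}): {rates[name]:.0f}% acceptance, {d['tasks_done']} tasks done")
```

Ranking these three by acceptance rate would put X far ahead of Z, even though all three delivered the same number of tasks.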

Number of lines of code:

That’s a pretty straightforward one. As the name suggests, it’s the total lines of code accepted by users. As a developer, I hate this metric. Some people still measure, or want to measure, developer productivity by the number of lines of code. Evil… pure evil.

When I first started programming, my code was verbose, filled with repetition, and lacked proper structure. This resulted in a high number of lines, but the quality and efficiency were poor.

As I gained experience, I learned to write more efficient, modular code. I embraced principles like DRY (Don’t Repeat Yourself) and SOLID, which led to fewer lines of code but much higher quality. Despite the reduction in code length, my productivity and the impact of my work increased significantly. This is because shorter, well-structured code is easier to maintain, understand, and extend.
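A small, made-up example of what that shift looks like in practice. Both functions below compute the same total (prices are in cents, and the 20% tax is an arbitrary assumption), yet the first needs roughly twice as many lines as the second.

```python
TAX_PERCENT = 20  # arbitrary rate, for illustration only


def total_verbose(book_price, food_price, toy_price):
    # Beginner style: the same tax calculation is copy-pasted for every item.
    book_tax = book_price * TAX_PERCENT // 100
    book_total = book_price + book_tax
    food_tax = food_price * TAX_PERCENT // 100
    food_total = food_price + food_tax
    toy_tax = toy_price * TAX_PERCENT // 100
    toy_total = toy_price + toy_tax
    return book_total + food_total + toy_total


def with_tax(price):
    """Price in cents including tax."""
    return price + price * TAX_PERCENT // 100


def total_dry(*prices):
    # DRY style: one helper, and it works for any number of items.
    return sum(with_tax(price) for price in prices)


print(total_verbose(1000, 500, 250))
print(total_dry(1000, 500, 250))
```

A lines-of-code metric would score the verbose version higher, even though the shorter one is easier to change and handles a fourth item without any edits.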

Moreover, different programming languages inherently produce different amounts of code for the same functionality. For instance, a Java or C# program will typically be longer than an equivalent Python program. Thus, comparing productivity across languages based on lines of code is inherently unfair and misleading.

Consider another scenario: if I’m new to Python and using a tool like GitHub Copilot, it helps me generate efficient code snippets quickly. This significantly boosts my productivity, as I spend less time searching for syntax and documentation. However, if my productivity were measured solely by lines of code, this efficiency would be overlooked.

Ultimately, lines of code are a poor metric for productivity. They fail to account for the quality, maintainability, and impact of the code. Productivity should be measured by the outcomes and value delivered, not by the sheer volume of code produced.

What These Metrics Are Good At:

Having said all that, these metrics are amazing for measuring the impact of GitHub Copilot on your organization. You should have high-level KPIs to measure GitHub Copilot’s impact and the value it brings to your organization. Also, these metrics provide the engagement rate of developers with the tool.

Last week at a conference, a speaker presented a comparison of their SonarQube quality metrics before adopting GitHub Copilot and after three months of use. They reported a clear impact on the code and a marked improvement in code quality.

Another organization did the same with GitHub Advanced Security and saw fewer vulnerabilities being flagged on pull requests. Their security posture improved with the use of Copilot.

These examples highlight that while individual productivity metrics may be misleading, the broader organizational benefits of GitHub Copilot are evident. By focusing on these high-level impacts, you can better appreciate the true value Copilot brings to your team.

Listening to Developers: The Key to Evaluating GitHub Copilot’s Impact

Developers are straightforward folks who think in binary: 0s and 1s. Want to know if GitHub Copilot is worth it? Just ask them — they’ll tell you straight up. Running a survey is a great way to see if GitHub Copilot boosts their daily work. You’ll get clear answers about its impact on their speed, happiness, and overall productivity. By listening to your developers and valuing their input, you can make smart choices about using AI coding assistants. This way, you ensure the tools you choose really make a difference in their work and keep them happy and productive.

Programming != Typing

Programming is not just about hammering out lines of code on a keyboard. It’s a thrilling, brain-bending journey into the world of problem-solving! At its heart, software development is a mix of critical thinking, boundless creativity, and seamless collaboration. This is where GitHub Copilot helps, and it is exactly what these metrics fail to capture.

GitHub Copilot’s main value is not in providing code but in helping developers make the best progress toward their goals. This can be through code suggestions, answers to technical questions, or hints toward resolving a compilation error. By enhancing the development experience and empowering developers to overcome challenges more efficiently, GitHub Copilot fosters innovation and productivity, ensuring that the focus remains on what truly matters: creating impactful software solutions.

Think About Developer Happiness

A happy developer is a productive developer. Utilizing GitHub Copilot can significantly enhance developer happiness by streamlining repetitive tasks, offering real-time code suggestions, and reducing cognitive load. When developers feel supported by these intelligent tools, they can focus more on creative and complex problem-solving, leading to higher-quality code and innovative solutions. Prioritizing developer happiness through such technologies not only boosts individual productivity but also enhances team collaboration and project success. By integrating AI coding assistants, organizations can foster a thriving, efficient, and innovative tech culture.

Summary

When was the last time you heard a developer go home and excitedly say, “What a productive day! I had a 32% acceptance rate, wrote 200 lines of code, and guess what, I had 59 chat acceptances.” You’re more likely to win the lottery while getting struck by lightning than to hear that!

On the flip side, my developer friends never miss a chance to gripe about their dreadful days: “Useless meetings,” “drowning in documentation,” “basically just copying and pasting code from one project to another,” “writing unit tests was pure torture,” and “I hate DevOps. I had to juggle three programming languages, YAML was a nightmare, and the pipelines were a disaster.”

Use these metrics to gauge the tool’s impact, engagement, and the value it brings to your organization rather than trying to measure developer productivity with them. Focus on understanding how GitHub Copilot is enhancing your team’s workflow, not on measuring their worth.

Remember, happy developers mean exceptional results. Prioritize Developer Happiness for Unmatched Productivity!

About Me

My name is Tajinder Singh, but most of my friends and colleagues call me TJ. Currently, I am working at GitHub as a Solutions Engineer based in the beautiful city of Zurich, Switzerland.

LinkedIn Profile: https://www.linkedin.com/in/tajinder-singh-74740115b/

Note: Opinions are my own and not the views of my employer.
