AI for Engineering Teams. Measuring GitHub Copilot Impact

Published in Akvelon · 5 min read · Nov 15, 2023

Kudos to Oleksii Smirnov, who prepared this post describing our approach to measuring GitHub Copilot's impact!

Our AI journey goes on. In modern software development, GitHub Copilot is an AI tool with significant potential to boost developer productivity. Yet when a team integrates Copilot into a project, or a company rolls it out across the organization, several important questions emerge. How much does Copilot actually improve performance? How does it change the workflow, particularly software delivery speed? And how does it affect the quality of the code produced with its assistance?

To answer these questions, it’s important to establish a set of metrics and take measurements across all projects where GitHub Copilot has been adopted.

In this article, I will outline our approach to evaluating the impact of GitHub Copilot.

GitHub Copilot Usage Feedback

While performance metrics are crucial for gauging GitHub Copilot’s impact on your project, they do not capture everything. Some of Copilot’s effects are hard to quantify with metrics alone, which is why gathering personal feedback from individual developers and from the development team as a whole is equally vital.

When conducting a feedback survey on GitHub Copilot usage, consider including the following areas and questions.

GitHub Copilot Performance Metrics

It’s vital to assess the performance impact of implementing GitHub Copilot in a project. This evaluation helps determine any enhancements in the team’s efficiency and productivity. Various methods exist for measuring this impact, each offering different benefits and constraints.

1. Team Velocity Metrics

Examples: Story Points, Hours, Number of Completed Tasks/Bugs within a specific time interval.
Method: Compare these metrics over a defined period with and without Copilot usage.
Considerations: While this approach is relatively easy to implement, it may not provide definitive results. Other factors, such as task complexity, team composition, and unforeseen issues, can also influence performance, making it challenging to isolate Copilot’s impact.
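
As a minimal sketch of the comparison described above, the snippet below computes the percent change in mean velocity between a period without Copilot and a period with it. The function name and the story-point figures are hypothetical and purely illustrative, not measurements from our projects.

```python
from statistics import mean


def velocity_change_percent(before, after):
    """Return the percent change in mean velocity from `before` to `after`."""
    if not before or not after:
        raise ValueError("both periods need at least one sprint")
    baseline = mean(before)
    if baseline == 0:
        raise ValueError("baseline velocity is zero; percent change is undefined")
    return (mean(after) - baseline) / baseline * 100


# Hypothetical story points completed per sprint (illustrative numbers only).
sprints_without_copilot = [30, 34, 32]
sprints_with_copilot = [36, 38, 34]

change = velocity_change_percent(sprints_without_copilot, sprints_with_copilot)
print(f"Mean velocity change: {change:+.1f}%")
```

A single percentage like this says nothing about why velocity changed, so it should be read alongside the other factors listed in the considerations above.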

2. Comparison with Two Different Teams

Method: Set up two separate teams and assign them to a small project. One team uses Copilot, while the other does not.
Outcome: This approach can yield clearer results on Copilot’s performance impact. However, it’s important to note that these results may be somewhat “synthetic” and may not perfectly replicate the impact observed in a real-world project scenario.
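
One possible way to compare the two teams is to normalize effort by output, for example hours spent per completed task. The sketch below assumes both teams worked on the same small project; the `TeamResult` type, the team names, and all numbers are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class TeamResult:
    name: str
    uses_copilot: bool
    person_hours: float
    tasks_completed: int

    @property
    def hours_per_task(self) -> float:
        if self.tasks_completed == 0:
            raise ValueError(f"{self.name} completed no tasks")
        return self.person_hours / self.tasks_completed


def relative_effort_saving(control: TeamResult, pilot: TeamResult) -> float:
    """Percent reduction in hours per task for the pilot team vs. the control team."""
    return (control.hours_per_task - pilot.hours_per_task) / control.hours_per_task * 100


# Hypothetical results for the same small project (illustrative numbers only).
control = TeamResult("Team A", uses_copilot=False, person_hours=200, tasks_completed=20)
pilot = TeamResult("Team B", uses_copilot=True, person_hours=160, tasks_completed=20)

saving = relative_effort_saving(control, pilot)
print(f"{pilot.name} spent {saving:.1f}% fewer hours per task than {control.name}")
```

Differences in team experience can easily outweigh the tool's effect, so a result like this is only as meaningful as the two teams are comparable.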

3. Focus on the Most Repeatable Tasks

Method: Identify the most frequently recurring tasks on the project and ask developers to perform a set number of them both with and without Copilot.
Outcome: While this approach may not provide an overall performance impact in quantitative terms, it offers specific and tangible data on Copilot’s influence on the most common tasks in the project.
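
The per-task measurement can be sketched as follows: each repetition of a task yields a pair of timings, one without Copilot and one with it, and the summary reports the average time saved per task type. The task names and timings here are hypothetical placeholders; substitute the recurring tasks from your own project.

```python
from statistics import mean

# Hypothetical timings in minutes, one (without Copilot, with Copilot) pair
# per repetition of the task. Illustrative numbers only.
timings = {
    "Write unit tests": [(60, 40), (50, 35)],
    "Add REST endpoint": [(90, 80), (110, 100)],
}


def summarize(timings):
    """Return (task, mean minutes without, mean minutes with, percent saved) rows."""
    rows = []
    for task, pairs in timings.items():
        without = mean(pair[0] for pair in pairs)
        with_copilot = mean(pair[1] for pair in pairs)
        saved = (without - with_copilot) / without * 100
        rows.append((task, without, with_copilot, saved))
    return rows


report = summarize(timings)
for task, without, with_copilot, saved in report:
    print(f"{task}: {without:.1f} min -> {with_copilot:.1f} min ({saved:.1f}% saved)")
```

Keeping the raw pairs rather than only the averages makes it possible to revisit the data later, for instance to check whether the saving holds across different developers.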

You can refer to the following table as an example of work types you might measure in your project.

Measurement results can later be recorded in the following report for review and comparison.

Beyond the table mentioned earlier, you could gather additional general metrics to more comprehensively illustrate Copilot’s performance impact on your project. However, it’s crucial to allocate extra time to consider other external factors that could influence your team’s overall performance, separate from Copilot’s effects.

Example Copilot Performance Report

This is a real example of a Copilot Performance Report from one of our projects, showcasing the tangible impact and invaluable insights derived from Copilot’s involvement in our workflow:

Full PDF version of the report: https://drive.google.com/file/d/1Vh_Rfwdpa3cdiKpB-cOSYm-2h6MP_kBr/view?usp=sharing

Conclusion

The integration of GitHub Copilot into our projects has marked a significant milestone in our development journey. The comprehensive approach of gathering both feedback and performance metrics has provided us with invaluable insights into the tool’s impact.

Through rigorous measurement, we’ve been able to quantify the improvements in our development process. Metrics such as team velocity, bug density, and code review time have shown tangible advancements, demonstrating Copilot’s positive influence on our productivity and code quality.

The qualitative feedback from our development team has been equally vital. Their firsthand experiences have shed light on Copilot’s effectiveness in specific tasks, highlighting areas where it excels and identifying potential areas for further optimization.

This combined approach has not only bolstered individual productivity but has also fostered a more collaborative work environment. Team members seamlessly leverage Copilot’s capabilities to enhance workflows and deliver high-quality code.

As we move forward, this feedback-driven approach will remain a cornerstone of our development process. Continuously gathering insights and performance metrics allows us to refine our usage of Copilot and maximize its benefits across our projects. With Copilot as a trusted ally, we are well-equipped to tackle future challenges and continue to elevate the quality and efficiency of our development efforts.

Written by Sergei Grebnov