GitHub Copilot — Lessons
Pair programming (sometimes called pragmatic pairing) has been a default habit in most of my projects over the past 8+ years, and it has become second nature to me; not practicing it is the exception rather than the rule. When the ChatGPT revolution took the world by storm in 2022 and code-generation LLMs subsequently became abundantly available, many claimed they would obviate the need for pair programming (hell, some even predicted they would replace all programmers). There is a lot of doom and gloom in the SWE community owing to the likes of GitHub Copilot and Devin.
Microsoft (the owner of GitHub) saw the opportunity to leverage (and monetise) the hype on the world’s largest code-hosting platform. It was a no-brainer; if they hadn’t, someone else would have. The possibility immediately got me hooked. I enrolled on the trial waitlist, got access a few weeks later, and then took out a subscription too! It blew my mind initially and gave me some jaw-dropping moments, and it’s now a useful tool in my programming toolshed. But once the novelty wore off a bit, I could see some of its flaws and annoyances.
Below are some of the pros and caveats of using GitHub Copilot (or any other LLM code-generation platform):
- GitHub Copilot’s performance in writing code from scratch is underwhelming. It excels at solving LeetCode problems, even some hard ones on the first attempt, but it falls short when dealing with real-world programming tasks. Most professional coding involves complex business logic spread across multiple files and even systems. Copilot struggles with these specifics, even when given detailed prompts. The level of detail required in a prompt for Copilot to accurately produce such programs is so high that a programmer might as well write the code themselves.
- Copilot is very useful for scanning existing code for errors or missed edge cases.
- Copilot (which uses GPT-4 underneath) is decent at generating tests from a piece of code. It tries really hard, though it falls short (Claude 3.5 Sonnet excels at it). It’s not accurate, and as seen in the cast below, it misses some cases for the invalid scenarios (a small sketch of that kind of gap follows this list).
- Copilot is useful for understanding unfamiliar or obfuscated codebases. In many of my projects I have (unwillingly) inherited massive codebases with no clear documentation or tests, and Copilot is useful for unravelling such code. Take a piece of deliberately obscure code as an example (a snippet of that flavour appears after this list).
- Copilot is useful when learning a new programming language. It’s not a substitute for books or courses, but it helps when coding in a brand-new syntax, and especially for picking up the idiomatic aspects of a language. I used it to learn about continuations in Kotlin and list comprehensions in Python, things I found difficult to learn from books. The omnipresence of Copilot in the IDE makes it even better in this particular case (see the comprehension example after this list).
- Copilot does especially well in languages that have a lot of open-source code available. That is unsurprising, since it’s powered by LLMs, which need tons of data. JavaScript, Python and Java have the biggest public codebases, while the likes of Haskell or Idris do not. That makes Copilot subpar for less-used or niche technologies, as its emergent abilities can only go so far.
- Copilot can be overly talkative and distracting when enabled by default. It frequently offers suggestions, many of them low-quality, and it pays little attention to the surrounding context; I would rather it kept quiet when it isn’t sure. It tends to provide quick, short-term fixes rather than well-reasoned solutions. I’m concerned that it may significantly atrophy problem-solving and critical-thinking skills, especially among newcomers to the field.
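
To make the test-generation point concrete, here is a minimal, hypothetical sketch (not the actual cast): a tiny validation function, the happy-path tests an assistant typically suggests, and the invalid scenarios that tend to get missed. The function, its rules and the test names are invented for illustration.

```python
import pytest


def parse_age(value: str) -> int:
    """Parse a user-supplied age string; must be an integer between 0 and 130."""
    age = int(value)  # raises ValueError for non-numeric input
    if not 0 <= age <= 130:
        raise ValueError(f"age out of range: {age}")
    return age


# The kind of tests an assistant readily generates: happy paths only.
def test_parse_age_valid():
    assert parse_age("42") == 42
    assert parse_age("0") == 0
    assert parse_age("130") == 130


# The invalid scenarios that typically get missed and need adding by hand.
def test_parse_age_invalid():
    for bad in ["abc", "-1", "131"]:  # non-numeric, below range, above range
        with pytest.raises(ValueError):
            parse_age(bad)
```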
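
As a stand-in for the deliberately obscure code mentioned above (the original snippet isn’t reproduced here), here is a short, unhelpfully named hypothetical function; asking Copilot “what does this do?” gives a plain-English answer much faster than unpicking it by hand.

```python
# A deliberately obscure, hypothetical snippet of the kind described above.
def f(x, y=0):
    return y if not x else f(x[1:], y * 10 + (ord(x[0]) - 48))


# Unpicked (by hand or by Copilot): it folds a string of ASCII digits into
# an integer one character at a time, e.g. f("1024") == 1024.
print(f("1024"))  # 1024
```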
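
And for the idiomatic-language point, this is the sort of nudge Copilot gives when you write Python with an accent from another language: an explicit accumulation loop versus the list comprehension it suggests instead. The data and names here are arbitrary.

```python
words = ["pair", "programming", "with", "copilot"]  # arbitrary example data

# The loop a newcomer from another language might write first.
lengths = []
for w in words:
    if len(w) > 4:
        lengths.append(len(w))

# The idiomatic list comprehension Copilot tends to suggest instead.
lengths = [len(w) for w in words if len(w) > 4]

print(lengths)  # [11, 7]
```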
There are many more points, but they are relatively minor annoyances compared to the ones mentioned above. Do you agree or disagree? Please leave a comment and I would be happy to engage in a constructive dialogue.