Deciphering the buzz behind AI pair programmers (part 2 of 4)

Vedant Agrawal
Feb 10, 2023

This article is the second in a 4-part series on AI pair programmers. If you haven’t yet, check out part 1 here; it covers the history of AI pair programmers. This article covers what an AI pair programmer is, and subsequent articles cover the issues and where value might accrue!

What does an AI pair programmer do?

AI pair programmers are tools that integrate with a software developer’s IDE, such as Microsoft Visual Studio, Microsoft Visual Studio Code, IntelliJ, PyCharm or WebStorm, and support a wide range of languages including Java, JavaScript and Python.

At its most basic level, an AI pair programmer suggests completions for a line of code. This was basically what Microsoft Visual Studio’s IntelliCode used to do. GitHub Copilot, Amazon CodeWhisperer and Tabnine take things a few notches further though, and can output entire chunks of code or even complete functions based on the last few lines of code (CodeWhisperer does talk about needing the last 10–15 lines of your code to suggest new code). Both Copilot and CodeWhisperer give the developer a few code snippets to choose from, which differ in logic, variables used etc. Users can toggle between the options and choose the one that best suits their programming style. In addition to this functionality, AI pair programmers can also help developers write unit tests and comment their code. The comment-driven feature works similarly to ChatGPT’s ‘fill-in-the-blanks’ feature: the user types in a comment and the pair programmer fills in its best guess at the code the comment describes.
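To make the comment-driven workflow concrete, here is a small illustrative sketch in Python. The developer types only the comment and the function signature; the body is the kind of suggestion a pair programmer might offer. The function name and body are hypothetical and were written by hand for illustration; they are not actual output from Copilot, CodeWhisperer or Tabnine.

```python
import re
from collections import Counter

# The developer types the comment and signature below...
# Return the n most common words in a text, ignoring case.
def most_common_words(text: str, n: int = 3) -> list:
    # ...and the tool proposes a body like this (hypothetical suggestion).
    words = re.findall(r"[a-z']+", text.lower())
    return Counter(words).most_common(n)


print(most_common_words("The cat and the hat and the bat", 2))
# prints [('the', 3), ('and', 2)]
```

A real tool would typically offer a few alternative bodies for the same comment, and the developer would pick one and adjust it.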

The AI pair programmers get plenty of things wrong in the suggested code, sometimes introducing functions that don’t exist in the code or variables that the user is not using. As Amazon explains in its webinar for CodeWhisperer though, the code suggestion is meant to be just that, a suggestion, and users are expected to rectify parts of the recommended code.

Amazon has not revealed a lot about how CodeWhisperer works under the hood, so my understanding of these tools is limited to how Copilot works. Copilot is powered by OpenAI’s Codex, which is a Large Language Model, or LLM. LLMs ingest large volumes of training data, infer statistical patterns from it and then generate output based on what they have ‘seen’. They are trained to guess missing words in a piece of text or even complete paragraphs. Codex uses GitHub’s entire public code archive, consisting of tens of millions of code repositories, as its training data and powers Copilot with that. Amazon has revealed that it trained its model on public code on GitHub as well as its own internal Amazon code.
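The idea of "inferring statistical patterns and outputting what it has seen" can be illustrated with a deliberately tiny toy. The sketch below is not how Codex works (Codex is a large neural network); it is a simple next-token counter, and the corpus, function names and parameters are all made up for illustration. It only shows the general principle of predicting the next token from patterns in training data.

```python
from collections import Counter, defaultdict


def train_bigram_model(corpus):
    """Count, for each token, which tokens followed it in the training lines."""
    follows = defaultdict(Counter)
    for line in corpus:
        tokens = line.split()
        for current, nxt in zip(tokens, tokens[1:]):
            follows[current][nxt] += 1
    return follows


def complete(model, prompt, max_tokens=5):
    """Extend the prompt by repeatedly appending the most frequent next token."""
    tokens = prompt.split()
    if not tokens:
        return prompt
    for _ in range(max_tokens):
        candidates = model.get(tokens[-1])
        if not candidates:
            break  # the model never saw anything follow this token
        tokens.append(candidates.most_common(1)[0][0])
    return " ".join(tokens)


# Toy "training data": three lines of whitespace-separated code tokens.
corpus = [
    "for i in range ( n ) :",
    "for item in items :",
    "for i in range ( len ( xs ) ) :",
]
model = train_bigram_model(corpus)
print(complete(model, "for i", max_tokens=3))
# prints: for i in range (
```

Even this toy shows why such tools can confidently suggest code that is wrong: the output reflects what was frequent in the training data, not what is correct for the task at hand.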

It is worth noting that Copilot only outputs code; it doesn’t check the quality of the code or even compile it to see whether it works or not. We’ll cover more of these limitations in the next article.
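Since the tool itself does not verify its output, that check falls to the developer. As a minimal sketch, assuming the suggestion is Python and arrives as a plain string, the standard library’s ast module can at least confirm that a snippet parses. The helper name and the two sample snippets are hypothetical.

```python
import ast


def parses_as_python(snippet: str) -> bool:
    """Return True if the snippet is syntactically valid Python."""
    try:
        ast.parse(snippet)
    except SyntaxError:
        return False
    return True


good_suggestion = "def add(a, b):\n    return a + b\n"
bad_suggestion = "def add(a, b)\n    return a + b\n"  # missing colon

print(parses_as_python(good_suggestion))  # prints True
print(parses_as_python(bad_suggestion))   # prints False
```

A successful parse only rules out syntax errors. It says nothing about whether the code does the right thing, so tests and review are still needed.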

Double-clicking on Amazon CodeWhisperer:

Based on Amazon’s demo of the CodeWhisperer product, it is worth noting some of the similarities and differences with Copilot:

What’s the same: Similar to Copilot, CodeWhisperer reads the last few lines of code (15–20 for CodeWhisperer) to suggest new, boilerplate code and keeps the developer within the IDE, minimizing distractions (reducing the need to go to Stack Overflow or similar forums for help) and increasing developer productivity. It also sends the code to AWS (Copilot sends the code to Microsoft), but users can choose to disable that feature, which I suspect most developers will do! Also similar to the GitHub Copilot strategy, CodeWhisperer unabashedly tries to increase the usage of AWS and Amazon products: code accuracy is much higher, and it offers ‘first class support and best practices’, if the developer is coding for AWS or using Amazon products like S3.

There are differences with Copilot, the major ones being:

1/ IP and licensing: CodeWhisperer flags to the user if the code being suggested is identical to code in its training data, along with the applicable licensing requirements. Developers can then choose to attribute the code or not use it at all. Users can also ask CodeWhisperer not to show them licensed code at all. We’ll see in subsequent articles that IP infringement has got Copilot in quite a bit of trouble!

2/ Code scan: CodeWhisperer recognizes that the model could have picked up bad programming practices, so it lets users scan their entire codebase and tells them if it finds common security issues.

3/ Removal of bias: Probably one of the biggest issues with AI in general is the creation of bias, typically because of the training data itself. CodeWhisperer has internal checks that avoid showing biased code to the developer (e.g., code that biases against certain people).
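The licensing check in 1/ can be pictured as a lookup of each suggestion against an index of known code. The sketch below is a toy illustration under loud assumptions: Amazon has not published how this feature works, and the index contents, the whitespace normalization and the names KNOWN_SNIPPETS and flag_license are all hypothetical.

```python
import hashlib


def normalize(code: str) -> str:
    """Collapse whitespace so trivial formatting changes don't hide a match."""
    return " ".join(code.split())


def fingerprint(code: str) -> str:
    """Hash the normalized code so it can be used as a lookup key."""
    return hashlib.sha256(normalize(code).encode("utf-8")).hexdigest()


# Hypothetical index of training snippets: fingerprint -> (source, license).
KNOWN_SNIPPETS = {
    fingerprint("def add(a, b):\n    return a + b"): ("example/repo", "MIT"),
}


def flag_license(suggestion: str):
    """Return (source, license) if the suggestion matches known code, else None."""
    return KNOWN_SNIPPETS.get(fingerprint(suggestion))


print(flag_license("def add(a, b):   return a + b"))  # prints ('example/repo', 'MIT')
print(flag_license("x = 1"))                          # prints None
```

With a match in hand, a tool could show the license next to the suggestion, or hide the suggestion entirely if the user has opted out of licensed code.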

While there isn’t a clear winner between Copilot and CodeWhisperer, it seems like Copilot beats CodeWhisperer when it comes to flexibility and breadth, although CodeWhisperer is better at writing code for Amazon’s own APIs.

Double-clicking on Tabnine:

Tabnine differs from GitHub Copilot in that it does not send any of the user’s code to its own servers for the AI system to work, so it has no equivalent data-sharing button for users to toggle on or off. The system runs on the user’s local machine.

Great, so now we’ve understood the history of AI pair programmers and what they do. The next article covers the issues with these tools (find it here) and the last one talks about where value might accrue in the future (see that here).
