A Guide To The Background, Use-Cases, And Components Of Coding Assistants

Justin Milner
7 min read · Mar 18, 2024

This article provides an overview of why coding assistants are exploding in popularity in 2024, the use cases they currently serve, the main components that contribute to a coding assistant’s value, and a fundamental security limitation that divides the market.

Part 2 details the top coding assistant companies/platforms in the field.

Note: This series focuses on coding assistants, not AI software engineers.

Part 1 Chapters:

  1. A brief background on large language models (LLMs)
  2. Use cases
  3. Components of coding assistant performance
  4. Self-hosting

A Brief Background on LLMs — why are the models so large?

What changed in AI research within the last 6 years?

  1. A new model architecture was released in 2017 (the transformer).

Transformers perform very well on sequential data (such as language). They can also be highly parallelized, which means they can be scaled up efficiently on hardware that specializes in parallel processing (GPUs, TPUs).
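
To make the parallelism concrete, here is a minimal NumPy sketch of scaled dot-product attention, the core transformer operation. Every token position is processed in one batched matrix multiply rather than step by step as in a recurrent network (the shapes and names here are illustrative):

    import numpy as np

    def scaled_dot_product_attention(Q, K, V):
        # Q, K, V: (sequence_length, d_k) matrices of queries, keys, and values.
        d_k = Q.shape[-1]
        scores = Q @ K.T / np.sqrt(d_k)  # all pairwise similarities in one multiply
        weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
        weights /= weights.sum(axis=-1, keepdims=True)  # row-wise softmax
        return weights @ V  # weighted sum over the whole sequence

    # Every position's output comes from a single pass of matrix math, with no
    # sequential recurrence: this is what GPUs and TPUs parallelize so well.
    x = np.random.randn(8, 16)  # 8 tokens, 16-dimensional embeddings
    print(scaled_dot_product_attention(x, x, x).shape)  # (8, 16)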

  2. Scaling laws were refined.

Most modern AI architectures, including transformers, are composed of large collections of adjustable parameters. You could think of the parameters as dials, and if the dials are set correctly, the model performs well.

It turns out that, generally, the more parameters there are, the better the performance. This concept was accepted in the AI research world prior to the current boom, but it wasn’t fully grasped how effective scaling could be. New model capabilities emerge as parameters are scaled.

Experiments in recent years show a log-linear relationship between the number of parameters in a model and its general capabilities — or in other words: there are diminishing returns as scale increases, but no ceiling.
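
For intuition, here is a small Python sketch of such a power law, using the approximate constants reported by Kaplan et al. (2020) for the loss-versus-parameters fit; the exact values matter less than the shape of the curve:

    # Illustrative scaling law: L(N) = (N_c / N) ** alpha, with the
    # approximate constants from Kaplan et al. (2020).
    N_c, alpha = 8.8e13, 0.076

    for n_params in [1e8, 1e9, 1e10, 1e11, 1e12]:
        loss = (N_c / n_params) ** alpha
        print(f"{n_params:.0e} params -> predicted loss {loss:.2f}")

    # Each 10x increase in parameters buys a smaller absolute improvement,
    # but the curve never flattens completely: diminishing returns, no ceiling.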

How has the AI industry changed in the last 6 years?

The highest performing models are now closed-source.

Investment in these models is massive ($100M+ per base model), far too large for single-institution research grants. Only commercial enterprises can sustainably invest at this level. As a result, the state-of-the-art models are closed-source.

However, while open source will likely always lag behind closed source in this environment, the gap does seem to be narrowing: knowledge is dispersing, the number of important players in the field is growing, and methods of increasing performance other than scale are improving.

Which companies lead the LLM race?

OpenAI, partnered with Microsoft, has led the race thus far with its GPT-3.5/4 models.

Google disappointed early on, but its recent release of Gemini 1.5 comes close to GPT-4 on most benchmarks (though it is notably worse at coding).

Anthropic (whose funding includes $4B from Amazon and $2B from Google) has recently become a major player in the race with its release of Claude 3, which outperforms Gemini 1.5 on most benchmarks but still falls short of the most recent GPT-4 version on all comparable benchmarks.

Meta has spearheaded much of the open-source community by publicly releasing large base models, notably Llama 1 and 2.

Use-Cases:

When we say “coding assistant”, what are we actually referring to?

High level classes:

Generally, we can split each use case into understanding or generating code/documentation — or a mixture of each.

Note: In all cases, supervision is required. Coding assistants are … assistants. They do not currently exhibit planning, uncertainty acknowledgement, or goal-oriented behaviors in the way that humans do — in other words, they make mistakes that humans typically would not.

Specific classes:

Chat: Bare-bones chat is the most open-ended form of a coding assistant. On the positive side, chat users have unrestricted freedom to control their interaction with the model. On the negative side, they have to carefully design their prompts, manually ensure sufficient context is provided to the model, and context-switch more often.

Interactive documentation search: Rather than searching for explicit keywords or patterns, programmers can ask the model natural-language questions such as “where is the value for this variable set?” or “what is the definition of this API?”.

Code review/linting/code smells: Coding assistants can act much like a merge-request reviewer. They can simply flag a potential problem, or go further and generate replacement code.

Documentation + comment generation: Summarization and translation are among the strongest task areas for LLMs. Commenting and documentation each combine translation (from code to natural language) with summarization (of the code’s functionality), which makes this an especially strong category for coding assistants.

Auto-completion: In cases where a programmer knows what they want to code but does not know the specific syntax, number of parameters, attribute name, or other details, autocomplete can often fill that gap. The more logic involved, the less likely the assistant is to produce the desired solution.
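
Under the hood, many completion models are trained on a fill-in-the-middle (FIM) format so they can condition on code both before and after the cursor. As a sketch, the StarCoder family uses sentinel tokens like these (token names vary across model families):

    # Fill-in-the-middle prompt, StarCoder-style sentinel tokens shown.
    prefix = "def mean(values):\n    total = "
    suffix = "\n    return total / len(values)"

    fim_prompt = f"<fim_prefix>{prefix}<fim_suffix>{suffix}<fim_middle>"
    # The model generates the missing middle, e.g. "sum(values)",
    # conditioned on the code on both sides of the cursor.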

Code Translation: LLMs are hyper-polyglots. Coding assistants can assist in translating code from one language/framework/version to another.

Test Generation: Generally, the more narrow and complete a task’s context is, the better LLMs will perform, and this kind of environment can typically be provided in software-testing tasks. Given this characteristic, test generation is one use case receiving extra focus from coding assistant companies.
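
For example, given a small, self-contained function as context, an assistant can usually produce a reasonable starting test suite. The snippet below shows the kind of pytest output you might get (an illustrative example, not output from any specific tool):

    # Function under test: the narrow, complete context given to the assistant.
    def clamp(value, low, high):
        return max(low, min(value, high))

    # Representative assistant-generated tests, pytest style.
    def test_clamp_within_range():
        assert clamp(5, 0, 10) == 5

    def test_clamp_below_range():
        assert clamp(-3, 0, 10) == 0

    def test_clamp_above_range():
        assert clamp(42, 0, 10) == 10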

Components of Coding Assistant Performance

There are many factors contributing to whether using a particular coding assistant is a positive experience or not. Some of the most important attributes from a user perspective are:

  • Quality at which the task is performed
  • Rate of performing the task
  • Value of the task
  • Ease of use
  • Security
  • Cost

The components contributing to these factors can be broken into four high-level categories:

1) The quality and characteristics of the base model

There are many considerations in choosing a base model. The trade-off between capability and inference speed, the relative strengths of the model (performance is not homogeneous across languages or tasks), and the model’s context length are among the many characteristics that should be weighed against the coding assistant task in mind.

2) Whether/How the base model has been fine-tuned

RLHF (reinforcement learning from human feedback) is typically used to teach the model how it should behave. Fine-tuning can be performed to provide additional or specialized knowledge to the model, to narrow its behavior in particular categories, or to teach it how to utilize external software. Coding models are typically taught to follow a conversational markup format during these stages.
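
“Conversational markup” here usually means a chat template such as OpenAI’s ChatML, in which special tokens delimit each speaker’s turn. A representative example (delimiter tokens differ between model families):

    # ChatML-style conversation format; other model families use
    # different special tokens for the same turn structure.
    chatml = (
        "<|im_start|>system\n"
        "You are a helpful coding assistant.<|im_end|>\n"
        "<|im_start|>user\n"
        "Write a Python function that reverses a string.<|im_end|>\n"
        "<|im_start|>assistant\n"
    )
    # The model is trained to continue from here and to emit
    # <|im_end|> when its turn is finished.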

3) Prompt quality

Prompt quality can be split into two high-level categories:

Communication of the task: Most tasks can be completed in many different ways, and results can be delivered and presented to the user in many different ways. Communicating the many dimensions of the desired intent and result to an LLM is challenging, but highly malleable. Tone, clarity, constraints, specificity, and structure are examples of important communication parameters.

Information/context availability: Regardless of how effectively the task is communicated, if the model does not have the information required to complete the task, it will be unable to do so.
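
Putting the two categories together: a well-formed prompt states the task, constraints, and output format explicitly, and also pastes in the code the model needs. A sketch (the structure matters more than the exact wording, and the code snippet is hypothetical):

    # Task communication (constraints, structure) plus context
    # (the actual code), in one prompt. The snippet is made up.
    code_snippet = (
        "def last_item(items):\n"
        "    return items[len(items)]\n"
    )

    prompt = (
        "You are reviewing Python code for correctness.\n\n"
        "Task: identify any off-by-one errors in the function below.\n"
        "Constraints: respond with a numbered list, cite line numbers, "
        "and do not rewrite the code.\n\n"
        f"Code:\n{code_snippet}"
    )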

4) Additional tooling

Tooling refers to additional software built around the model. There is much room for creative innovations in this area. I’ve broken tooling into the following categories:

Code filtering: Symbolic filtering can be used to reduce the number of redundant prompts sent, prevent the model from generating repeated lines, or guide the model toward particular behavior when a certain generation pattern appears.
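
A minimal sketch of one such symbolic filter: suppressing a suggestion when it merely echoes a line that already appears just above the cursor (the heuristic is illustrative; real products use more sophisticated rules):

    def filter_completion(completion, preceding_lines):
        # Drop completions whose first line repeats a recent line verbatim:
        # a purely symbolic rule that requires no extra model call.
        stripped = completion.strip()
        first_line = stripped.splitlines()[0] if stripped else ""
        recent = {line.strip() for line in preceding_lines[-5:]}
        if first_line in recent:
            return None  # suppress the redundant suggestion
        return completion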

Prompt Recipes/Code Lenses: Building a quality prompt requires time and focus. Predefined templates come in handy for commonly used and/or complex prompts.
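
A prompt recipe can be as simple as a parameterized template; a minimal sketch:

    # A reusable "prompt recipe": a template with slots the user fills in.
    EXPLAIN_RECIPE = (
        "Explain what the following {language} code does, at a level "
        "appropriate for a {audience}. Keep it under {max_sentences} "
        "sentences.\n\n{code}"
    )

    prompt = EXPLAIN_RECIPE.format(
        language="Python",
        audience="new team member",
        max_sentences=5,
        code="def fib(n): return n if n < 2 else fib(n - 1) + fib(n - 2)",
    )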

Gathering Context: Coding assistants will often automatically include the current file, the current package, or a similar scope in the context. Some tooling allows for external information retrieval, often via a vector database search, of relevant files from a repository, code base, the internet, or some other information source. Tooling can also help the user emphasize certain portions of the context via inline discussion or code highlighting.
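
The retrieval step usually reduces to embedding similarity. Here is a NumPy sketch assuming the vectors already exist; a real system would produce them with a trained embedding model and store them in a vector database rather than a flat array:

    import numpy as np

    def top_k_context(query_vec, file_vecs, filenames, k=3):
        # Cosine similarity between the query and every indexed file,
        # returning the k most relevant files to paste into the prompt.
        norms = np.linalg.norm(file_vecs, axis=1) * np.linalg.norm(query_vec)
        sims = file_vecs @ query_vec / norms
        best = np.argsort(sims)[::-1][:k]
        return [filenames[i] for i in best]

    # Hypothetical usage with random stand-in vectors.
    vecs = np.random.randn(100, 384)  # 100 files, 384-dim embeddings
    names = [f"file_{i}.py" for i in range(100)]
    print(top_k_context(np.random.randn(384), vecs, names))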

Agent-like Behavior: Current LLMs exhibit some emergent agent-like behavior, but generally do not exhibit strong characteristics of planning, uncertainty acknowledgement, or goal-oriented behavior. Tooling can be used to make the coding assistant behave more like an agent. Some strategies include prompt chaining, cyclical prompting/feedback, dueling LLMs, external tool use, and conditional prompting.
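
A sketch of the simplest of these strategies, cyclical prompting and feedback: generate code, run the tests, and feed any failure back to the model until the tests pass. Here llm() and run_tests() are placeholders for whatever model API and test harness you use:

    def generate_until_tests_pass(task, llm, run_tests, max_rounds=3):
        # Cyclical prompting: an external signal (a test run) checks the
        # model's output, and the failure is fed back as a new prompt.
        prompt = f"Write Python code for this task:\n{task}"
        for _ in range(max_rounds):
            code = llm(prompt)  # placeholder model call
            ok, error_log = run_tests(code)  # placeholder test harness
            if ok:
                return code
            prompt = (
                f"The code below failed its tests.\n\nCode:\n{code}\n\n"
                f"Errors:\n{error_log}\n\nFix the code."
            )
        return None  # give up after max_rounds and defer to the human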

External Software Interaction: This section is extremely broad and full of potential. LLMs can be used as a natural-language interface for any tools/APIs they have been effectively trained to use. OpenAI’s “GPT Store” is built on this idea.
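
In practice this usually takes the form of function calling: the model is shown a JSON description of each tool and responds with structured arguments instead of prose. A sketch in the style of OpenAI’s function-calling API (the create_ticket tool itself is hypothetical):

    # Tool description in the style of OpenAI's function-calling API.
    tools = [{
        "type": "function",
        "function": {
            "name": "create_ticket",  # hypothetical example tool
            "description": "Open a bug ticket in the issue tracker",
            "parameters": {
                "type": "object",
                "properties": {
                    "title": {"type": "string"},
                    "severity": {"type": "string", "enum": ["low", "high"]},
                },
                "required": ["title"],
            },
        },
    }]
    # The model replies with the tool name and JSON arguments; the
    # surrounding tooling, not the model, actually executes the call.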

Self-Hosting

As mentioned earlier in this article, the highest-performing models are closed source. This means that if a coding assistant uses a closed-source model, it must send source code and/or documentation to the external servers where the private model is hosted.

For individuals, this may be acceptable. For many corporate enterprises, this exposure of IP to the external internet is unacceptable.

Given this restriction, IP-paranoid companies must host LLM instances internally on their private networks. This narrows their choice of coding assistant platforms, limits their model selection options, and complicates the process of setting up the coding assistant service infrastructure.

Components of a minimal self-hosted coding assistant service

A basic coding assistant service will require an IDE plugin, an LSP service, a load-distribution service, and instances of the models deployed/sharded across GPUs. There are separate open-source projects available for each of these components. Alternatively, there are coding assistant platforms that aim to solve this problem with all-in-one enterprise products.
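
As a simplified concrete picture: several open-source model servers (vLLM, for example) expose an OpenAI-compatible HTTP API, the load-distribution service sits in front of them, and the IDE plugin sends requests such as the one below. The hostname and model name are placeholders:

    import requests

    # Hypothetical internal endpoint: a load balancer fronting self-hosted
    # model servers that speak the OpenAI-compatible chat completions API.
    resp = requests.post(
        "http://llm-gateway.internal:8000/v1/chat/completions",
        json={
            "model": "codellama-13b-instruct",  # whichever model you deployed
            "messages": [
                {"role": "user", "content": "Explain this function signature."}
            ],
        },
        timeout=60,
    )
    print(resp.json()["choices"][0]["message"]["content"])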

Thanks for reading! Please leave a like if you enjoyed.

In part 2, I’ll overview the top coding assistant options available in mid-2024, and outline my experiences with each.

Link to part 2: https://medium.com/@justinmilner/a-guide-to-the-booming-landscape-of-coding-assistants-83b48b4a16ba


Justin Milner

Using logic and data to understand the things I’m curious about. YouTube: @aiwithjustin2897 · LinkedIn: @justin-milner-b190467b