Jack Reeve
Published in Version 1 · Apr 2, 2024

This is part three in my series on running a local LLM. It assumes you already have Ollama set up and running; if not, please read part one here.

Running our own Local GitHub Copilot

We’ve been exploring hosting a local LLM with Ollama and PrivateGPT recently. So far we’ve been able to install and run a variety of different models through Ollama and get a friendly browser interface with PrivateGPT, but why stop there?

GitHub Copilot is a premium extension for VSCode that provides an in-IDE chat window as well as smarter, AI-driven autocomplete suggestions. The individual plan currently costs $10 a month (though there are plans to increase this), and an enterprise license costs $39 per user per month. It’s worth noting that the individual license does not exclude your data from being used for training by default.

We can replace this and use our own local model with the help of extensions. There are multiple extensions on the marketplace that claim to do the same thing, but we’ll be taking a look at CodeGPT (for VSCode) and Continue (for both VSCode and JetBrains).

Prerequisites

There are LLMs made specifically for coding. While we can get away with using the default mistral model, it’s better to use one optimised for code, such as deepseek-coder.

ollama pull deepseek-coder
ollama pull deepseek-coder:base # only if you want to use autocomplete
ollama pull deepseek-coder:1.3b-base # An alias for the above but needed for Continue
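
If you want to verify the models are available before wiring up an editor, a quick check from the terminal looks like this (assuming the pulls above completed):

ollama list    # should show the deepseek-coder tags pulled above
ollama run deepseek-coder "write hello world in python"    # quick sanity check that generation works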

CodeGPT

Install the extension from here

Once installed, click on the CODEGPT icon in the sidebar to see this window

CodeGPT homepage

Click “Select A Model” at the top, choose Ollama as the provider, then pick “deepseek-coder” under “Select or search a model”.

Setting up our provider and model

At this point we have basic chat functionality: either ask it a question, or highlight some code in an editor window and hit one of the buttons (like “Explain selected code”). I am blown away by how fast this is. I’m running this model with 64GB RAM and an RTX 3080, so your mileage may vary, but responses are generated virtually instantly.

Autocomplete is off by default, and I’ve found it to be less accurate than Copilot, although there are some settings to fiddle around with. If you want to enable it, click the hamburger menu in the top left and hit “Autocomplete”. Toggle it on and select “deepseek-coder:base” as the model. Feel free to play around with Max Tokens and Suggestion Delay, but be warned that increasing the token count will substantially increase resource usage and may freeze Ollama. I found that ~1500 tokens is safe while avoiding completely useless suggestions.

Autocomplete settings

It really is as simple as that. This works offline as well: disconnect from the internet and everything still works.

CodeGPT showing an autocomplete with code explanation in the sidebar

Continue

VSCode

Install the extension here.

Selecting the “< C D _ “ icon in the sidebar will show this screen

Fresh Continue install

Click the add button (“+”), select “Ollama” and then “DeepSeek-Coder” (we pulled the 1.3b model earlier). Ensure the following exists in the config.json editor window that pops up:

Continue — adding the ollama model

I’m using numThreads 8 here to get faster responses because I have the cores to spare. Feel free to increase/decrease or leave this out entirely.
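
For reference, the resulting entry in the “models” array of config.json looks roughly like this. Treat it as a sketch: the title is just a display name, and exactly where numThreads sits can differ between Continue versions, so check the documentation if your editor flags it.

"models": [
  {
    "title": "DeepSeek-Coder (Ollama)",
    "provider": "ollama",
    "model": "deepseek-coder",
    "numThreads": 8
  }
]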

Scroll down to tabAutocompleteModel and set the following

Continue — Autocomplete settings
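
In my config that section looks something like the following (a sketch using the same provider/model fields as above; point it at the base model we pulled for autocomplete):

"tabAutocompleteModel": {
  "title": "DeepSeek-Coder Base",
  "provider": "ollama",
  "model": "deepseek-coder:1.3b-base"
}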

Now we’ve configured both the Chat and Autocomplete functionality of Continue to use the same deepseek-coder model. This should give faster results and requires downloading fewer components.

If you want to stick with the recommended Starcoder model, run ollama pull starcoder:3b and change the model in tabAutocompleteModel to starcoder:3b. There are a bunch of other options you can fiddle around with in this file; refer to the documentation for details.
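
That swap would look something like this (same fields as before, only the model tag changes):

"tabAutocompleteModel": {
  "title": "StarCoder 3b",
  "provider": "ollama",
  "model": "starcoder:3b"
}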

Continue sidebar showing an explanation of code and suggesting new code

IntelliJ

Go to Settings -> Plugins and search for “Continue”. Install it, and the same icon as before will show up in the right-hand sidebar.

Note: If using JetBrains Gateway, you’ll want to install the plugin on the Client (not the Host).

Continue plugin in IntelliJ

Continue uses the same configuration settings for VSCode and JetBrains, so configuring it once will affect both IDEs. While mine still defaulted to GPT-4 Vision, the option for DeepSeek-1b is already accessible in the dropdown.
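
Under the hood this works because both extensions read the same file on disk. On my machine that’s the config.json in the .continue folder in my home directory, so a quick way to inspect or edit the shared settings (the path may vary by OS or Continue version) is:

cat ~/.continue/config.json    # the shared config picked up by both the VSCode and JetBrains extensions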

Continue in IntelliJ

Continue has fewer features in JetBrains IDEs: autocomplete is missing, and the context menu integrations are lacking compared to VSCode. The plugin itself is still very much in development, so I’m looking forward to seeing it grow.

Conclusion

We’ve looked at two different extensions that bridge the gap between our IDEs and Ollama, effectively replacing GitHub Copilot’s most useful features. In my experience Continue is far more performant than CodeGPT at autocomplete suggestions, though I prefer CodeGPT in terms of aesthetics and ease of use. Continue also has the strength of working in both VSCode and IntelliJ. Having both installed at once is not advised: CodeGPT started crashing after I installed Continue, so pick one and stick with it.

We’re seeing more and more extensions pop up every day that act as bridges between the IDE and LLMs; it’s a really exciting time. Have a play around with these and see how they compare for you against the official GitHub Copilot.

About the Author:
Jack Reeve is a full stack software developer at Version 1.
