This is part three in my series on running a local LLM, and it assumes you already have Ollama set up and running. If not, please read part one here.
Running our own Local GitHub Copilot
We’ve been exploring hosting a local LLM with Ollama and PrivateGPT recently. So far we’ve been able to install and run a variety of different models through Ollama and get a friendly browser interface with PrivateGPT, but why stop there?
GitHub Copilot is a premium extension for VSCode that provides an in-IDE chat window as well as smarter, AI-driven autocomplete suggestions. The individual plan currently costs $10 a month (though there are plans to increase this) and an enterprise license costs $39 per user per month. It’s worth noting that the individual license does not exclude your data from being used for training by default.
We can replace this with our own local model with the help of extensions. There are multiple extensions on the marketplace that claim to do the same thing, but we’ll be taking a look at CodeGPT (for VSCode) and Continue (for both VSCode and JetBrains).
Prerequisites
There are LLMs made specifically for coding. While we can get away with using the default mistral model, it’s better to use one optimised for code, such as deepseek-coder.
ollama pull deepseek-coder
ollama pull deepseek-coder:base # only if you want to use autocomplete
ollama pull deepseek-coder:1.3b-base # An alias for the above but needed for Continue
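If you want to double-check which models you already have locally before continuing, `ollama list` shows everything that has been pulled. The output below is illustrative only; your sizes and timestamps will differ:

```shell
# Confirm the coder models pulled above are available locally
ollama list

# Illustrative output:
# NAME                         SIZE      MODIFIED
# deepseek-coder:latest        776 MB    2 minutes ago
# deepseek-coder:1.3b-base     776 MB    About a minute ago
```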
CodeGPT
Install the extension from here
Once installed, click on the CodeGPT icon in the sidebar to open the CodeGPT panel.
Click “Select A Model” at the top, choose Ollama as the provider, then pick “deepseek-coder” in the “Select or search a model” dropdown.
At this point we have basic chat functionality: either ask it a question, or highlight some code in an editor window and hit one of the buttons (like “Explain selected code”). I am blown away by how fast this is. I’m running this model with 64GB RAM and an RTX 3080, so your mileage may vary, but responses are generated virtually instantly.
Autocomplete is off by default, and I’ve found it to be less accurate than Copilot, although there are some settings to fiddle around with. If you want to enable it, click the hamburger menu in the top left and hit “Autocomplete”. Toggle it on and select “deepseek-coder:base” as the model. Feel free to play around with Max Tokens and Suggestion Delay, but be warned that increasing the token count will substantially increase resource usage and may freeze Ollama. I found that ~1500 tokens is safe while avoiding completely useless suggestions.
It really is as simple as that. This works offline as well: disconnect from the internet and everything still works.
Continue
VSCode
Install the extension here.
Selecting the “< C D _” icon in the sidebar will open the Continue panel.
Click the add button (“+”), select “Ollama”, and then “DeepSeek-Coder” (we installed the 1.3b model earlier). Ensure the following exists in the config.json editor window that pops up.
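As a sketch, a minimal models entry might look something like the below. The field names are based on Continue’s config.json schema at the time of writing, and the exact placement of numThreads is an assumption on my part, so check against what Continue generates for you:

```json
{
  "models": [
    {
      "title": "DeepSeek Coder",
      "provider": "ollama",
      "model": "deepseek-coder",
      "completionOptions": {
        "numThreads": 8
      }
    }
  ]
}
```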
I’m using numThreads: 8 here to get faster responses because I have the cores to spare. Feel free to increase or decrease this, or leave it out entirely.
Scroll down to tabAutocompleteModel and set the following.
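Again as a sketch based on Continue’s documented config shape (verify the exact fields against your version of the extension):

```json
{
  "tabAutocompleteModel": {
    "title": "DeepSeek Coder 1.3b",
    "provider": "ollama",
    "model": "deepseek-coder:1.3b-base"
  }
}
```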
Now we’ve configured both the chat and autocomplete functionality of Continue to use the same deepseek-coder model. This should give faster results and requires downloading fewer components.
If you want to stick with the recommended Starcoder model, run ollama pull starcoder:3b and change the model in tabAutocompleteModel to starcoder:3b. There are a bunch of other options you can fiddle around with in this file; refer to the documentation for details.
IntelliJ
Go to Settings -> Plugins and search for “Continue”. Install it and the same icon as before will show up in the right hand sidebar.
Note: If using JetBrains Gateway you’ll want Plugins on the Client (not the Host)
Continue uses the same configuration file for VSCode and JetBrains, so configuring it once will affect both IDEs. While mine still defaulted to GPT-4 Vision, the DeepSeek-Coder option we configured earlier is already accessible in the dropdown.
Continue has fewer features in JetBrains IDEs: autocomplete is missing, and the context menu integrations are lacking compared to VSCode. The plugin itself is still under active development, so I’m looking forward to seeing it grow.
Conclusion
We’ve looked at two different extensions that bridge the gap between our IDEs and Ollama, effectively replacing GitHub Copilot’s most useful features. In my experience Continue is far more performant than CodeGPT at autocomplete suggestions, though I prefer CodeGPT in terms of aesthetics and ease of use. Continue also has the strength of working in both VSCode and IntelliJ. Having both installed at once is not advised; CodeGPT started crashing after I installed Continue, so pick one and stick with it.
We’re seeing more and more extensions pop up every day that act as bridges between the IDE and LLMs; it’s a really exciting time to live in. Have a play around with these and see how they compare for you against the official GitHub Copilot.
About the Author:
Jack Reeve is a full stack software developer at Version 1.