Can ChatGPT be your BFF code companion?

Venu Vasudevan
4 min read · Jan 26, 2023


The plot. Eric Clapton once said ‘Good Blues only takes 3 things — a (wo)man, a guitar and a sunset’. I have always believed a similar trinitarian theory of programming: ‘Good programming only takes 3 things — a (hu)man, Emacs and a human-level code companion’. This trinity amounts to extreme pair programming with a partner in the hottest new programming language, English; the partner, in this case, being a program.

Today we are far from this vision in programming, needing GitHub, Gists, W3Schools, language playgrounds, Replit, StackOverflow, various IDEs with learning curves and superfluous UX bling, and so on.

The question, then: is ChatGPT (or an equivalent) good enough to be your sole code companion (your extreme-programming BFF)?

If not, what does it already replace, and what is yet to come?

The exercise. What got me off my New Year’s resolution of not adding to the ‘noise’ about ChatGPT was an itch I needed to scratch. It turned out that the solution had a) the difficulty level of a freshman CS undergrad assignment and b) a number of pieces I was either unfamiliar with or rusty on (LangChain, vector store APIs and such) — so a good enough exercise to test how helpful ChatGPT could be versus the alphabet soup of tools above.

My Use Case. I keep collecting papers on particular themes (20–30 of them at a shot) and postpone reading them because I am unable to block out a contiguous stretch of time. In particular, over the holidays, I collected a bunch of papers on neurostimulation, to better understand whether Musk’s Neuralink was a game changer or a commercialization of the known state of the art. With very little neuro vocabulary under my belt, the job of getting through that reading list kept getting deferred.

Thus the thought: write a simple semantic search engine and question-answering system that works exclusively with my corpus of papers.

Once the above is in place, I could query the papers and read them the way I browse the web (piecemeal, based on the curiosity of the moment) rather than like a book (cover to cover).

Solution. You need a few pieces: simple document splitting over the corpus to make indexing tractable, modern indexing (FAISS or better), a vector store to speed up query processing, and a conversational overlay on top of the shards, with references back to the original content, to make querying intuitive for the consumer. You could do this many ways, but one way is to get all the pieces from OpenAI and chain them together with prompt chains.
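The pieces above can be sketched end-to-end in a toy, dependency-free form. This is only an illustration of the shape of the pipeline: a naive bag-of-words `embed` stands in for OpenAI embeddings, and a brute-force scan stands in for FAISS; the documents and query are made up.

```python
# Toy sketch of the pipeline: split documents into chunks, embed each
# chunk, index the vectors, and answer a query by nearest-neighbor lookup.
import math
from collections import Counter

def split(text, chunk_size=40):
    """Naive splitter: fixed-size word windows (real splitters respect sentences)."""
    words = text.split()
    return [" ".join(words[i:i + chunk_size]) for i in range(0, len(words), chunk_size)]

def embed(chunk):
    """Stand-in embedding: lowercase bag-of-words counts (not a real embedding)."""
    return Counter(w.lower().strip(".,") for w in chunk.split())

def cosine(a, b):
    dot = sum(a[w] * b[w] for w in a if w in b)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

class VectorStore:
    """Brute-force vector store; FAISS would replace this for a real corpus."""
    def __init__(self):
        self.entries = []  # (vector, chunk) pairs

    def add(self, chunk):
        self.entries.append((embed(chunk), chunk))

    def query(self, question, k=1):
        qv = embed(question)
        ranked = sorted(self.entries, key=lambda e: cosine(qv, e[0]), reverse=True)
        return [chunk for _, chunk in ranked[:k]]

store = VectorStore()
for doc in ["Neurostimulation uses electrical pulses to modulate neural activity.",
            "FAISS indexes dense vectors for fast similarity search."]:
    for chunk in split(doc):
        store.add(chunk)

print(store.query("What does neurostimulation do?")[0])
```

The conversational overlay (not shown) would hand the retrieved chunks to an LLM as context, with the chunk identifiers serving as the references back to the original papers.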

LangChain is a new-ish package that simplifies melding LLMs with traditional document stores, making it a good foundation for writing said custom semantic search engine. The core idea it simplifies is the prompt chain (think of it as a 21st-century rules engine, where the rules work with LLMs). In particular, it offers something called Data Augmented Generation, a form of prompt chaining that simplifies much of the pipeline described above, with the default assumption that all the services (text splitting, summarization, indexing, etc.) come from OpenAI. Naturally, natural-language interfaces come for free, as does multi-language support, as does creativity of response (if you set the OpenAI temperature to something other than 0).
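The prompt-chain idea — a rules-engine-like pipeline where each step formats a prompt and hands it to an LLM — can be sketched in a few lines of plain Python. The `llm` function below is a deterministic stub standing in for a real OpenAI completion call, and the two templates are made up for illustration.

```python
# Minimal sketch of a prompt chain: each step formats a prompt from the
# previous step's output and feeds it to the (stubbed) LLM.
def llm(prompt):
    # Stand-in for an OpenAI completion call; echoes a canned "answer".
    return f"[answer to: {prompt}]"

def chain(templates):
    """Compose prompt templates into one pipeline, with an LLM call at each step."""
    def run(user_input):
        text = user_input
        for template in templates:
            text = llm(template.format(input=text))
        return text
    return run

# A two-step chain: summarize the retrieved context, then answer from the summary.
qa = chain(["Summarize: {input}",
            "Using the summary, answer the user's question: {input}"])
print(qa("Is Neuralink a game changer?"))
```

LangChain's value is that these steps — retrieval, templating, the LLM call, and the glue between them — come pre-built and composable, rather than hand-rolled like this.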

What did ChatGPT do for me? Instead of reading the OpenAI and LangChain documentation, I decided to learn them the ‘Socratic way’, by simply asking ChatGPT as I went along. Its answers about concepts (as well as simple things like how to set up OpenAI keys and optimize the cost of OpenAI calls) were surprisingly cogent. The code it wrote was 80/20 good, and I found myself not going to StackOverflow for the quick bite of things-I-knew-but-forgot. Surprisingly, its integration with GitHub was poor, and I ended up browsing the LangChain source code for clarification (something Microsoft will fix in a jiffy with the ‘new deal’). A summary of the things it replaced (or not) for me, over the 15–20 ChatGPT prompts that constituted our pair programming, is below.
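For flavor, two of the small things ChatGPT answered well can be sketched like this: reading the OpenAI key from the environment (rather than hard-coding it), and roughly estimating a call's cost before making it. The $0.02-per-1K-tokens price and the 4-characters-per-token heuristic are illustrative assumptions, not current OpenAI pricing.

```python
# Key handling and a back-of-envelope cost estimate for an LLM call.
import os

# Set OPENAI_API_KEY in your shell; never commit the key to source.
api_key = os.environ.get("OPENAI_API_KEY", "")

def estimate_cost(prompt, price_per_1k_tokens=0.02):
    """Rough cost estimate; assumes ~4 characters per token (a heuristic)."""
    tokens = len(prompt) / 4
    return tokens / 1000 * price_per_1k_tokens

print(f"~${estimate_cost('word ' * 1000):.4f}")
```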

So .. on the trinitarian scale. We are not yet in the nirvana state of a human-level companion, but we are reaching a useful intermediate state of ‘knowledge-accelerated programming’. My bet is that ChatGPT (by whatever LLM name) will get to a point where code-specific companions (like Copilot) are either not needed, or blend into ChatGPT to where they have a human persona, co-developing software with you ‘as a person’ (not as a browser plug-in).
