Introducing a no-build, ultra-lightweight ChatGPT web interface with web workers and web components

Julian Harris
Published in TheAIEngineer
5 min read · Jan 24, 2024

There are literally dozens of open-source ChatGPT front-ends, so why create another one? Because this one is an ultra-light “no build” version, written 100% in JavaScript, that doesn’t require npm or any hosting beyond static HTML. And I wanted to experiment with web workers and build something I can use for future projects 😉

If you’re interested in what I learnt building it, as a software engineer with 30+ years’ experience creating internet software, read on. Or just download the source and play with it yourself 😊. Caveat: you will need an OpenAI API key.

This is an “AI Level 1” project.

Objective: no frills, as simple as humanly possible

I wanted to create a no-frills front-end to OpenAI’s GPT API that represents the essence of a conversation flow, because it involves a few ideas many software engineers won’t be familiar with. Here’s what I distilled it down to, to keep boilerplate to a minimum:

  • Really, keep it simple. I’ve looked at a lot of code in this space in recent times and it’s so incredibly easy for things to get out of control, which makes it hard to understand and work on. I keep focusing on the question: “how can this be simpler?”. I’m sure it’s not there yet. Suggestions, PRs most welcome!
  • No build unless it’s unavoidable. JavaScript and modern web browsers are pretty decent these days. The #nobuild argument is that most frameworks people use were built years ago, when the browser was less capable. Enjoy the breath of fresh air: this is a repo that doesn’t require npm install 🙂
  • No persistence or authentication. These are useful features, but they’re conventional tech that can be done a thousand ways already, so I see less value in adding them right now. Feel free to fork and add them, of course.

Key principle: streamed text responses

Under the hood, LLMs do this:

  1. Take the user’s question (text prompt) + chat history
  2. Predict the next (possibly partial) word
  3. Repeat (2) until there are no more words.

It can take tens of seconds to get all the words. So most chat systems stream the individual words to the user the moment they’re ready, which gives a better end-user experience: there are no long periods of waiting. See a demo of my UI in action:

This is a different interaction from most text systems of the past, which are typically request, response, and Bob’s your uncle. Text streaming requires a little bit of fiddling around to get right. Here’s the basic flow I use, with a code sketch after the list. (I use the word “token” to mean “word or partial word”.)

  • Take the user request along with the previous conversation.
  • As OpenAI’s streamed response arrives, create a “new token” message for each token
  • Add it to the current message immediately. Usefully, you can simply concatenate the tokens: all the punctuation and spacing required is built-in. (E.g. you might get tokens like this: [‘Good’, ‘ morning’, ‘ Mr’, ‘ Plop’, ‘py’, ‘,’] — notice the spaces in front of the words, and the breaking of “Ploppy” into “Plop” and “py”. This is a standard technique for breaking human language words into smaller yet meaningful pieces.)
  • Add the current message to message history when it’s finished.
  • Final styling: do some final presentation polish at the end. OpenAI’s responses use markdown, which needs to be converted to HTML.
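
Here’s a minimal sketch of what the worker side of that can look like, assuming OpenAI’s standard streamed chat completions format; the message shapes (“newToken”, “done”) and the model name are illustrative rather than the repo’s exact choices:

    // model-worker.js (sketch): stream a chat completion and post each token to the main thread.
    self.onmessage = async (event) => {
      const { apiKey, messages } = event.data; // conversation history, latest user message last

      const response = await fetch("https://api.openai.com/v1/chat/completions", {
        method: "POST",
        headers: {
          "Content-Type": "application/json",
          Authorization: `Bearer ${apiKey}`,
        },
        body: JSON.stringify({ model: "gpt-3.5-turbo", messages, stream: true }),
      });

      const reader = response.body.getReader();
      const decoder = new TextDecoder();
      let buffer = "";

      while (true) {
        const { value, done } = await reader.read();
        if (done) break;
        buffer += decoder.decode(value, { stream: true });

        // The stream arrives as server-sent events: lines of the form "data: {json}".
        const lines = buffer.split("\n");
        buffer = lines.pop(); // keep any partial line for the next chunk
        for (const line of lines) {
          const data = line.replace(/^data: /, "").trim();
          if (!data || data === "[DONE]") continue;
          const token = JSON.parse(data).choices[0]?.delta?.content;
          if (token) self.postMessage({ type: "newToken", token });
        }
      }
      self.postMessage({ type: "done" });
    };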

High-level technical structure

  • Web components for both elements on the page: a message input area and a message list area. This gives you CSS and layout encapsulation: it’s easy to manifest multiple versions of a component if you so choose.
  • Web worker for processing the request and the token response stream. A web worker seems appropriate because it’s busy parsing the stream much of the time, and moving that off the main thread stops UI interactions being interrupted by it; there are few such interactions in this particular rendition, but it can be useful for the future. Aren’t I thoughtful? (By contrast, web workers are probably overkill for simple request/response integrations; see more here.)
  • Controller for communication between web components and web workers (controller.js). This includes wiring up direct links between the web components and routing messages to and from the web worker; a sketch of that wiring follows this list.
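
To make that concrete, here’s a rough sketch of the kind of wiring the controller does; the element tags, event names, and component methods below are illustrative, and the repo’s actual names may differ:

    // controller.js (sketch): connect the two web components and route messages to/from the worker.
    const messageInput = document.querySelector("message-input");
    const messagesArea = document.querySelector("messages-area");
    const worker = new Worker("model-worker.js");

    // The user submits a prompt: show it, then hand the whole conversation to the worker.
    messageInput.addEventListener("message-submitted", (event) => {
      messagesArea.addMessage("user", event.detail.text);
      worker.postMessage({ type: "chat", messages: messagesArea.history() });
    });

    // Tokens stream back from the worker: append each one, then polish when the reply is done.
    worker.onmessage = (event) => {
      const { type, token } = event.data;
      if (type === "newToken") messagesArea.appendToken(token);
      if (type === "done") messagesArea.finaliseMessage(); // e.g. convert markdown to HTML
    };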

Core interactions between the different components

Here’s a mermaid sequence diagram of the key interactions (source in the repo). And no, I didn’t create it by hand: thanks, ChatGPT. The prompt was “please create a mermaid.js diagram showing the core interactions between these components: messageInput.js, controller.js, messagesArea.js, model-worker.js”.

Lessons learned

I’ve worked with a few build-based front-end technologies, including Svelte, React, and React Native, and if you work directly with JavaScript there are definitely a few things I’d be keen to avoid:

Tracking changes inside a web page is still pretty fiddly. Svelte has to be one of the most elegant ways of managing this, so I’m tempted to do a build of this system using Svelte to compare.

Web worker 101: web workers run in an environment completely isolated from the main thread; you can only send messages between the two. Fun trap: if you send a main-thread object across, it’ll cheerily make a copy of it (a structured clone), and any work you do in the web worker will be on that copy; DOM objects such as a web component can’t be sent across at all. I wasted a bunch of time trying to figure out why messages weren’t being received. It didn’t help that most AI coders today get really confused by this and keep suggesting that calling addEventListener() on such an object inside a web worker is a useful thing to do. Top tip: it’s not.
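
A tiny sketch of the copy semantics (the worker file name comes from the diagram prompt above; the rest is illustrative):

    // Anything you postMessage to a worker is copied via structured clone, never shared.
    const worker = new Worker("model-worker.js");

    const state = { count: 0 };
    worker.postMessage(state); // the worker receives a *copy* of state
    state.count = 99;          // the worker's copy is unaffected

    // Inside model-worker.js you receive it with self.onmessage; mutating event.data there
    // changes only the worker's copy, which the main thread never sees.
    // (DOM nodes such as a web component can't be cloned at all: postMessage throws.)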

Web components cannot encapsulate their dependencies. In a no-build context a component cannot pull in all of its own dependencies, which means even absolutely essential ones have to be managed elsewhere. You may immediately howl in protest when you hear this, but here’s the specific issue I hit. Remember: #nobuild, so bye-bye webpack, Vite, etc.

  • messagesArea.js relies on marked.parse()
  • I would want to import marked inside the component, but can’t: with no build step there’s nothing to resolve the import
  • I can refer to marked, but only via script tags in the host HTML file (a component-side sketch follows):
    <script src="https://cdn.jsdelivr.net/npm/marked/marked.min.js"></script>  
<script src="messagesArea.js" defer></script>

Next steps

Here are some directions this could take, again mostly for personal fun and edification:

  • Client-side persistence: a few ChatGPT front-ends use local storage to persist state. This is dead easy and probably something I’ll think about; a minimal sketch follows this list.
  • Server-side persistence: this becomes a little more than no-frills
  • Other projects: I’ve really created this to work on my project Memento, which is a way to extract structured content from a conversation. I’ll do a client-side version of that first and, if it proves worthwhile, explore a server-side version.
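
For the client-side persistence idea, a minimal sketch of the direction (not code from the repo):

    // Keep the conversation in localStorage so a page refresh doesn't lose it.
    const STORAGE_KEY = "chat-history"; // illustrative key name

    function saveHistory(messages) {
      localStorage.setItem(STORAGE_KEY, JSON.stringify(messages));
    }

    function loadHistory() {
      return JSON.parse(localStorage.getItem(STORAGE_KEY) || "[]");
    }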

Julian Harris

Ex-Google Technical Product guy specialising in generative AI (NLP, chatbots, audio, etc). Passionate about the climate crisis.