Faster Conversational AI
TL;DR: How to build a low-latency conversational AI pipeline end to end using open-source tools.
Prerequisites:
- Basic knowledge of generative AI and LLM orchestration tools such as LangChain and LlamaIndex.
Introduction:
Imagine having a conversation with a friend, but every response takes several seconds or more to arrive. It would be frustrating, right? The same frustration holds true for our interactions with AI systems. In today’s fast-paced world, where instant gratification often reigns supreme, the need for faster conversational AI has never been more pressing.
In this article, we embark on a journey through the evolution of conversational AI, explore the imperative for speed in this field, and introduce you to a groundbreaking application.
About:
A simple web app for fast, low-latency AI conversation, built entirely with open-source libraries.
Tech Stack:
- Next.js (Webapp)
- Socket.IO (Fast duplex communication)
- Faster Whisper (Speech to Text)
- LangChain (LLM Orchestration)
- Coqui TTS (Text to Speech)
Architecture:
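The architecture boils down to one pipeline per conversational turn: audio arrives over Socket.IO, is transcribed, fed to a streaming LLM, and synthesized back to speech. The key latency trick is not waiting for the full LLM response: complete sentences are sent to TTS as soon as they stream in. Below is a minimal, library-agnostic sketch of that orchestration; the stage functions are injected stand-ins, not the project's actual code (in practice they would wrap Faster Whisper, a LangChain streaming chain, Coqui TTS, and a Socket.IO emit).

```python
from typing import Callable, Iterator

def run_turn(
    audio: bytes,
    transcribe: Callable[[bytes], str],        # stand-in for Faster Whisper
    generate: Callable[[str], Iterator[str]],  # stand-in for a streaming LLM
    synthesize: Callable[[str], bytes],        # stand-in for Coqui TTS
    emit: Callable[[bytes], None],             # stand-in for a Socket.IO emit
) -> str:
    """One conversational turn: STT -> streaming LLM -> incremental TTS.

    Complete sentences are synthesized and emitted as soon as they arrive,
    instead of waiting for the whole reply, which cuts perceived latency.
    """
    text = transcribe(audio)
    buffer, reply = "", ""
    for token in generate(text):
        buffer += token
        reply += token
        # Flush whole sentences to TTS the moment they are complete.
        while any(p in buffer for p in ".!?"):
            idx = min(buffer.index(p) for p in ".!?" if p in buffer)
            sentence, buffer = buffer[: idx + 1].strip(), buffer[idx + 1:]
            if sentence:
                emit(synthesize(sentence))
    if buffer.strip():  # flush any trailing fragment
        emit(synthesize(buffer.strip()))
    return reply
```

Because each stage is passed in as a callable, the same loop works whether the LLM is Llama 2, OpenAI, or Falcon, and whether the client is a browser or a game engine.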
Features:
- Speech-to-text 4–5x faster than OpenAI Whisper (via Faster Whisper)
- Fast inference (LLM streaming)
- Cross platform (the same design works for web apps and metaverse clients such as Unreal Engine 5)
- Multi-language support (1100+ languages)
- Train in one language, speak in others (zero-shot voice cloning)
- Support for multiple LLMs: Llama 2, OpenAI, Falcon
- Multiple TTS backends
- Lower latency than popular hosted platforms
- Open Source
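Multi-LLM support like the feature listed above usually comes down to a thin provider abstraction: register each streaming backend under a name and select it at runtime. A hedged sketch follows; the registry and the `echo` backend are illustrative stand-ins (real backends would wrap Llama 2, OpenAI, or Falcon via LangChain), not the project's actual code.

```python
from typing import Callable, Dict, Iterator

# Hypothetical registry mapping a provider name to a streaming generator.
LLM_REGISTRY: Dict[str, Callable[[str], Iterator[str]]] = {}

def register_llm(name: str):
    """Decorator that adds a streaming LLM backend under the given name."""
    def wrap(fn: Callable[[str], Iterator[str]]):
        LLM_REGISTRY[name] = fn
        return fn
    return wrap

@register_llm("echo")
def echo_llm(prompt: str) -> Iterator[str]:
    # Stand-in backend for local testing: streams the prompt back word by word.
    for word in prompt.split():
        yield word + " "

def stream_reply(provider: str, prompt: str) -> Iterator[str]:
    """Look up a backend by name and stream its tokens."""
    return LLM_REGISTRY[provider](prompt)
```

Swapping models then becomes a one-line configuration change rather than a code change, which is what makes "bring your own LLM" practical.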