Faster Conversational AI
TL;DR: How to build a low-latency conversational AI pipeline end to end using open-source tools.
Prerequisites:
- Basic knowledge of generative AI and LLM orchestration tools such as LangChain and LlamaIndex.
Introduction:
Imagine having a conversation with a friend, but every response takes several seconds or more to arrive. It would be frustrating, right? The same frustration holds true for our interactions with AI systems. In today’s fast-paced world, where instant gratification often reigns supreme, the need for faster conversational AI has never been more pressing.
In this article, we embark on a journey through the evolution of conversational AI, explore the imperative for speed in this field, and introduce you to a groundbreaking application.
About:
A simple web app for fast, low-latency AI conversation, built entirely with open-source libraries.
Tech Stack:
- Next.js (Webapp)
- Socket.IO (Fast duplex communication)
- Faster Whisper (Speech to Text)
- LangChain (LLM Orchestration)
- Coqui TTS (Text to Speech)
Architecture:
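The architecture boils down to one pipeline per conversational turn: audio arrives over Socket.IO, is transcribed, fed to a streaming LLM, and synthesized back to speech. The key latency trick is not waiting for the full LLM response: complete sentences are sent to TTS as soon as they stream in. Below is a minimal, library-agnostic sketch of that orchestration; the stage functions are injected stand-ins, not the project's actual code (in practice they would wrap Faster Whisper, a LangChain streaming chain, Coqui TTS, and a Socket.IO emit).

```python
from typing import Callable, Iterator

def run_turn(
    audio: bytes,
    transcribe: Callable[[bytes], str],        # stand-in for Faster Whisper
    generate: Callable[[str], Iterator[str]],  # stand-in for a streaming LLM
    synthesize: Callable[[str], bytes],        # stand-in for Coqui TTS
    emit: Callable[[bytes], None],             # stand-in for a Socket.IO emit
) -> str:
    """One conversational turn: STT -> streaming LLM -> incremental TTS.

    Complete sentences are synthesized and emitted as soon as they arrive,
    instead of waiting for the whole reply, which cuts perceived latency.
    """
    text = transcribe(audio)
    buffer, reply = "", ""
    for token in generate(text):
        buffer += token
        reply += token
        # Flush whole sentences to TTS the moment they are complete.
        while any(p in buffer for p in ".!?"):
            idx = min(buffer.index(p) for p in ".!?" if p in buffer)
            sentence, buffer = buffer[: idx + 1].strip(), buffer[idx + 1:]
            if sentence:
                emit(synthesize(sentence))
    if buffer.strip():  # flush any trailing fragment
        emit(synthesize(buffer.strip()))
    return reply
```

Because each stage is passed in as a callable, the same loop works whether the LLM is Llama 2, OpenAI, or Falcon, and whether the client is a browser or a game engine.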
Features:
- Speech-to-text 4–5x faster than OpenAI Whisper (via Faster Whisper)
- Fast inference (LLM streaming)
- Cross platform (the same design works for web apps and metaverse clients such as Unreal Engine 5)
- Multi-language support (1100+ languages)
- Train in one language, speak in others (zero-shot voice cloning)
- Support for multiple LLMs: Llama 2, OpenAI, Falcon
- Multiple TTS backends
- Lower latency than popular hosted platforms
- Open Source
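Multi-LLM support like the feature listed above usually comes down to a thin provider abstraction: register each streaming backend under a name and select it at runtime. A hedged sketch follows; the registry and the `echo` backend are illustrative stand-ins (real backends would wrap Llama 2, OpenAI, or Falcon via LangChain), not the project's actual code.

```python
from typing import Callable, Dict, Iterator

# Hypothetical registry mapping a provider name to a streaming generator.
LLM_REGISTRY: Dict[str, Callable[[str], Iterator[str]]] = {}

def register_llm(name: str):
    """Decorator that adds a streaming LLM backend under the given name."""
    def wrap(fn: Callable[[str], Iterator[str]]):
        LLM_REGISTRY[name] = fn
        return fn
    return wrap

@register_llm("echo")
def echo_llm(prompt: str) -> Iterator[str]:
    # Stand-in backend for local testing: streams the prompt back word by word.
    for word in prompt.split():
        yield word + " "

def stream_reply(provider: str, prompt: str) -> Iterator[str]:
    """Look up a backend by name and stream its tokens."""
    return LLM_REGISTRY[provider](prompt)
```

Swapping models then becomes a one-line configuration change rather than a code change, which is what makes "bring your own LLM" practical.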