Technical Architecture for Digital Avatar Conversational Systems
AI Talking Avatars Day 2: Linly-Talker, CosyVoice2 and bonus DanteAI (not-talking-head)
Taking a deep dive into business use cases and a proposed technical architecture for AI talking avatars: today I explore Linly-Talker and CosyVoice2, and trial DanteAI.
I must confess, today’s series of talking head videos made me fairly uncomfortable:
The design philosophy of Linly-Talker:
create a new form of human-computer interaction that goes beyond simple Q&A. By integrating advanced technologies, it offers an intelligent digital human capable of understanding, responding to, and simulating human communication.
Breaking down talking heads, we have:
- Real-time interaction: accepting user input (voice), with real-time speech recognition and video captioning
- Parsing/transcribing, with memory to accommodate multi-turn dialogue
- Rendering the mouth and facial expressions, synced with the audio output via libraries like Wav2Lip/Wav2Lipv2/…
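To make the breakdown above concrete, here is a minimal sketch of how those stages could be chained in one conversational turn. Everything in it is an assumption for illustration: `AvatarPipeline`, `step`, `max_turns` and the `fake_*` stand-ins are hypothetical names, not Linly-Talker's or Wav2Lip's actual API. A real system would plug a speech recognizer, a language model, a TTS engine such as CosyVoice2, and a lip-sync model into the same four slots.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple

Message = Tuple[str, str]  # (role, text)


@dataclass
class AvatarPipeline:
    """Hypothetical wiring of the stages listed above (not a real library API)."""
    transcribe: Callable[[bytes], str]        # speech recognition on user audio
    reply: Callable[[List[Message]], str]     # dialogue model that sees the history
    synthesize: Callable[[str], bytes]        # text-to-speech
    animate: Callable[[bytes], bytes]         # lip sync driven by the spoken audio
    max_turns: int = 8                        # number of past messages to remember
    history: List[Message] = field(default_factory=list)

    def step(self, user_audio: bytes) -> Tuple[str, bytes]:
        """Run one turn: audio in, (answer text, rendered video bytes) out."""
        user_text = self.transcribe(user_audio)
        self.history.append(("user", user_text))
        answer = self.reply(self.history)
        self.history.append(("assistant", answer))
        # Trim the memory so the context stays bounded in long conversations.
        self.history = self.history[-self.max_turns:]
        speech = self.synthesize(answer)
        video = self.animate(speech)
        return answer, video


# Stand-in stages so the sketch runs without any model installed.
def fake_asr(audio: bytes) -> str:
    return audio.decode("utf-8")


def fake_llm(history: List[Message]) -> str:
    user_turns = [text for role, text in history if role == "user"]
    return f"Turn {len(user_turns)}: you said '{user_turns[-1]}'"


def fake_tts(text: str) -> bytes:
    return text.encode("utf-8")


def fake_lipsync(speech: bytes) -> bytes:
    return b"VIDEO:" + speech


pipeline = AvatarPipeline(fake_asr, fake_llm, fake_tts, fake_lipsync, max_turns=4)
first_answer, first_video = pipeline.step(b"hello")
second_answer, second_video = pipeline.step(b"how are you")
print(first_answer)
print(second_answer)
```

Keeping each stage behind a plain callable is only one possible design; it makes it easy to swap a component (say, a different lip-sync model) without touching the turn logic, and the trimmed `history` list is the simplest form of the multi-turn memory mentioned above.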