Agency AI (Agen.cy) presents the Realtime Voice AI and Multimodal Hackathon in San Francisco
Presented by Agency AI (Agen.cy and AgentOps)
280 hackers signed up to build what’s possible with the latest advancements in multimodal AI.
This is a sneak peek at the new age of the AI-first interface.
Here’s what we saw at the @aiengfoundation Realtime Voice AI and Multimodal Hackathon in San Francisco (🧵):
1/ Synergai
Navigate and control user interfaces just with voice commands
AI accessibility with voice
Jakub Neander and @MMiszy
2/ AI3D
Generate 3D objects with LCM + TripoSR, running on FAL, on the Vision Pro
🥇 1st place ($3,500 or an Apple Vision Pro)
3/ Visual Vibe
Generative AI asset prompting for DJs playing live shows
4/ BillFox
AI voice agent to automatically negotiate medical bills
5/ Short-form video generation
AI videos with talking avatars and generative music to broadcast your brand
6/ Media Assistant
Combining Moondream and @elevenlabsio to auto-annotate videos
7/ Wake me up
AI alarm clock (cloned Taylor Swift) that wakes you up with personalized conversations
🥈 2nd place (4090 GPU or $2,000 cash)
@fai_ne and @konrad_gnat
8/ Interlinked
Live auto-translation with image generation baked in.
Google Translate, but AI.
@aditya_advani, @jacobgreenway, and @snow_huo
9/ iPod (interactive podcast)
AI-enabled podcasts that let you pause the episode and start a conversation with the host.
🥉 3rd place ($600 or a mini projector)
10/ Redactable AI
AI agent that sits on Zoom calls and automatically purges PII from the transcripts
11/ Financialbird
AI voice broker that you can call and get portfolio recommendations from
12/ UForm 4 Swift
Family of pocket-sized multimodal AI models (small enough to run on watches) re-written in ONNX and ported to Swift
13/ Voice Canvas
Real-time SDXL image generation controlled by voice
14/ Masterpiece mouthpiece
Art museum exhibits explained by AI avatars
15/ FaceTime recorder
Dialogue transcriber for FaceTime calls
16/ Realtime video super-resolution
3x upscaling of live webcam footage at up to 15 FPS on CPU (running entirely on the client) using ESPCN
17/ Podcast AI
Summarize podcasts into key points and sections
18/ [Input/Retrieval]
Hardware device for quickly saving information (e.g. voice, calendar, email), then categorizing and retrieving it
19/ Dreamweaver
AI dream journal psychotherapist to help you understand your dreams
20/ Realtime Sora
Generate AI images in seconds
21/ Chirpy Chow
AI call center for restaurants to place menu orders without paying exorbitant fees to food delivery apps
Huge thanks to hackgoofer and @aiengfoundation for hosting, and to the sponsors:
@trydaily, @Oracle, @DeepgramAI, @fal_ai_data, @FireworksAI_HQ, Cloudflare
Visit Agen.cy for more agents, generative AI, and demos