Agency AI (Agen.cy) presents the Realtime Voice AI and Multimodal Hackathon in San Francisco

Agen.cy
4 min read · Aug 26, 2024

Presented by Agency AI (Agen.cy and AgentOps)

Follow us on Twitter and LinkedIn

280 hackers signed up to build what’s possible with the latest advancements in multimodal AI.

This is a sneak peek at the new age of the AI-first interface.

Here’s what we saw at the @aiengfoundation Realtime Voice AI and Multimodal Hackathon in San Francisco (🧵):

1/ Synergai

Navigate and control user interfaces just with voice commands
AI accessibility with voice

Jakub Neander and @MMiszy

Twitter source

2/ AI3D

Create generative 3D objects on the Vision Pro using LCM + TripoSR with FAL

🥇 1st place ($3,500 or an Apple Vision Pro)

Ina Yosun Chang

Twitter source

3/ Visual Vibe

Generative AI asset prompting for DJs playing live shows

@therealCampbell

Twitter source

4/ BillFox

AI voice agent to automatically negotiate medical bills

@nelsondma, @DipamV, @andymarao

Twitter source

5/ Short-form video generation

AI videos with talking avatars and generative music to broadcast your brand

@Marg_Groisman

Twitter source

6/ Media Assistant

Combining Moondream and @elevenlabsio to auto-annotate videos

Aleix Conchillo Flaqué

Twitter source

7/ Wake me up

AI alarm clock (cloned Taylor Swift) that wakes you up with personalized conversations

🥈 2nd place (a 4090 GPU or $2,000 cash)

@fai_ne and @konrad_gnat

Twitter source

8/ Interlinked

Live auto-translation with image generation baked in.

Google Translate, but AI.

@aditya_advani, @jacobgreenway, and @snow_huo

Twitter source

9/ iPod (interactive podcast)

AI-enabled podcasts that let you pause the episode and start a conversation with the host.

🥉 3rd place ($600 or a mini projector)

@ananthnayak2000

Twitter source

10/ Redactable AI

AI agent that sits on Zoom calls and automatically purges PII from the transcripts

Twitter source

11/ Financialbird

AI voice broker that you can call and get portfolio recommendations from

Divija Naredla

Twitter source

12/ UForm 4 Swift

Family of pocket-sized multimodal AI models (small enough to run on watches), rewritten in ONNX and ported to Swift

Ashot Vardanian

Twitter source

13/ Voice Canvas

Real-time SDXL image generation controlled by voice

@engineer_abel

Twitter source

14/ Masterpiece mouthpiece

Art museum exhibits explained by AI avatars

Twitter source

15/ FaceTime recorder

Dialogue transcriber for FaceTime calls

Twitter source

16/ Realtime video super-resolution

3x upscaling of live webcam footage at up to 15 FPS on CPU (running purely on the client) using ESPCN

Twitter source

17/ Podcast AI

Summarize podcasts into key points and sections

Twitter source

18/ [Input/Retrieval]

Hardware device for easily saving information (e.g., voice, calendar, email), then categorizing and retrieving it

@SarkaryShahvir

Twitter source

19/ Dreamweaver

AI dream journal psychotherapist to help you understand your dreams

Twitter source

20/ Realtime Sora

Rapidly generate AI images in seconds

Matthew Heartful and @JanakAgrawal2

Twitter source

21/ Chirpy Chow

AI call center for restaurants to place menu orders without paying exorbitant fees to food delivery apps

Ayush Khandelwal and @sebheyneman

Twitter source

Huge thanks to hackgoofer and @aiengfoundation for hosting, and to the sponsors:
@trydaily, @Oracle, @DeepgramAI, @fal_ai_data, @FireworksAI_HQ, and Cloudflare

Visit Agen.cy for more agents, generative AI, and demos

Agen.cy

From the team behind AgentOps, Agency helps teams create reliable AI agents at scale.