GPT-4o is out! And it’s incredible. Audio & video.

Paul Pallaghy, PhD
2 min read · May 14, 2024

Today OpenAI announced a new model, GPT-4o (the ‘o’ is pronounced ‘oh’).

It’s rolling out over the next few weeks.

Right now (Tues 14 May 2024) the only update is faster and cheaper text and image responses.

Next will come audio in, audio out and video in.

On top of all that, it’s accessible to everyone for free in ChatGPT (Plus subscribers get 5x more use per day).

But the demos!

Unbelievable. Seriously.

The voice in / out is highly impressive, with near-zero latency. You can now interrupt GPT whenever you want. And the voice responses are near instant: there’s no 2–3 second delay.

But most of all the voice responses are rendered with appropriate inflection.

It’ll respond in a marketing voice, or bed-time story mode. Or even sing it to you. Whatever you ask it to do.

You can chat to it while it’s watching live video from your phone or screen.

This is game-changingly impressive. And the API is now faster and half the price.

It makes a difference when something is easy to use. Just voice chatting to GPT will be so easy, we’ll do it at the drop of a hat.

This would explain why Apple has reportedly given up on developing its own LLM for Siri and is partnering with OpenAI.

Anybody can now power an android with this technology. But its day-to-day uses and modes of use just tripled.


Paul Pallaghy, PhD

PhD Physicist / AI engineer / Biophysicist / Futurist into global good, AI, startups, EVs, green tech, space, biomed | Founder Pretzel Technologies Melbourne AU