OpenAI Starts Rolling Out Its Her-Like Voice Mode for ChatGPT

Published in Augmented AI, Jul 31, 2024

Welcome back, everyone. Today, I wanna talk about some intriguing developments from OpenAI. They’ve just begun rolling out an advanced voice mode for ChatGPT, and it’s got some people buzzing — mostly because it’s reminiscent of a certain AI from a popular movie. You know the one. Spoiler alert: it’s not just a romantic comedy.

What’s New?

So, what exactly is this advanced voice mode? Well, for now it’s limited to ChatGPT’s four preset voices: Juniper, Breeze, Cove, and Ember. Sounds like a lineup for a new indie band, right? But these voices are designed to provide a more natural and conversational experience. Gone are the days of robotic responses that sound like they were generated by a 1990s text-to-speech program. Instead, we’re getting something that feels a bit more alive.

Image: OpenAI Advanced Voice Mode. Oooh, shiny new features!

This new feature is currently available to a select group of ChatGPT Plus users. It’s like an exclusive club, but instead of a secret handshake, you get to talk to an AI that sounds less like a computer and more like a human. And let’s be honest, who wouldn’t want that?

The Backstory

Now, let’s rewind a bit. The rollout was delayed, and it follows a controversy involving Scarlett Johansson. Yes, the actress who played Black Widow and voiced an AI in the movie “Her.” She accused OpenAI of mimicking her voice without permission with one of its earlier presets, Sky. OpenAI, in a classic “not us” move, denied any intentional resemblance. But they did pull Sky from their library, which is a pretty big deal. It’s like saying, “Oops, we didn’t mean to crash the party. We’ll just stand outside for a bit.”

Image: the movie “Her” (Scarlett Johansson, OpenAI). You can be as happy as this guy with your new AI friend.

This incident highlights a crucial point in AI development: ethical considerations. OpenAI has been under scrutiny for how they handle voice synthesis, and this controversy was a wake-up call. They’ve since implemented more robust safety measures and are now using professional voice actors for their presets. So, no more accidental celebrity impersonations.

The Technology Behind It

Now, let’s talk tech. The advanced voice mode runs on GPT-4o, OpenAI’s multimodal model, which can tell multiple speakers apart and even sense emotional nuances in their tone. This is a significant leap forward. Imagine having a conversation where the AI can pick up on your mood. It’s like having a therapist who doesn’t charge by the hour.

Image: OpenAI GPT-4o event. Waited a millennium for this advanced voice mode.

But here’s the kicker: this isn’t just about sounding human. It’s about creating a more interactive and responsive conversational experience. Remember that demo from May where the presenters interrupted the AI mid-sentence, and it adapted seamlessly? That’s the level of engagement OpenAI is aiming for. It’s like a dance, but instead of two people, it’s you and an AI that’s trying not to step on your toes.
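If you’re curious what “interrupting mid-sentence” looks like as logic, here’s a tiny toy sketch. To be clear, this is not how OpenAI does it (they haven’t published that). The class name `ToyVoiceAssistant` and the `interrupt_at` parameter are made up for illustration, and the “microphone” is just a number standing in for the moment the user starts talking. The only idea it shows is that the assistant checks for an interruption between small chunks of speech instead of finishing the whole reply first.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class ToyVoiceAssistant:
    """Toy turn-taking loop: speaks word by word and stops when the user cuts in."""

    spoken: List[str] = field(default_factory=list)

    def speak(self, reply: str, interrupt_at: Optional[int] = None) -> bool:
        """Say `reply` one word at a time.

        `interrupt_at` is a hypothetical stand-in for a microphone signal:
        the index of the word at which the user starts talking.
        Returns True if the reply finished, False if it was cut off.
        """
        for i, word in enumerate(reply.split()):
            if interrupt_at is not None and i == interrupt_at:
                return False  # user barged in: stop talking and listen
            self.spoken.append(word)
        return True


assistant = ToyVoiceAssistant()
finished = assistant.speak(
    "The weather today is sunny with a light breeze", interrupt_at=4
)
print(finished, assistant.spoken)  # False ['The', 'weather', 'today', 'is']
```

A real system would presumably work on streaming audio rather than words, but the shape is the same: speak a little, check whether the user is talking, repeat.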

The Rollout Plan

As for the rollout, it’s happening in phases. The alpha version is available to a small group of users now, but OpenAI plans to expand access to all ChatGPT Plus users by fall 2024. So, if you’re not in the alpha group, just hang tight. It’s like waiting for the next season of your favorite show — except this time, you’re not just binge-watching; you’re actually participating in the plot.
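For the curious, phased rollouts like this are often done with a percentage gate. The sketch below is a generic illustration of that technique, not a description of OpenAI’s actual system, which isn’t public. The function name `in_rollout`, the `advanced-voice` feature label, and the user IDs are all hypothetical.

```python
import hashlib


def in_rollout(user_id: str, percent: int, feature: str = "advanced-voice") -> bool:
    """Return True if this user falls inside the first `percent` of 100 buckets.

    Hashing the feature name together with the user ID gives each user a
    stable bucket, so raising `percent` only ever adds users.
    """
    digest = hashlib.sha256(f"{feature}:{user_id}".encode("utf-8")).hexdigest()
    bucket = int(digest, 16) % 100  # stable bucket in 0..99
    return bucket < percent


# Hypothetical user IDs: widen the gate from a small alpha to everyone.
for percent in (5, 50, 100):
    enabled = [uid for uid in ("ana", "ben", "chloe", "dev") if in_rollout(uid, percent)]
    print(percent, enabled)
```

Because each user’s bucket never changes, anyone who got access at 5% still has it at 50%, which is what you want when you’re widening an alpha over time.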

Safety and Quality Measures

OpenAI has also ramped up its safety and quality measures. They’ve tested the voice capabilities with over 100 external experts, known as “red teamers,” across 45 languages. It’s like a reality show where the contestants are trying to break the AI instead of winning a million dollars. And let’s not forget the new content filters designed to block requests for copyrighted audio and harmful content. It’s all about keeping the experience safe and enjoyable.

The Future of Voice AI

So, what does this mean for the future of voice AI? Well, it’s clear that OpenAI is taking a more cautious approach. They’re prioritizing responsible AI development while still pushing the boundaries of what’s possible. This is a delicate balance, and it’s one that many companies are struggling to achieve.

But here’s the problem: as voice AI becomes more sophisticated, the potential for misuse also increases. We’re entering an era where impersonation and misinformation could become rampant. It’s like giving a toddler a paintbrush and hoping they don’t turn your living room into a Jackson Pollock.

The Solution

This is where Augmented AI University comes in. It’s a program designed to teach practical, innovative, and cutting-edge AI technologies, including Generative AI, Large Language Models, RAG, Computer Vision, and Robotics. By equipping individuals with the right knowledge and skills, we can tackle the challenges posed by advanced AI technologies.

So, if you’re interested in being part of the solution rather than the problem, consider checking out Augmented AI University. It’s time to get ahead of the curve and ensure that as we advance in AI, we do so responsibly.

Thanks for tuning in, and remember: with great power comes great responsibility. And maybe a few ethical dilemmas along the way. Until next time, keep questioning and keep learning. Enroll in Augmented AI University Today!