ChatGPT can now see, hear, and speak

Omer Khalid, PhD
4 min readOct 4, 2023

Welcome back readers. Last week we saw some incredible progress happening in the tech industry and here is quick recap:

  • ChatGPT launched some game-changing features
  • Microsoft’s continued integration of AI into its products and services
  • Amazon’s surprise jump into the Generative AI space by investing in Anthropic AI rather building it’s own LLMs
  • Google integrates Bard into it’s apps

🗞️ Top News of the Week

Here are top 5 highlights of the week in collaboration with Joseph Hardwicke.

ChatGPT can now see, hear, and speak

OpenAI is rolling out new voice and image capabilities in ChatGPT to allow more natural conversations and ability to discuss visual information.

  • Users can now have voice conversations with ChatGPT. This allows engaging in back-and-forth dialogue, requesting information on the go, and accessing other intuitive voice-based use cases. The voice capabilities use text-to-speech and speech recognition systems developed by OpenAI.
  • Users can also now show ChatGPT images and draw on them to discuss visual information. This enables use cases like troubleshooting objects, planning meals based on fridge contents, and analysing graphs. The image features are powered by multimodal AI models GPT-3.5 and GPT-

--

--