When ML meets Product: April ’24 AI Product Updates

Keeping up to date with new AI models, products and events

Anna Via
Bootcamp
8 min read · Apr 9, 2024

April ’24 AI Product Updates, Image by author

Working in AI right now means, at least for me, dedicating a lot of time to keeping up to date with news on models, products, regulations, and more. I do this by reading a lot of newsletters, Medium blog posts, and general news and resources. This keeps me well informed about state-of-the-art models, AI industry trends, and use cases, and lets me shape all of this into product opportunities or potential new solutions to consider in the short to mid term.

This time, I’ve decided to summarize the most important recent news around GenAI products and share it. So here it is: my first AI Product Updates, which can turn into a monthly update if people find it valuable. In this update you’ll find:

  • Text, image, video, music, and voice generation updates
  • New AI use cases and trends in the industry
  • Past and upcoming AI product-related events

Text generation updates

  • OpenAI is closing deals with media companies (Prisa Media and ‘Le Monde’). The goal will be to allow users to interact with newspaper and other relevant media content through ChatGPT. Is the way we access news worldwide going to change sooner than expected?
  • In Which AI should I use? Superpowers and the State of Play, Ethan Mollick compares the latest flagship versions of the three most important text generation models on the market (OpenAI’s GPT-4, Anthropic’s Claude 3 Opus, and Google’s Gemini Advanced). The three are pretty much tied on many benchmarks and capabilities, but it is interesting that the difference seems to lie more in the tone or feeling they leave with the user.

Image & video generation updates

  • Cool product #1: OpenAI allows image editing through DALL-E in ChatGPT (YouTube demo). One of the biggest limitations I have experienced with image generation models is how hard it is to get certain details right through the prompt. This feature is a good move towards solving this pain point and generating more valuable images for the user.
  • Cool product #2: Google researchers unveil ‘VLOGGER’, which can “generate lifelike videos of people speaking, gesturing and moving — from just a single still photo”. I’ve recently found myself recording demos or short videos and taking a lot of time to get them right, on time, and without overly distracting gestures, so I can definitely see the value in a product like this (plus endless new possibilities when adding automatic translation to any language, and so on).
VLOGGER: Multimodal Diffusion for Embodied Avatar Synthesis
  • Cool product #3: Sora, the video generation product from OpenAI, has recently been pitched to the creative community. Beyond checking out the impressive videos generated by creative directors and artists, it was really interesting to see their feedback. I found it particularly refreshing to read that the biggest perceived potential was in creating surreal content and pushing imagination to the limit. Maybe GenAI in general would bring more value and fewer risks to society if it moved in that “create things that are completely new” direction, instead of the “hyper-realistic, real people cloning” direction.

Voice generation updates

  • Cool product #6: Speechify, “Cut Your Reading Time in Half. Let Speechify Read to You” (and with voices that resemble your favorite artists!)
  • Cool product #7: OpenAI Voice Engine, which uses text input and a 15-second audio sample to generate new audio, and can even translate it into languages the speaker doesn’t speak.
Examples on audio generation and automatic translations wit

I see a lot of potential with these products:

  • From a media point of view: allowing creators to reach more users around the world through speech in each user’s mother tongue. Think about audiobooks, podcasts, YouTube videos, or even this very blog post read by a voice that really resembles mine.
  • For any company thinking about expanding globally or wanting to deliver more finely personalized speech to its users. Think about being able to produce voice explanations or marketing campaigns in any language in the world, or about personalizing even further, for example for minority languages or even the different accents of a given country!

But… Generating speech that resembles someone’s voice can pose serious risks to society (fake news, fraud, misinformation…). Because of that, OpenAI is not releasing its Voice Engine tool yet.

Music generation updates

  • Cool product #4: Stability.ai introduces Stable Audio 2.0, a new audio generation model that produces music with a coherent structure up to 3 minutes in length. It can generate audio both from prompts and from audio uploaded by the user (audio-to-audio).
  • Cool product #5: Suno also introduced their version 3 product, which allows users to create full, two-minute songs in seconds. It even creates lyrics and adds voice to the songs. Check out the country song I created about the love story between Machine Learning and Product Management!
Suno’s country song “Love Among the Algorithms” based on my prompt

All these products are great for amateurs who want to create music for fun. It also feels like the music industry is already being disrupted, with these tools, at least for now, helping to create, explore, and produce parts of new songs faster.

But… Artists are worried about being replaced by AI and not being compensated fairly for their work. You can read the letter that more than 200 artists signed here.

Interesting AI use cases

The two sectors that, from what I have been reading, GenAI is revolutionizing the most are Customer Support and Marketing. To get an idea of what is already happening there in terms of AI usage and impact, here are some interesting use cases.

Customer Support

Marketing

Other trends in AI

AI agents

GenAI is moving towards agentic workflows to increase its capabilities. The idea is that, for a given task, several models are chained together to produce the best possible result. This includes the ability for those models to use tools, such as web search to obtain recent information or code execution to run calculations. A nice explanation of how this works can be found in The Batch newsletter.
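To make the idea a bit more tangible, here is a minimal Python sketch of such a workflow. It is only an illustration under loud assumptions: `planner_model`, `writer_model`, `web_search`, and the `calc:` prefix are hypothetical names I made up for this example, the two "models" are simple rule-based stand-ins rather than real LLM calls, and the search tool is a stub. A real agent framework would look quite different.

```python
import ast
import operator
from typing import Callable, Dict, List, Tuple

# Arithmetic operators the toy calculator tool is allowed to evaluate.
_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
}


def calculator(expression: str) -> str:
    """Tool 1: evaluate a basic arithmetic expression without using eval()."""
    def ev(node: ast.AST) -> float:
        if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in _OPS:
            return _OPS[type(node.op)](ev(node.left), ev(node.right))
        raise ValueError("unsupported expression")

    return str(ev(ast.parse(expression, mode="eval").body))


def web_search(query: str) -> str:
    """Tool 2 (stub): a real agent would call a search API here."""
    return f"[stubbed search results for: {query}]"


TOOLS: Dict[str, Callable[[str], str]] = {
    "calculator": calculator,
    "web_search": web_search,
}


def planner_model(task: str) -> List[Tuple[str, str]]:
    """Hypothetical stand-in for a planning model.

    Splits the task on ';' and picks a tool for each sub-task:
    parts starting with 'calc:' go to the calculator, the rest to search.
    """
    steps = []
    for part in (p.strip() for p in task.split(";")):
        if not part:
            continue
        if part.startswith("calc:"):
            steps.append(("calculator", part[len("calc:"):].strip()))
        else:
            steps.append(("web_search", part))
    return steps


def writer_model(observations: List[str]) -> str:
    """Hypothetical stand-in for a model that writes the final answer."""
    return " | ".join(observations)


def run_agent(task: str) -> str:
    """Chain the pieces: plan, call the chosen tools, then write the answer."""
    observations = [TOOLS[name](tool_input) for name, tool_input in planner_model(task)]
    return writer_model(observations)


print(run_agent("calc: 2 * (3 + 4); latest AI news"))
```

The point of the sketch is the shape of the loop: one step decides which tool to use, the tools return observations, and a final step turns those observations into an answer. In a real product each of those stand-ins would be a model call.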

A good example of where AI agents can take us in the digital product space is Devin, “the first AI software engineer”. Thanks to the chaining of agents and generative AI models, Devin goes some steps beyond the capabilities of GitHub Copilot (end-to-end app development, bug fixing, training ML models…).

Product Strategy

Zoe Scaman introduces how she leverages AI in multiple use cases and scenarios in her strategy projects: Strategy in the era of AI.

Claire Vo has developed ChatPRD, a chatbot that helps PMs write product documentation, including problem statements, business goals, and user goals, and suggests additional features like analytics.

UX relationship with AI

  • Design Principles for Generative AI Applications. GenAI brings a new interaction paradigm to products, along with variability in outputs and new risks and potential harms. The post introduces strategies to tackle this from a design point of view, by designing: responsibly, for mental models, for appropriate trust and reliance, for variability, for co-creation, and for imperfection.
  • Shape of AI introduces several AI interaction patterns that help users identify and distinguish AI features and content, understand how AI works and how to work with it, find techniques for using AI, refine outputs, and assess accuracy.

🗓️ Event radar

Past events with on-demand videos available:

  • apply() spring 24, by Tecton. Top speakers deep diving into topics like LLMs, RAG, Data Engineering, MLOps, ethics, and more!
apply() spring 24 event
  • NVIDIA GTC took place in March 2024, and most sessions will be available online from April 10th.
NVIDIA GTC 2024
Women in Product Podcast

Future events to keep an eye on:

  • 11th April (online) — Conf42 Large Language Models (LLMs). Interesting talks and tracks like: AI, APIs, business, chatbots, culture, observability, security or data. Online and for free!
Conf42 LLMs
OECD’S Regulation of AI
Data and Climate Modelling — BCN Analytics
  • 7th May (Barcelona) — Exploring strategies for Trustworthy AI by DataForGoodBCN. Second session in the Ethical AI field from DataForGoodBCN (where I actively participate as Board Member!), where we will learn from experts in the field about explainability methods and strategies to tackle responsible AI (detecting sources of uncertainty, human-AI collaboration, complying with regulations…).
Exploring strategies for Trustworthy AI — DataForGoodBCN

Wrapping it up

That was it from When ML meets Product: April ’24 AI Product Updates. I hope you enjoyed the read. I’ll be happy to hear your thoughts, questions, or suggestions.

More content about the intersection between Machine Learning and Product Management coming soon!

Anna Via
Machine Learning Product Manager @ Adevinta | Board Member @ DataForGoodBcn