When ML meets Product: April ’24 AI Product Updates
Keeping up to date with new AI models, products and events
Working in AI right now means, at least for me, dedicating a lot of time to keeping up to date with news on models, products, regulations, and more. In my case, I do this by reading a lot of newsletters, Medium blog posts, and general news and resources. This keeps me well informed about state-of-the-art models, AI industry trends, and use cases, and lets me shape all of this into product opportunities or potential new solutions to consider in the short to mid term.
This time, I’ve decided to summarize the most important recent news around GenAI products and share it. And this is it: my first AI Product Updates, which could turn into a monthly update if people find it valuable. In this update you’ll find:
- Text, image, video, music, and voice generation updates
- New AI use cases and trends in the industry
- Past and upcoming AI product-related events
Text generation updates
- OpenAI is closing deals with media companies (Prisa Media and ‘Le Monde’). The goal will be to allow users to interact with newspaper and other relevant media content through ChatGPT. Is the way we access news worldwide going to change sooner than expected?
- In “Which AI should I use? Superpowers and the State of Play”, Ethan Mollick compares the latest flagship versions of the three leading text generation models on the market (OpenAI’s GPT-4, Anthropic’s Claude 3 Opus, and Google’s Gemini Advanced). The three are pretty much tied on many benchmarks and capabilities, but it is interesting that the difference seems to lie more in the tone or feeling they leave the user with.
Image & video generation updates
- Cool product #1: OpenAI allows image editing through DALL-E in ChatGPT (YouTube demo). One of the biggest limitations I have experienced with image generation models is how hard it is to get certain details right through the prompt. This feature is a good step towards solving this pain point and generating more valuable images for the user.
- Cool product #2: Google researchers unveil ‘VLOGGER’, which can “generate lifelike videos of people speaking, gesturing and moving — from just a single still photo”. I’ve recently found myself recording demos or short videos and taking a lot of time to get them right, on time, and without overly distracting gestures, so I can definitely see the value in a product like this (plus endless new possibilities when adding automatic translation to any language, and so on).
- Cool product #3: Sora, the video generation product from OpenAI, has recently been pitched to the creative community. Beyond the impressive videos generated by creative directors and artists, it was really interesting to see their feedback. I found it particularly refreshing to read that the biggest perceived potential was in creating surreal content and pushing imagination to the limit. Maybe GenAI in general would bring more value and fewer risks to society if it moved in that “create things that are completely new” direction, instead of the “hyper-realistic, real people cloning” direction.
Voice generation updates
- Cool product #6: Speechify, “Cut Your Reading Time in Half. Let Speechify Read to You” (and with voices that resemble your favorite artists!)
- Cool product #7: OpenAI Voice Engine, which uses text input and a 15-second audio sample to generate new audio in that person’s voice, and can even translate it into languages the person doesn’t speak.
I see a lot of potential with these products:
- From a media point of view: allowing creators to reach more users around the world through speech in each user’s mother tongue. Think about audiobooks, podcasts, YouTube videos, or even this very blog post read by a voice that really resembles mine.
- For any company thinking about expanding internationally or wanting to offer more finely personalized speech to its users: think about being able to produce voice explanations or marketing campaigns in any language in the world, and to personalize even further for minority languages or different accents within a given country!
But… Generating speech that resembles someone’s voice can pose serious risks to society (fake news, fraud, misinformation…). Because of that, OpenAI is not yet releasing its Voice Engine tool.
Music generation updates
- Cool product #4: Stability AI introduces Stable Audio 2.0, a new audio generation model that produces music with a coherent structure up to 3 minutes long. It can generate audio both from text prompts and from audio uploaded by the user (audio-to-audio).
- Cool product #5: Suno also introduced their version 3 product, which allows users to create full, two-minute songs in seconds. It even creates lyrics and adds voice to the songs. Check out the country song I created about the love story between Machine Learning and Product Management!
All these products are great for amateurs who want to create music for fun. It also feels like the music industry is already being disrupted, with these tools, at least for now, helping artists create, explore, and produce parts of new songs faster.
But… Artists are worried about being replaced by AI and not being compensated fairly for their work. You can read the letter that more than 200 artists signed here.
Interesting AI use cases
From what I have been reading, the two sectors GenAI is transforming the most are Customer Support and Marketing. To give an idea of what is already happening there in terms of AI usage and impact, here are some interesting use cases.
Customer Support
- Overview from BCG on the opportunity for GenAI to transform Customer Support centers.
- Some weeks ago, Klarna (fintech, buy now pay later provider) shared that it was handling two-thirds of its customer service chats with an AI assistant. The announcement includes impressive numbers on the performance and cost savings of the service.
- Microsoft is testing an animated character to provide customer support for Xbox.
Marketing
- In “How media agencies are shifting toward generative AI content in influencer marketing”, you can read an overview of the impact GenAI is having on marketing. An interesting insight is that users seem to prefer AI-generated content because of its ability to hyper-personalize the communication to them.
- In the meantime, Google is already building GenAI into its marketing toolkit: automatic generation of keywords and headlines, new headlines displayed based on the user’s search keywords, campaign creation with descriptions and other relevant assets, image generation and editing, conversational help for ad creation, and more! These moves give an idea of how content creation for marketing or general marketplaces might evolve in the near future.
- Amazon is also launching GenAI products such as image generation and editing, using the partner’s webpage to automatically produce Amazon product listings, and more!
Other trends in AI
AI agents
GenAI is moving towards agentic workflows to increase its capabilities. The idea is that, for a given task, several models are chained together, each handling part of the work, to produce better results than a single call would. This includes the ability for those models to use tools, such as web search to obtain recent information or code execution to run calculations. A nice explanation of how this works can be found in The Batch newsletter.
A good example of where AI agents can take us in the digital product space is Devin, “the first AI software engineer”. Thanks to the chaining of agents and generative AI models, Devin goes some steps beyond the capabilities of GitHub Copilot (developing apps end to end, fixing bugs, training ML models…).
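To make the agentic loop described above a bit more concrete, here is a minimal sketch in Python. It is not how Devin or any specific product works: `planner_model`, `web_search`, `calculator`, and `run_agent` are hypothetical names invented for this illustration, the "model" is a simple rule-based stub standing in for an LLM call, and the search tool returns a canned string instead of calling a real API. The point is only the control flow: a model picks a tool, the tool returns an observation, and the loop continues until the model decides to finish.

```python
import ast
import operator
from typing import Callable, Dict, List, Tuple

# Arithmetic operators the calculator tool supports.
_OPS = {ast.Add: operator.add, ast.Sub: operator.sub,
        ast.Mult: operator.mul, ast.Div: operator.truediv}


def calculator(expression: str) -> str:
    """Tool: evaluate basic arithmetic without using eval()."""
    def _eval(node: ast.AST) -> float:
        if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in _OPS:
            return _OPS[type(node.op)](_eval(node.left), _eval(node.right))
        raise ValueError("unsupported expression")
    return str(_eval(ast.parse(expression, mode="eval").body))


def web_search(query: str) -> str:
    """Tool: hypothetical stand-in for a real search API (canned output)."""
    return f"[stub search result for: {query}]"


TOOLS: Dict[str, Callable[[str], str]] = {
    "calculator": calculator,
    "web_search": web_search,
}

# Each history entry is (action, argument, observation).
Step = Tuple[str, str, str]


def planner_model(task: str, history: List[Step]) -> Tuple[str, str]:
    """Stub 'planner': returns (action, argument).

    Assumption: a real system would call an LLM here. This rule-based
    version only illustrates the decision step.
    """
    if not history:
        # First step: choose a tool based on a crude look at the task.
        if any(ch.isdigit() for ch in task):
            return "calculator", task
        return "web_search", task
    # After one tool call, finish with the latest observation.
    return "finish", history[-1][2]


def run_agent(task: str, max_steps: int = 5) -> str:
    """Loop: ask the planner, run the chosen tool, feed the result back."""
    history: List[Step] = []
    for _ in range(max_steps):
        action, argument = planner_model(task, history)
        if action == "finish":
            return argument
        observation = TOOLS[action](argument)
        history.append((action, argument, observation))
    return "stopped: step limit reached"


print(run_agent("2 * (3 + 4)"))
print(run_agent("latest video generation news"))
```

The `max_steps` cap is a design choice worth keeping even in a toy version: since the model decides when to stop, a hard limit keeps a confused planner from looping forever.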
Product Strategy
Zoe Scaman introduces how she leverages AI in multiple use cases and scenarios in her strategy projects: Strategy in the era of AI.
Claire Vo has developed ChatPRD, a chatbot that helps PMs write product documentation, including problem statements, business goals, and user goals, and that suggests additional features like analytics.
UX relationship with AI
- Design Principles for Generative AI Applications. GenAI brings a new interaction paradigm to products, along with variability in outputs and new risks and potential harms. The post introduces strategies to tackle this from a design point of view, by designing responsibly, for mental models, for appropriate trust and reliance, for variability, for co-creation, and for imperfection.
- Shape of AI introduces several AI interaction patterns that help users identify and distinguish AI features and content, understand how AI works and how to work with it, see techniques for using AI, refine outputs, and assess accuracy.
🗓️ Event radar
Past events with on-demand videos available:
- apply() spring 24, by Tecton. Top speakers deep diving into topics like LLMs, RAG, Data Engineering, MLOps, ethics and more!
- NVIDIA GTC took place in March 2024, and most sessions will be available online from April 10th.
- I recently discovered the inspiring Women in Product Podcast, where they just started an AI Series. I particularly enjoyed the episodes “Breaking into AI Product Management” and “An executive’s perspective building AI products”. Looking forward to new episodes!
Upcoming events to keep an eye on:
- 11th April (online) — Conf42 Large Language Models (LLMs). Interesting talks and tracks like: AI, APIs, business, chatbots, culture, observability, security or data. Online and for free!
- 16th April (online) — OECD-EU Online Workshop on Regulation of Artificial Intelligence. It will explore the challenges that recent developments in AI are raising and identify areas where ex-ante regulatory interventions may be required, deep diving into foundational models, data, and cloud computing.
- 17th April (Barcelona) — Data and Climate Modelling (Data & Sustainability, Episode 2) by BCN Analytics. Second session in the Data & Sustainability series from BCN Analytics, this time exploring topics like climate prediction, prediction of meteorology-related variables, and skilful weather prediction.
- 7th May (Barcelona) — Exploring strategies for Trustworthy AI by DataForGoodBCN. Second session in the Ethical AI field from DataForGoodBCN (where I actively participate as Board Member!), where we will learn from experts in the field about explainability methods and strategies to tackle responsible AI (detecting sources of uncertainty, human-AI collaboration, complying with regulations…).
Wrapping it up
That was it for When ML meets Product: April ’24 AI Product Updates. I hope you enjoyed the read; I’ll be happy to hear your thoughts, questions, or suggestions.
More content about the intersection between Machine Learning and Product Management coming soon!