AI Today and Everyday — April 1, 2024

Jason Caston

Published in

Let’s Learn AI — Lesson, News and Topics

6 min read · Apr 1, 2024

What’s New in the World of AI

OpenAI unveils AI voice cloning tool

The Rundown: OpenAI has unveiled a preview of Voice Engine, a model that can clone human voices from a 15-second audio sample and generate natural-sounding speech.

The details:

The model is able to preserve the accent and emotions of the original speaker in generated speech.
Voice Engine is currently being tested by a small group of trusted partners, including AI startup HeyGen.
OpenAI has implemented safety measures like watermarking and proactive monitoring to prevent misuse.
The company revealed it first developed the tech in late 2022 and has been using it to power voices in its text-to-speech API and ChatGPT.

Why it matters: OpenAI is clearly far ahead in the space, with Voice Engine being deployed internally since 2022. However, with no public release in sight, the company seems to understand the risks, such as deepfake scams during an election year.

Apple’s iOS 18 AI strategy

The Rundown: Apple just announced that its annual Worldwide Developers Conference (WWDC) will kick off on June 10, with the company expected to reveal an AI strategy during the keynote presentation.

The details:

WWDC 2024 will take place in person at Apple’s Cupertino, CA, campus from June 10 to 14.
iOS 18 is rumored to be the most ambitious overhaul of the iPhone’s software yet.
Apple insiders believe iOS 18 will include a revamped version of Siri, AI integrated into iMessage, auto-generated playlists, an AI health coach for watches, and more.

Why it matters: Between new partnerships, research papers, and rumors, the anticipation for Apple’s consumer dive into the sector has been growing louder and louder. WWDC could mark the next defining moment in Apple’s history — and potentially put AI in the hands of billions of new users.

Create flowcharts with Claude 3

The Rundown: In this tutorial, you will learn how to create flowcharts in seconds for free with Claude 3, allowing you to communicate complex processes using a shared visual language.

Step-by-step:

Visit the Claude 3 website.
Enter the following prompt: “Create a flowchart for: [specific topic]”. Note: You can also attach a file or images.
Claude will provide you with the Mermaid code. Copy it.
Go to the Mermaid Live Editor website, replace the code on the left side with yours, and watch your code turn into a flowchart!
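
For a sense of what the Mermaid code in the steps above looks like, here is a minimal Python sketch that builds flowchart source from a list of steps. The function name to_mermaid and the node ids (S1, S2, and so on) are illustrative assumptions, not something Claude or Mermaid defines, and the code Claude returns for your topic will look different.

```python
def to_mermaid(steps, direction="TD"):
    """Return Mermaid flowchart source linking the given steps in order.

    Hypothetical helper for illustration only. Labels are assumed to
    contain no double quotes, which would need escaping.
    """
    lines = [f"flowchart {direction}"]
    # Declare one node per step, with ids S1, S2, ...
    for i, label in enumerate(steps, start=1):
        lines.append(f'    S{i}["{label}"]')
    # Connect each node to the next one with an arrow.
    for i in range(1, len(steps)):
        lines.append(f"    S{i} --> S{i + 1}")
    return "\n".join(lines)


steps = [
    "Visit the Claude 3 website",
    "Enter the flowchart prompt",
    "Copy the Mermaid code",
    "Paste it into the Mermaid Live Editor",
]
mermaid_source = to_mermaid(steps)
print(mermaid_source)
```

The printed text can be pasted into the left pane of the Mermaid Live Editor in the same way as the code Claude provides.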

Adobe announces Gen-AI tool for on-brand marketing

Our Report: Adobe has announced a new content creation platform — GenStudio — which leverages generative AI to streamline content creation, allowing marketers to manage campaigns and measure performance while remaining on-brand.

Key Points:

GenStudio (first previewed in September) is an AI platform that creates on-brand assets for email, social media, and display ads (with support for website content coming soon).
Marketers can add copy guidance, assets, and brand guidelines to GenStudio and the system will continuously check to ensure all visual and text-based content is on-brand.
It also provides marketers with integrated analytics which give them insight into which attributes, assets and campaigns are performing best, and why.

Why you should care: Adobe built GenStudio in response to a survey that revealed marketers still feel wary about using GenAI tools for content creation because they’re concerned that AI could produce damaging, off-brand messaging.

Amazon fights OpenAI with $4B Anthropic funding

Our Report: Following its $1.25B investment in September, Amazon has invested a further $2.75B in Anthropic (co-founded by ex-OpenAI engineers). Anthropic has just released Claude 3, which reportedly outperforms OpenAI’s GPT-4 on industry benchmark tests.

Key Points:

This brings Amazon’s Anthropic investment to $4B (its biggest-ever outside investment). Amazon will maintain a minority stake and won’t have a seat on the board.
Under the deal, Anthropic will use AWS as its primary cloud provider and Amazon’s chips for “mission-critical work” including safety research and future models.
Anthropic has closed five funding deals over the past year (including $500M from Google, with a commitment to increase to $2B) taking its total funding amount to $7.3B.

Why you should care: Amazon’s $4B investment in Anthropic will no doubt raise alarm bells with US antitrust regulators, who have opened investigations into big tech investments in start-ups (like Microsoft’s $10B investment in OpenAI) to ensure fair competition.

Is OpenAI finally about to pay GPT builders?

After promising to pay GPT builders by Q1, OpenAI has announced the ‘usage-based GPT earnings program’, which will allow US-based GPT developers to earn money based on usage.
A small group of developers are testing the usage-based compensation model and OpenAI will “work with the developer community” to establish the best approach.
Although it’s promising news, OpenAI hasn’t disclosed which developers are testing it, what the terms of the revenue model are, or the timeline for its rollout.

Elon Musk revealed in an X post that Grok 2 is in training and will “exceed current AI on all metrics”.

Andrew Ng, founder of DeepLearning.AI, spoke at Sequoia Capital’s AI Ascent meetup, showing that GPT-3.5 wrapped in an agentic workflow can outperform GPT-4 used with a single zero-shot prompt.

OpenAI has quietly updated an article showing a ‘DALL-E 3 editor interface’ — allowing users to edit images directly in the chat by selecting an area of the image and prompting changes.

The National WWII Museum is using AI and voice recognition to allow visitors to “converse” with World War II-era Americans through volumetric video interviews.

Japan and the US are reportedly set to announce closer cooperation in AI and semiconductors in a joint statement when Prime Minister Fumio Kishida meets with President Joe Biden next month.

Google DeepMind CEO Demis Hassabis has been awarded a knighthood in the UK for “services to artificial intelligence”.

The Indiana Pacers used Snapchat AI filters to make it look like Los Angeles Lakers fans were crying during a recent NBA game.

Belgian researchers are using AI to improve beer taste, creating AI models that predict flavor profiles and analyze chemical compositions during the brewing process.

Elon Musk announced that all X/Twitter Premium users will gain access to xAI’s Grok chatbot, which was previously limited to the higher Premium+ tier.

Airtable introduced new AI-powered summarization, categorization, and translation features to paying users.

The European Commission has warned that deepfakes have started to appear in several EU member nations’ elections and has issued guidelines on how platforms should protect against the AI risks.

Emad Mostaque posted a (now viral) selfie during a video call with Microsoft CEO Satya Nadella, just days after the former Stability AI CEO resigned from the startup.

Why Google failed to make GPT-3 + why Multimodal Agents are the path to AGI — with David Luan of Adept (64 minute read)

This article is an interview with David Luan, one of OpenAI’s early hires, a past leader of Google’s LLM efforts and co-leader of Google Brain, and the founder of Adept, one of the leading companies in the AI agents space. He discusses his time at early OpenAI and how Adept is building agents that can do anything humans can do on a computer. Google had a huge lead in AI in 2017, but it was OpenAI that ended up making GPT 1/2/3. While Google’s team created Transformers, the company’s internal processes made it difficult for its researchers to get work done. OpenAI was able to beat Google because it took big swings and stayed focused.