AI Today and Everyday

Jason Caston

Published in

Let’s Learn AI — Lesson, News and Topics

5 min read · Apr 23, 2024

What’s new in the world of AI

Report: AI surpasses humans

The Rundown: Stanford University released its 2024 AI Index report, which tracks worldwide trends in AI through 2023, revealing that AI has surpassed human-level performance on several significant benchmarks.

The details:

AI now exceeds human performance on benchmarks for tasks like image classification, reading comprehension, visual reasoning, and more.
Many benchmarks have become obsolete due to AI’s acceleration, with researchers rushing to develop more tests to measure capabilities.
Closed-source models still lead (for now), and major players continue to dominate the AI industry as training costs rise.
LLMs are also becoming more truthful and less prone to “hallucinations”.

Why it matters: The most staggering part of this report is that it doesn’t even cover 2024, which has already seen major model releases like Claude 3 and Llama 3, with more to come. As quickly as things are moving, each year is likely to be wilder than the last, whether the world is ready or not.

Microsoft’s new AI model sees Mona Lisa perform rap

Our Report: Microsoft Research Asia has unveiled VASA-1, a new AI model that generates realistic talking-face video from a single picture and a speech audio clip. To demonstrate the model’s capability, the team released footage of the Mona Lisa rapping along to Anne Hathaway’s performance of “Paparazzi.”

Key Points:

VASA-1 takes an image of a person, pairs it with an audio file, and creates a video of their face, mimicking facial expressions and head motions, and synchronizing lip movements.
The demo portraits were generated with AI programs like OpenAI’s DALL-E 3, while the model itself learned realistic facial expressions and head movements from numerous video samples.
Microsoft sees VASA-1 paving the way “for real-time engagements with human-like avatars” to provide companionship and therapeutic support for those who need it.

Why you should care: Microsoft is “opposed to any behavior to create misleading or harmful contents of real persons” and, therefore, has no plans to release an online demo, API, or product, preferring to make sure the model “will be used responsibly and in accordance with proper regulations” before it launches.

Hollywood begins AI cloning

The Rundown: Leading Hollywood talent agency CAA has reportedly been testing an initiative called CAA Vault, allowing A-list clients to create AI clones of themselves to open new creative opportunities.

The details:

CAA partnered with AI firms to scan clients’ bodies, faces, and voices, creating AI replicas for uses like reshoots, dubbing, and superimposing a client’s face onto a stunt double.
CAA’s goal is to eventually make the tech available industry-wide, not just to its clients.
Hollywood has already been bracing for AI’s impact, with Tyler Perry even halting studio expansions after seeing OpenAI’s Sora video capabilities.

Why it matters: While the industry grapples with AI’s coming implications, CAA is taking proactive measures to help clients benefit from the shift. As models continue to improve, the difference between hiring Ryan Gosling or his AI replica may become imperceptible to the average fan.

Apple acquires yet another AI start-up

Following its acquisition of the start-up DarwinAI, Apple has now acquired Datakalab, a France-based start-up that specializes in AI compression and computer vision technology.
Datakalab has previously built AI tools for Paris transportation systems to monitor mask compliance during the pandemic, underscoring its technical capabilities.
It’s expected that their expertise will enhance Apple’s upcoming iOS features and Vision Pro projects, potentially impacting areas like facial recognition in Photos and Face ID.

Quick AI Notes

Google secretly working on new Gemini features

A “Live Prompts” setting has been spotted within Google’s Gemini app. Its functionality is unknown, but it’s expected to give users a new way to execute tasks or prompts.
A “List of Live prompts” has also been uncovered, suggesting that Gemini users will soon be able to create multiple automated actions or reminders through prompts.
If so, users could schedule a morning routine prompt for 7 a.m. and have Gemini execute a chain of actions like summarizing news and turning on lights.

Apple is reportedly building an LLM that will be completely on-device instead of cloud-based, which could improve privacy and speed up response times for AI tools on next-gen devices.

Meta’s Llama 3 has climbed the AI model rankings, topping GPT-4 for English prompts on the LMSYS leaderboard in a milestone for open-source LLMs.

Hugging Face introduced Open Medical-LLM, a new benchmark for evaluating AI models in healthcare tasks.

TED Talks released a viral promotional video for its recent TED2024 conference, created with OpenAI’s unreleased Sora model.

The Canadian government is setting aside $50M to retrain workers potentially displaced by AI as part of a broader $2.3B initiative to enhance AI adoption.

TikTok is reportedly developing a new AI-based text-to-speech feature that allows users to record and clone their voices in just 10 seconds for use in videos.

Blockade Labs released a new update to its Skybox AI technology, improving the generation of 3D art for 360-degree applications while also adding 8K resolution support.

Trending AI Tools

Open Agent Studio — A powerful no-code agent editor
DuckDuckGo AI Chat — Private AI chat, no AI training on your conversations
Healax — Mental health solution for students
Journey AI — Convert customer research into journey maps
Pietra Product Design Studio — Design a best-selling product using AI
SwitchLight — Switch lighting in photos within seconds
