Generate your own personal metaverse in seconds
Plus: how to turn an idea into a 2D animation in less than one minute.
Welcome to The Cusp: cutting-edge AI news (and its implications) explained in simple English.
In this week’s issue:
- Using AI to generate stylized 3D metaverse environments in seconds.
- How to turn an idea into a character, and then animate that character, for free, in less than a minute.
- A glimpse into the future: superimposing onto augmented reality with AI.
Let’s dive in.
Stylized 3D Environments with AI
AI for CGI has improved immensely in a very short time.
We went from hazy point-clouds, to basic 3D shapes, to photo-realistic objects in a few years. And now we can generate entire environments with AI.
Katsuaki Sato, founder of Space Data Co, created a whole city just a few days ago:
The city — a replica of Manhattan — was made in minutes from a series of satellite photos.
Then, Katsuaki combined the 3D cityscape with Stable Diffusion to turn the layout from photorealistic to, in his words, “a manga style.”
How can we take advantage of it?
In just the last few weeks, hobbyists have generated 3D vehicles, buildings, and even people using Stable Diffusion text prompts. Soon (< 3 years), we’ll be generating 3D worlds on the fly with nothing more than a text prompt.
The specific model Katsuaki created is closed-source. But open-source alternatives exist, and they’re catching up quickly. Here’s one way to put them to work:
- Pick your favorite video game (must be moddable or have openly accessible 3D assets — alternatively, you could use a library like Quixel Megascans).
- Fine-tune a set of Stable Diffusion models on an asset type; think potions, swords, boxes, vehicles, environments, etc.
- Generate a near-infinite variety of high-quality models for games and movies.
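To get a feel for how quickly the variety in that last step grows, here is a minimal Python sketch that just enumerates text prompts. The asset types come from the list above; the styles, materials, and prompt template are made up for illustration, and the actual generation step (feeding each prompt to a fine-tuned Stable Diffusion model) is deliberately left out.

```python
from itertools import product

# Asset types named in the steps above.
ASSET_TYPES = ["potion", "sword", "box", "vehicle"]

# Hypothetical modifiers, chosen only for illustration.
STYLES = ["cartoon", "film noir", "low-poly"]
MATERIALS = ["wooden", "iron", "crystal"]


def build_prompts(asset_types, styles, materials):
    """Return one text prompt per (asset, style, material) combination."""
    return [
        f"{material} {asset}, {style} style, game asset"
        for asset, style, material in product(asset_types, styles, materials)
    ]


prompts = build_prompts(ASSET_TYPES, STYLES, MATERIALS)
print(len(prompts))  # 4 * 3 * 3 = 36 combinations
print(prompts[0])
```

Three short lists already give 36 distinct prompts, and every extra list of modifiers multiplies that count again. That is where the "near-infinite" part comes from.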
Logically, you might then build a service around procedural asset generation. You could also partner with a larger, more entrenched company to optimize their asset creation and leverage pre-existing cashflows.
But I’ve talked about that sort of thing before, and it’s getting a bit trite. These are billion-dollar opportunities, to be sure — but what about something less lucrative (and potentially more interesting)?
- Start your own metaverse studio offering “digital experiences.”
- Leverage aforementioned environment models to quickly create 3D versions of popular locations, and then stylize them with Stable Diffusion.
- Create cartoon environments, film noir environments, “Minecraft”-style environments, and so on, leveraging 3D asset creation for a massive leg up on the competition.
- Let anybody step into a virtual world of their choosing, and sell this either standalone or to advertisers looking to capitalize on the next marketing platform.
I see 3D as the logical next step for advertisers. Advertising started with text, graduated to imagery, ascended to videos — the only thing left on the engagement ladder is interactive, 3D experiences.
With consumer VR hardware around the corner and the barrier to entry lowered greatly by technologies like Stable Diffusion, it’s only a matter of time.
Turn ideas into animated characters in seconds
In addition to environment creation, another method picking up steam is automated character design and animation.
Using models like Sketch in conjunction with Stable Diffusion, you can now generate high-quality humanoid characters, automatically identify their joints, and then use biomechanics to guesstimate their movements in 2D.
The results are often accurate, realistic, and fast (generations take just a few seconds).
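To make the "find the joints, then move them" idea concrete, here is a small Python sketch of 2D forward kinematics: given a chain of joints and an angle for each bone, it computes where every joint ends up, frame by frame. This is not how the tools mentioned above work internally. The two-bone arm, the bone lengths, and the sine-wave swing are all assumptions picked for the example.

```python
import math


def limb_positions(root, lengths, angles):
    """Forward kinematics for a 2D joint chain.

    root    -- (x, y) of the first joint (e.g. a shoulder)
    lengths -- length of each bone in the chain
    angles  -- rotation of each bone relative to its parent, in radians
    Returns the (x, y) position of every joint, starting with the root.
    """
    x, y = root
    total_angle = 0.0
    joints = [(x, y)]
    for length, angle in zip(lengths, angles):
        total_angle += angle  # child rotations accumulate down the chain
        x += length * math.cos(total_angle)
        y += length * math.sin(total_angle)
        joints.append((x, y))
    return joints


def wave_frames(n_frames, swing=math.pi / 4):
    """Yield joint positions for a simple arm wave, one list per frame."""
    for i in range(n_frames):
        phase = 2 * math.pi * i / n_frames
        shoulder = math.pi / 2           # upper arm points straight up
        elbow = swing * math.sin(phase)  # forearm swings side to side
        yield limb_positions((0.0, 0.0), [1.0, 1.0], [shoulder, elbow])


frames = list(wave_frames(8))
for shoulder_pos, elbow_pos, wrist_pos in frames:
    print(round(wrist_pos[0], 3), round(wrist_pos[1], 3))
```

A real system would detect the joints from the drawing and drive the angles with captured or learned motion rather than a sine wave, but the bookkeeping (angles in, joint positions out) has the same shape.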
How can we take advantage of this?
Animate stuff and sell it! More specifically, you could animate:
- Children’s books
- Comic books
- Light novels
- DeviantArt or ArtStation drawings
- Stable Diffusion generations
What does this mean for commercial artists?
That last bullet point raises the following question:
If you can both generate the imagery and then animate it… what’s left for commercial artists and animators?
Here’s a well-intentioned reply that I see circulating often:
Well, technology like this will enable new careers to accommodate them. They’ll learn better skills and subsequently improve both their financial situation and the economy.
And here’s the sober, more realistic answer:
All commercial artists will lose their jobs. There will be no replacement career. Anything a human can draw, design, or animate, AI will do orders of magnitude better and cheaper.
If you are in the aforementioned industries, consider this a wakeup call. Commercial art will not go the way of the industrial revolution — there won’t be more economic opportunity for people in your profession, only less.
The power loom of the 1850s had to be manually operated, which ultimately led to an increase in economic opportunity for humans. AI-powered media creation needs no such operator.
Within a few years, AI will be able to both create and judge the quality of its creations, and commercial artists will be relegated to the same economic niche as ‘artisanal clay pot makers.’
A glimpse at the future of augmented reality
Bjørn Karmann recently created a neat proof-of-concept that sits at the intersection of several explosive technologies: Stable Diffusion, augmented reality, and speech-to-text. He calls it “We See”:
In short, it enables real-time selective editing of your environment. And it’s a small precursor to what’s coming.
How can we take advantage of this?
Right now, augmented reality hardware is bulky, expensive, and buggy. But soon (think <10 years), it’ll be convenient, affordable, and seamless.
What economic opportunities will be unveiled when everyone sees the world through a digital lens?
- Selective editing of the kind Bjørn demonstrated will probably extend to every aspect of your environment. You’ll be able to stylize your input with different colors, animations, caricatures, and so on.
- Ever blocked someone on Twitter? Try blocking them in real life, or heavenbanning them. The degree of political polarization and fragility of our collective psyche will probably increase, which will no doubt have downstream societal impacts.
- The granularity of ad targeting will explode. Companies will be able to cater to specific users based on their eye movements, microexpressions, elevations in facial blood flow, and more.
That last point is important. Regulators, lovely as they may be, are usually older and out of touch with the exponential pace of technological progress.
Even a few years of unfettered access to facial metadata will be enough to catapult companies into the stratosphere. The first-mover advantage will be very real indeed.
That’s a wrap!
Enjoyed this? Consider sharing with someone you know. And if you’re reading this because someone sent it to you, get the next newsletter by signing up here.
See you next week.