Market Mapping Consumer Gen AI for Image and Video (Or, Why I’m Quitting Business School to Become an Influencer)

Elizabeth Peng
Published in Charge VC
5 min read · Feb 9, 2024

tl;dr: This article lays out a market map of consumer Gen AI tools for visual content creation and 4 opportunities for builders and investors.

“In the next 10 years, 90% of media will be synthetically created,” according to Greycroft Partner Ian Sigalow. We’re well on our way, with visual content creation tools appearing across text-to-image/text-to-video, image/video editing, and 3D modeling, among other use cases:

Market map of consumer Gen AI tools for image and video

This first wave of products focused on refining image and video models and defining general use cases, like “image enhancer” or “text-to-video” (see a16z Partner Justine Moore’s article “Why 2023 Was AI Video’s Breakout Year”). As the tech matures and more people test out AI creative tools, the next wave should bring multimodal and purpose-built applications that address specific user needs.

Here are four big opportunities to look out for in 2024:

Opportunity #1: Multimodal platforms

Consider this short made by Cameron Sim, which used 6 different applications to generate a (pretty impressive) 2-minute clip. Currently, text-to-image/video/speech, 3D modeling, animation, and editing tools are relatively fragmented.

A truly multimodal product could combine all of these tools into one platform, which should meet simple consumer and enterprise demands, even if studio filmmakers keep using best-in-class point solutions (like Runway, which was used in editing the rock scene in “Everything Everywhere All At Once”). For example, Synthesia, which raised a $90 million Series C last year, combines virtual avatar, text-to-speech, and other tools to help enterprises create marketing and training videos.

It doesn’t look like there is a winner yet for a consumer-friendly, multimodal, generative video platform (think “Adobe Photoshop for video”, which could be used for personal projects, school assignments, et cetera), so if you’re building it, let us know.

Opportunity #2: Vertical solutions

Anyone who has ever written, drawn, or produced content from scratch has probably experienced blank page syndrome (aka being overwhelmed by too many possibilities).

Image of the author while writing this article.

Popular text-to-image tools, especially Midjourney, blew up because they enabled users to generate endless creative outputs, but consumer AI 2.0 will shift toward purpose-built tools that solve specific problems. Examples of vertical solutions include Interior AI for interior design, Yodayo for anime character design, and Rosebud AI for game design.

Ideally, foundational models for these tools will be smaller and trained on domain-specific datasets so that they can generate good outputs for a particular use case. In contrast, many current models use large, non-specific training sets, which leads to average outputs.

Imagine an art generator that incorporates the golden ratio, or an interior design tool that knows you need a 6’x9’ rug for a queen-sized bed (has everyone Googled this, or just me?).

In addition to generating better outputs, vertical products are more differentiated in the market (compared to generic “image generators”) and more easily discovered by users (who can search “gen AI tool for [task]”). Vertical products can also be more consumer-friendly and easy-to-use, because feature sets can be streamlined and purpose-built (compare this to image generators today with endless and hard-to-discover features).

Opportunity #3: Prosumer revolution

With generative AI, anyone can be an influencer. Social media content creation apps dominate the video editing category (e.g., Detail, CapCut, and OpusClip for video editing and viral short clip creation), and Descript, a video and podcast publishing suite, is one of the highest-valued consumer Gen AI video platforms, with a post-money valuation of $500 million after raising its $50 million Series C. These tools make it easier than ever for someone to create social media content across a variety of formats, including images, long- and short-form videos, and podcasts.

For content creators who want to maintain a consistent brand across a variety of social media platforms, a “dream” app might unify aesthetics, tone, and messaging across image, video, and sound (podcasts). The market is meeting the demand: there are more than 50 million content creators around the world today, and Goldman Sachs predicts that the creator economy will roughly double (!) from $250 billion today to $480 billion by 2027.

(It remains to be seen whether these new tools will actually spur the growth of the creator economy by lowering barriers to entry. Fun fact: more than 50% of Gen Z would become influencers if they could.)

Opportunity #4: Extension of self

Companies like HeyGen and Replicate AI are creating realistic virtual avatars with movement and speech for enterprise and prosumer use cases (for example, recording an HR training video without using an actor, or publishing video content at scale using an influencer’s likeness).

The ability to record a video without using a live actor has deep implications for media consumption and creation, but it also brings us closer to creating embodied digital extensions of ourselves (e.g., in the metaverse).

I’ve written about Snack, the dating app that trains a bot to date for you; in the future, video generation tools could power highly realistic digital likenesses that enable people to interact in myriad digital spaces, from dates to meetings. The intersection of generative AI and the metaverse is nascent, but is an area I’ll be following.

2024 should be a big year for the application layer of gen AI, and especially for image and video, given all of the progress made on foundational models last year. We’re excited to see the emergence of more vertically-driven, purpose-built apps that help solve everyday problems.

If you’re working on a consumer app for image or video and I’ve missed you in this article (or you’d like to chat), please get in touch! I’m on Twitter (@eliza_berna) and at elizabeth@charge.vc.

This post was written by Elizabeth Peng, MBA Candidate & Venture Fellow at Columbia Business School, and edited by Brett Martin, investor at Charge.vc. Elizabeth is a native Californian, fitness/wellness junkie, and professional contrarian, having started her investing career at hedge fund Elliott Management, where she served as board observer to Coveo (an enterprise AI search company), Gigamon, and Wrike.
