VideoPoet: Revolutionizing Video Generation

Max Levko
Published in Generative world
Dec 27, 2023

Dear enthusiasts, we are excited to share some groundbreaking news with you. Google has recently announced VideoPoet, a video generation model that promises to transform the way we create video content. By generating video from simple inputs, VideoPoet could make video production accessible to a much wider range of users, particularly in digital marketing, where captivating video content plays a crucial role in engaging and retaining audiences.


VideoPoet is a generative model that produces videos from a variety of inputs, such as text, images, and existing video clips to be edited. In this blog post, we will explore its main features and showcase some examples that demonstrate its versatility.

Text-to-Video Generation

VideoPoet’s text-to-video generation turns a simple text prompt into a video that brings the description to life. Here are a couple of example prompts:

  • Two teddy bears holding hands, walking down rainy 5th Avenue. Example Video
  • A squirrel in armor riding a goose, action shot. Example Video

For more captivating examples, check out the full Text-to-Video Gallery.

Image-to-Video Generation

VideoPoet can also take an input image and animate it into a dynamic video, guided by a text prompt. Here is one example prompt:

  • A ship navigating the rough seas with several passengers on board, thunderstorm and lightning, animated oil on canvas. Example Video

Explore more mesmerizing examples in the Image-to-Video Gallery.

Video Editing and Stylization

VideoPoet can also edit existing videos, apply different visual styles, and add visual effects. Check out the full Video Editing Gallery and Stylization Gallery for examples.

Controllable Video Editing

One of VideoPoet’s most notable features is its ability to edit a subject so that it follows different motions, such as dance styles. This level of control lets users create more personalized videos.

Using Generative Models to Tell Visual Stories

To showcase VideoPoet’s capabilities, the team at Google produced a short movie composed of many clips generated by the model. The movie tells the story of a traveling raccoon, with each clip corresponding to a prompt provided by Bard. The resulting clips were then stitched together into the final movie, which was published as a YouTube Short.
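For readers who want to try the same stitching step on their own generated clips, here is a minimal sketch in Python. It is not how the Google team assembled their movie (the source does not say); it assumes the ffmpeg command-line tool and its concat demuxer, and the clip file names and the helper functions `build_concat_list` and `build_ffmpeg_command` are hypothetical names chosen for illustration. The sketch only builds the list file contents and the command; it does not run ffmpeg.

```python
def build_concat_list(clip_paths):
    """Return the text of an ffmpeg concat-demuxer list file, one clip per line, in order."""
    lines = []
    for path in clip_paths:
        # Inside single quotes, a literal quote is written as '\'' in the list file.
        escaped = str(path).replace("'", "'\\''")
        lines.append(f"file '{escaped}'")
    return "\n".join(lines) + "\n"


def build_ffmpeg_command(list_file, output_file):
    """Return the ffmpeg arguments that join the listed clips without re-encoding."""
    return [
        "ffmpeg",
        "-f", "concat",   # read the input as a concat list file
        "-safe", "0",     # accept paths that ffmpeg would otherwise reject as unsafe
        "-i", str(list_file),
        "-c", "copy",     # stream copy; assumes all clips share codec and resolution
        str(output_file),
    ]


if __name__ == "__main__":
    # Hypothetical clip names, one per story prompt, in playback order.
    clips = ["raccoon_01.mp4", "raccoon_02.mp4", "raccoon_03.mp4"]
    print(build_concat_list(clips), end="")
    print(" ".join(build_ffmpeg_command("clips.txt", "raccoon_story.mp4")))
```

To use it, save the list text to a file such as clips.txt and run the printed command. Stream copy only works when every clip has the same codec and resolution; clips that differ would need to be re-encoded instead.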

Advanced Capabilities: Video-to-Audio

VideoPoet goes beyond video generation: it can also generate audio to match an input video, without using any text as guidance. This opens up new possibilities for creating immersive multimedia experiences.


VideoPoet represents a significant step forward in video generation technology. By handling text-to-video, image-to-video, video editing, stylization, and audio generation in one model, it gives creators a broad set of tools for producing video. To learn more about VideoPoet and dive into the technical details, check out the full paper on arXiv.

Experience the future of video generation with VideoPoet!

