Volograms Meet Generative AI: Endless Creative Potential
Generative AI has disrupted text, then images, and now video; 3D is next!
This post was first shared with Volograms Newsletter subscribers; make sure you subscribe here if you want to be among the first to receive our updates.
At Volograms we are working on creative tools that will power 3D user-generated content, specialising in capturing the shape and appearance of real people in 3D in the most accessible and affordable way. Why people? Because the media we typically consume on almost any platform (from apps such as Instagram and TikTok, to content platforms like YouTube, to even pro media like TV and film!) almost always features real people!
With Volograms technology you can easily transform a photo of a person into a 3D model. No need to ask the person to spin around, to wear specific clothes or anything like that: just one photo. In fact, if you record a full video of that same person while they are talking, performing, dancing… we can generate a volumetric video sequence, recorded from a single viewpoint. It even figures out how your back looks. It’s pretty great!
Text to 3D texture
We use a full volumetric capture pipeline for this, which combines multiple AI approaches, different neural networks, and many Computer Vision algorithms to bring the person “as is” from 2D to 3D. But what if you could take this further? For example, could you use the popular text-to-image models to change the appearance of your 3D model through its textures?
To test it, we built a prototype based on Stable Diffusion that takes a text prompt as input and produces a full texture for a 3D model created with Volograms AI technology. The interesting part is that we can preserve the face of the subject (in fact, any of our semantic labels) and ask the generative model to start from there. Check out the examples below 👇
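To picture how a semantic label map lets us keep one region while the generative model fills in the rest, here is a minimal sketch in plain NumPy. This is not our actual pipeline: the label value and the toy textures are assumptions for illustration, and the real prototype conditions the diffusion model on the preserved region rather than compositing afterwards.

```python
import numpy as np

# Hypothetical label value for face pixels (our real pipeline uses its own label set).
FACE_LABEL = 1

def composite_texture(original, generated, labels, keep_label=FACE_LABEL):
    """Keep the pixels whose semantic label we want to preserve (e.g. the face),
    and take everything else from the freshly generated texture."""
    keep = (labels == keep_label)[..., None]  # H x W x 1 boolean mask
    return np.where(keep, original, generated)

# Toy example: a 4x4 "texture" with a 2x2 face region in the top-left corner.
original = np.full((4, 4, 3), 200, dtype=np.uint8)   # original texture (grey)
generated = np.zeros((4, 4, 3), dtype=np.uint8)      # generated texture (black)
labels = np.zeros((4, 4), dtype=np.uint8)
labels[:2, :2] = FACE_LABEL                          # mark the face pixels

result = composite_texture(original, generated, labels)
```

After compositing, the face region still holds the original pixels while everything else comes from the generated texture.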
I’m pretty excited about the possibilities! Imagine using this to try on different outfits, or to generate unlimited variations of game characters in a scalable way. There is still some work to do to make sure the quality of the resulting texture is high and consistent across the whole 3D model, so we will keep you updated on our progress.
Text to 3D human
You are probably wondering if our technology works with any image of a person. The answer is yes! Of course, there are limitations in terms of what is visible in the image, but there are many possibilities. For example, we have used our technology to turn photos of athletes into 3D models for broadcast studios at Fox Sports and RTÉ (check it out here), and it worked with photos and videos the broadcasters already had. So, could we use a text-to-image model to generate a photo of a person, and then turn that person into a 3D model? Let’s give it a try!
I recently got invited to Adobe Firefly, so I headed there and typed:
A full body photorealistic photograph of a fashion model wearing jeans and
sneakers with bright lighting
I know, I’m not the best prompt engineer 😅. Then I took a couple of the results and put them through our 3D reconstruction pipeline, et voilà! A full 3D model that can easily be integrated into all kinds of 3D environments.
Including Augmented Reality, of course!
What do you think? You can now create 3D humans just by writing a text prompt! Be careful with your new superpower!
Lots more coming
We are working on even more generative AI features, so make sure you follow us to get the latest updates! We also want to follow a more open development process, so we are preparing a few posts that will show how our technology works behind the scenes. We are looking forward to sharing them with you!