Exploring the combination of 3D and generative AI for designers (feat. AI mockup generator)

How generative AI can be used as a renderer for 3D models.

Min Park
Bootcamp
5 min read · Jan 18, 2024


I decided to write this article inspired by Using AI for 3D rendering — a practical guide for designers by Antoine Vidal, as I was exploring similar things at the time. I hope it can be helpful for designers and artists who want to learn more about the use of generative AI. I got some light proofreading help from GPT.

Demo of the mockup generator that I built

For those who want to try the prototype first: https://mockup-generator-chi.vercel.app/. It might take a few minutes at first while the model loads, but please wait and you’ll get a result!

1. Embracing 3D and AI for Creative Freedom

As a software developer and creative enthusiast, I’ve always been fascinated by the limitless potential of 3D design. The ability to manipulate light and camera angles offers a playground for imagination, something often constrained in the 2D realm. However, getting a high-quality 3D render was daunting, especially for beginners like me. The quest for the right Physically Based Rendering (PBR) materials, the variance in results across different renderers and lighting conditions, and the plethora of software options (Blender, KeyShot, Rhino, 3DS Max, Cinema 4D, etc.) create a steep learning curve.

This complexity led me to explore generative AI, particularly Midjourney, which delivers high-quality, realistic images. I discovered that the core purpose of generative AI and 3D — creating visually stunning, non-existent imagery — is the same. This epiphany led me to wonder: why couldn’t generative AI become an alternative to traditional 3D rendering?

At that time, a friend of mine was searching all over the internet for a box package mockup for his own design, but he wasn’t satisfied with the images and complained about spending too much time on it. That gave me an idea, and I built a small prototype that combines 3D and generative AI to generate box mockup images to his taste.

2. Building the Mockup Generator: A Blend of 3D and AI

This is a simple prototype that you can try. The process is straightforward.

Create your own box (feat. 3D)

You can customize the size, rotation, and zoom on your own. This is pretty much like setting up your own scene.

Create your own box size

AI-Generated Render

With a click on ‘generate’, the Stable Diffusion v1.5 model, enhanced with ControlNet’s depth-map conditioning, renders a mockup with a simple white background, capturing the box’s shape perfectly.

The base Stable Diffusion v1.5 model was fine for me, because the task only requires a simple paper box on a white background. The most important part was keeping the box shape exactly as it is. For this, I used ControlNet, specifically its depth-map mode.

ControlNet is a model used in AI image generation that guides the output toward specific details like shape, lines, size, or position. It is mainly used together with Stable Diffusion, and it’s particularly useful for controlling the detailed composition of an image while it is being generated. When used with Stable Diffusion, you can choose a specific type of conditioning, such as ‘canny’ for lines and ‘depth’ for more detailed shape and depth.
A depth map is an image that shows how far away each part of a scene is: closer objects look lighter and farther objects appear darker, which helps create realistic results based on 3D perception. This article explains the concept well.
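
*To make the idea concrete, here is a minimal, illustrative Three.js sketch of producing a depth map from a scene. It is not part of the prototype — in the app, the Replicate model estimates the depth map from the uploaded snapshot on its own — so treat this purely as a way to understand the concept.

import * as THREE from "three";

// Render the scene once with every mesh drawn by depth only.
// With MeshDepthMaterial, nearer surfaces come out lighter and
// farther ones darker, exactly like the depth maps described above.
function renderDepthMap(renderer, scene, camera) {
  const previousMaterial = scene.overrideMaterial;
  scene.overrideMaterial = new THREE.MeshDepthMaterial();
  renderer.render(scene, camera);
  // Read the rendered frame back as a PNG data URL.
  const dataUrl = renderer.domElement.toDataURL("image/png");
  scene.overrideMaterial = previousMaterial; // restore original materials
  return dataUrl;
}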

At first, I tried ControlNet with Canny, which conditions the image on lines only. But the generated result was bad, with an unclear box shape. So I tried ControlNet with a depth map and got a much better result. Though the result image could still be improved, including its lighting and colors, I could see the possibility. I guess the output quality could be improved further by fine-tuning on a few high-quality images.

Generated results

3. The Technical Backbone

For people who want to know the technical details behind it, I will explain the infrastructure briefly.

UI (frontend)

The interface is built with Three.js and React Three Fiber on top of Next.js for real-time 3D manipulation directly in the web browser. Once you have set the size and camera position, you can click the render button. The app then captures a snapshot of the 3D scene as a PNG image and sends it to the AI API, which generates a mockup image based on it.
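
*The snapshot-and-send step is not shown in the code below, but a minimal sketch of it might look like the following. The route name "/api/generate" is an assumption, not necessarily the prototype’s actual route. Note that the gl={{ preserveDrawingBuffer: true }} flag on the Canvas below is what keeps the rendered frame readable, so toDataURL() returns the actual scene instead of a blank image.

// Hypothetical capture step; canvasRef points at the <Canvas> element below.
async function requestMockup(canvasRef) {
  // Snapshot the 3D scene as a PNG data URL.
  const base64data = canvasRef.current.toDataURL("image/png");

  // Send the snapshot to the backend, which forwards it to the AI model.
  const res = await fetch("/api/generate", {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ image: base64data }),
  });
  return res.json();
}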

*This is part of the frontend code, using React Three Fiber.

import { Suspense, useRef } from "react";
import { Canvas } from "@react-three/fiber";
import { Environment, OrbitControls, OrthographicCamera } from "@react-three/drei";

// BoxMesh and Loader are the app's own components, defined elsewhere.
export default function Scene({ dimensions, canvasRef }) {
  const cameraCenter = [0, 0, 0];
  const ref = useRef();
  return (
    <div className="flex flex-col w-full gap-4">
      <div
        ref={ref}
        className="border box-border border-black"
        style={{
          width: "100%",
          aspectRatio: 1,
          maxWidth: "512px",
          maxHeight: "512px",
        }}
      >
        {/* preserveDrawingBuffer keeps the frame readable for the PNG snapshot */}
        <Canvas
          ref={canvasRef}
          shadows
          gl={{ preserveDrawingBuffer: true }}
        >
          {/* Fixed orthographic camera looking at the scene center */}
          <OrthographicCamera
            makeDefault
            position={[5, 3, 5]}
            zoom={70}
            onUpdate={(self) => self.lookAt(...cameraCenter)}
          />
          {/* HDR environment lighting suspends while the file loads */}
          <Suspense fallback={<Loader />}>
            <Environment files="/keyshot.hdr" />
          </Suspense>
          <OrbitControls target={cameraCenter} />
          {/* The user-sized box */}
          <BoxMesh dimensions={dimensions} />
        </Canvas>
      </div>
    </div>
  );
}
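
*The BoxMesh component referenced above is not shown in the article. As a rough illustration, a box that follows the user-picked dimensions could look like this; the exact shape of the dimensions prop is an assumption.

// Hypothetical sketch of BoxMesh, assuming dimensions = { width, height, depth }.
function BoxMesh({ dimensions }) {
  const { width, height, depth } = dimensions;
  return (
    <mesh castShadow receiveShadow>
      <boxGeometry args={[width, height, depth]} />
      {/* plain white material, matching the "white paper box" prompt */}
      <meshStandardMaterial color="white" />
    </mesh>
  );
}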

API (backend)

I used Replicate, which lets you call pre-built AI models through an API quickly and easily. If you need a well-known AI model, you can usually find its API on Replicate and use it directly with minimal setup: just an API key and a few blocks of code. I used this specific model.

*This is part of the backend code requesting the Replicate API.

const response = await fetch("https://api.replicate.com/v1/predictions", {
  method: "POST",
  headers: {
    Authorization: `Token ${process.env.REPLICATE_API_TOKEN}`,
    "Content-Type": "application/json",
  },
  body: JSON.stringify({
    // Version hash of the ControlNet model on Replicate
    version:
      "8ebda4c70b3ea2a2bf86e44595afb562a2cdf85525c620f1671a78113c9f325b",
    // input must be a plain object inside the JSON body, not a nested string
    input: {
      image: base64data, // PNG snapshot of the 3D scene
      prompt:
        "A white paper box, white background, 4k, uhd, mockup design, photorealistic, 3d render, minimal",
      model_type: "depth", // use the depth-map variant of ControlNet
      num_samples: "4", // generate four candidate images
      // negative prompt: things the model should avoid
      n_prompt:
        "text, sticker, image, longbody, lowres, bad anatomy, bad hands, missing fingers, extra digit, fewer digits, cropped, worst quality, low quality",
    },
  }),
});
const prediction = await response.json();
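
*One detail worth noting: Replicate predictions run asynchronously, so the response above comes back with a status like "starting" rather than a finished image. A minimal polling sketch, assuming prediction is the parsed response from the request above:

// Poll the prediction until Replicate reports a terminal status.
async function waitForPrediction(prediction) {
  while (prediction.status !== "succeeded" && prediction.status !== "failed") {
    await new Promise((resolve) => setTimeout(resolve, 1000)); // wait 1s between polls
    const res = await fetch(
      `https://api.replicate.com/v1/predictions/${prediction.id}`,
      { headers: { Authorization: `Token ${process.env.REPLICATE_API_TOKEN}` } }
    );
    prediction = await res.json();
  }
  return prediction.output; // the generated image URLs
}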

4. Shift from Search to Generate

The true power of generative AI lies in its ability to create instead of search. It marks a shift from searching stock images and online repositories for concept or brand imagery to generating unique, personalized visuals. There is still a ton to improve, but the possibilities are endless. As I delve deeper into this field, I am excited to explore and share the evolving landscape of AI in design and creativity.

Feel free to leave a reply if you have any questions. I am open to collaboration!
