Generative 3D Texturing using AI Depth Estimation and Substance

Mike Voropaev
6 min read · Jan 5, 2023


Generating an AI wooden carved bas-relief with a depth map

In this article I’m sharing my workflow for turning Midjourney bas-relief generations into 3D textures with correct height maps for displacement.

You can see more results of this workflow on my Behance page.

Prerequisites:

  1. Midjourney (or any other generative AI like DALL-E or Stable Diffusion).

  2. A Stable Diffusion UI (Invoke AI and AUTOMATIC1111). You can use AUTOMATIC1111 alone, but I like the Invoke AI UI more.

  3. Substance Sampler — its AI algorithm is used for additional Normal, Roughness, and Ambient Occlusion data. Not a must for a depth map, but it improves the quality of the final result.

  4. Substance Designer — used for additional details and combining depth maps. You can use Photoshop instead, but you will miss a lot of procedural texturing tools.

  5. Any photo editing software to create simple masks.

  6. A good GPU / CPU if you want to run everything locally (I’m using a 3090 GPU). Depth estimation models eat a lot of VRAM. You can also find notebooks on Google Colab to run them in the cloud.

I won’t go through the installation process for these tools, since you can find plenty of guides on YouTube.

Flowchart

To summarize my workflow, I created this little flowchart:

Workflow overview

Generating wood carving relief in Midjourney

Of course, you can use Stable Diffusion for the same purpose, but I find it easier to get some fast results in Midjourney, since their new V4 model is awesome.

For this style of image I used the following text prompt:

mid century wooden carved relief masterpiece showing the arrival of large plane-like two-engine spaceship on the planet with aliens, super detailed, intricate details, realistic mid century sculpture art, textures, wood cracks and paint, classic art, archeology, correct shapes, wooden panel, wooden carving, detailed, crude, by Hieronymus Bosch, by Paolo Uccello, Non-Euclidian, Paradox, in a symbolic and meaningful style

Midjourney result

As you can see, the V3 image is cropped, so it makes sense to play around with the prompt a bit more. But I will just fix it in Photoshop later.

Then I’m doing upscaling in Midjourney to get a higher-resolution version of this image:

Upscaled Midjourney image

I still find some birds and other objects a bit weird and distorted, but we will fix them later.

Fixing image with Stable Diffusion (Invoke AI UI)

Invoke AI’s infinite canvas feature allows selective diffusion on the masked areas of the initial image:

Inpainting UI in Invoke AI

I recommend using specialized inpainting models:

To fix this weird shape on the sneaker, I’m selecting the needed part and typing a simple prompt:

mid century wooden carved relief of a sneaker

I’m replacing other parts of the image the same way:

Inpainting with Stable Diffusion
Inpainting with Stable Diffusion

Don’t forget that you can find the generated parts in the Invoke AI folder on your machine. If you want a bit more consistency, you can try the same seed for other inpaintings of the same objects:

Using seed

After all work is done, just save the merged canvas image.
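If you prefer scripting this step over the canvas UI, the same masked inpainting can be reproduced with the Hugging Face diffusers library. This is just a minimal sketch, assuming the runwayml/stable-diffusion-inpainting weights; the file names and seed are hypothetical:

```python
import torch
from diffusers import StableDiffusionInpaintPipeline
from PIL import Image

# Load a model fine-tuned specifically for inpainting.
pipe = StableDiffusionInpaintPipeline.from_pretrained(
    "runwayml/stable-diffusion-inpainting",
    torch_dtype=torch.float16,
).to("cuda")

# Hypothetical file names: the merged render and a mask that is
# white where Stable Diffusion should repaint.
image = Image.open("relief.png").convert("RGB").resize((512, 512))
mask = Image.open("sneaker_mask.png").convert("L").resize((512, 512))

# Fixing the seed helps keep repeated inpaintings of the same object consistent.
generator = torch.Generator("cuda").manual_seed(1234)

result = pipe(
    prompt="mid century wooden carved relief of a sneaker",
    image=image,
    mask_image=mask,
    generator=generator,
).images[0]
result.save("relief_fixed.png")
```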

Depth map generation with AUTOMATIC1111 UI

After that, you need to make sure you have the depth map script installed under the “Extensions” tab:

Extensions in AUTOMATIC1111 UI

If you don’t, you can install it from the “Available” tab:

Available addons for AUTOMATIC1111 UI

You can also find the GitHub repo here:

After installation, re-launch AUTOMATIC1111 UI and go to the “Depth” tab:

AUTOMATIC1111 depth generation

Here you can try different models for depth map generation. Feel free to experiment, but personally I’m using these three in combination:

“res101” (GPU, BOOST checked) gives a good equalized result:

res101

“dpt_beit_large_512 (midas 3.1)” (GPU, BOOST unchecked) gives nice details:

dpt_beit_large_512 (midas 3.1)

“dpt_hybrid_384 (midas 3.0)” (GPU, BOOST checked) is good at picking out the main objects and accents, but lacks detail:

dpt_hybrid_384 (midas 3.0)
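If you’d rather run depth estimation outside the UI, the MiDaS family is also available through torch.hub. Here is a minimal sketch for the dpt_hybrid_384 model (the 3.1 BEiT variants live in the same repo under different hub names; res101 comes from a separate project, BoostingMonocularDepth):

```python
import cv2
import torch

# "DPT_Hybrid" is the torch.hub name for dpt_hybrid_384.
midas = torch.hub.load("intel-isl/MiDaS", "DPT_Hybrid")
midas.to("cuda").eval()

transforms = torch.hub.load("intel-isl/MiDaS", "transforms")
transform = transforms.dpt_transform

img = cv2.cvtColor(cv2.imread("relief_fixed.png"), cv2.COLOR_BGR2RGB)
batch = transform(img).to("cuda")

with torch.no_grad():
    pred = midas(batch)
    # Resize the prediction back to the source resolution.
    pred = torch.nn.functional.interpolate(
        pred.unsqueeze(1),
        size=img.shape[:2],
        mode="bicubic",
        align_corners=False,
    ).squeeze()

# MiDaS predicts relative inverse depth (brighter = closer), which maps
# naturally onto relief height. Normalize and save as 16-bit.
depth = pred.cpu().numpy()
depth = (depth - depth.min()) / (depth.max() - depth.min())
cv2.imwrite("depth_dpt_hybrid.png", (depth * 65535).astype("uint16"))
```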

Upscaling maps

After generating the main artwork and its depth maps, I need to upscale them to 8K (the texture resolution I’m working with in this example). You can use the upscaling in the AUTOMATIC1111 UI, or the simple Upscayl app, which has most of the popular models but lacks parameters and some flexibility:

Upscaling with Upscayl

You can find the Upscayl repo and read more here:
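For the depth maps themselves, a plain high-quality resample is often the safer choice, since AI upscalers can invent high-frequency detail that reads as noise in displacement. A minimal Pillow sketch (the file names are hypothetical):

```python
from PIL import Image

# Hypothetical file name; substitute your own depth map export.
depth = Image.open("depth_dpt_hybrid.png")

# Lanczos resampling preserves the smooth gradients that matter
# for displacement, without inventing high-frequency detail.
depth_8k = depth.resize((8192, 8192), Image.LANCZOS)
depth_8k.save("depth_dpt_hybrid_8k.png")
```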

Substance Sampler AI: Image to texture

To get some small details from the initial image, I’m using Substance Sampler. It can give some nice results with its image-to-texture algorithms, but I usually avoid creating height maps here and prefer the ones I showed above.

Substance Sampler

I’m generating Normal, Roughness and AO maps with the Substance Sampler.
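As a quick sanity check outside Sampler, a normal map can also be derived straight from a height map with finite differences. A rough sketch, assuming the 16-bit height map from the earlier steps (the strength value is arbitrary):

```python
import cv2
import numpy as np

# Assumed 16-bit grayscale height map from the earlier steps.
height = cv2.imread("depth_dpt_hybrid_8k.png", cv2.IMREAD_UNCHANGED)
height = height.astype(np.float32) / 65535.0

strength = 4.0  # arbitrary; tune to taste

# Sobel filters approximate the surface gradient in x and y.
dx = cv2.Sobel(height, cv2.CV_32F, 1, 0, ksize=3) * strength
dy = cv2.Sobel(height, cv2.CV_32F, 0, 1, ksize=3) * strength
dz = np.ones_like(height)

# Normalize per pixel and pack from [-1, 1] into RGB.
n = np.dstack((-dx, -dy, dz))
n /= np.linalg.norm(n, axis=2, keepdims=True)
rgb = ((n * 0.5 + 0.5) * 255).astype(np.uint8)

cv2.imwrite("normal_from_height.png", cv2.cvtColor(rgb, cv2.COLOR_RGB2BGR))
```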

Substance Designer

Substance Designer is the final tool, used to combine all the generated maps into a nice PBR texture set.

I’m combining all depth maps into one to get a nice balanced result:

Combining depth maps

I’m also using a bit of blur to hide some artifacts.
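Outside Designer, this combine-and-blur step can be approximated in a few lines of NumPy/OpenCV. The blend weights and file names below are illustrative, not an exact recreation of the node graph: res101 as the balanced base, the BEiT map for detail, the hybrid map for the large shapes:

```python
import cv2
import numpy as np

def load01(path):
    """Load a depth map and normalize it to the 0-1 range."""
    d = cv2.imread(path, cv2.IMREAD_UNCHANGED).astype(np.float32)
    if d.ndim == 3:          # collapse RGB exports to one channel
        d = d[..., 0]
    return (d - d.min()) / (d.max() - d.min())

# Illustrative weights: a balanced base, fine detail on top,
# large shapes underneath.
combined = (
    0.5 * load01("depth_res101_8k.png")
    + 0.3 * load01("depth_beit_8k.png")
    + 0.2 * load01("depth_hybrid_8k.png")
)

# A slight Gaussian blur hides stepping and inpainting artifacts.
combined = cv2.GaussianBlur(combined, (0, 0), sigmaX=2.0)

cv2.imwrite("height_combined.png", (combined * 65535).astype(np.uint16))
```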

Then I’m adding wood details and color to the initial image procedurally:

Substance Designer procedural texturing

This way I can get consistent results even with different styles and colors from Midjourney and also add some missing wood texture details.

PBR texture set

Final Render

Using this workflow, I was able to quickly get some nice textures that can be applied to a decimated plane with displacement, used in game development, or even 3D printed with woodfill filaments:
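To test the displacement quickly, you can set this up in any DCC app. As one example, here is a minimal Blender Python sketch (run from the Scripting tab); the subdivision levels, file path, and strength are all placeholder values:

```python
import bpy

# Add a plane and give it enough geometry to displace.
bpy.ops.mesh.primitive_plane_add(size=2)
plane = bpy.context.active_object

subsurf = plane.modifiers.new("Subdiv", type='SUBSURF')
subsurf.subdivision_type = 'SIMPLE'  # uniform grid, no smoothing
subsurf.levels = 6
subsurf.render_levels = 8

# Hypothetical path to the combined height map from Designer.
tex = bpy.data.textures.new("Height", type='IMAGE')
tex.image = bpy.data.images.load("/path/to/height_combined.png")

disp = plane.modifiers.new("Displace", type='DISPLACE')
disp.texture = tex
disp.texture_coords = 'UV'
disp.strength = 0.15  # placeholder; scale to the relief depth you want
```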

Final result
Wood imitating filaments for 3D printer

This method can be used in procedural texturing as well as sculpting. I can imagine creating some alpha maps for ZBrush, but I have not tried it yet. So I see great potential here.

Please support me with likes and follow me here:

https://www.instagram.com/mikevrpv/

Hope this little breakdown was helpful!

Let me know if you have any questions and stay curious!
