Generative 3D Texturing using AI Depth Estimation and Substance
Generating AI wooden carved bas-relief with depth map
In this article I’m sharing my workflow of turning Midjourney bas-relief generations into 3D textures with correct height maps for displacement.
You can see more results of this workflow on my Behance page.
Prerequisites:
1. Midjourney (or any other generative AI like DALL-E or Stable Diffusion).
2. A Stable Diffusion UI (Invoke AI and AUTOMATIC1111). You can use only AUTOMATIC1111, but I like the Invoke AI UI more.
3. Substance Sampler — its AI algorithm is used for additional Normal, Roughness, and Ambient Occlusion data. Not a must for a depth map, but it improves the quality of the final result.
4. Substance Designer — used for additional details and for combining depth maps. You can use Photoshop instead, but you will miss a lot of procedural texturing tools.
5. Any photo-editing software to create simple masks.
6. A good GPU / CPU if you want to run it locally (I'm using a 3090 GPU). Depth estimation models eat a lot of VRAM. You can also find notebooks on Google Colab to run them in the cloud.
I won’t go through the installation process for these tools, since you can find plenty of information on YouTube.
Flowchart
To summarize my workflow, I created this little flowchart:
Generating wood carving relief in Midjourney
Of course, you can use Stable Diffusion for the same purpose, but I find it easier to get some fast results in Midjourney, since their new V4 model is awesome.
For this style of image I used the following text prompt:
mid century wooden carved relief masterpiece showing the arrival of large plane-like two-engine spaceship on the planet with aliens, super detailed, intricate details, realistic mid century sculpture art, textures, wood cracks and paint, classic art, archeology, correct shapes, wooden panel, wooden carving, detailed, crude, by Hieronymus Bosch, by Paolo Uccello, Non-Euclidian, Paradox, in a symbolic and meaningful style
As you can see, the V3 image is cropped, so it makes sense to play around with the prompt a bit more. But I will just fix it in Photoshop later.
Then I’m upscaling in Midjourney to get a higher-resolution version of this image:
I still find some of the birds and other objects a bit weird and distorted, but we will fix them later.
Fixing image with Stable Diffusion (Invoke AI UI)
Invoke AI’s infinite canvas feature allows selective diffusion on the masked areas of the initial image:
I recommend using specialized inpainting models:
To fix this weird shape on the sneaker, I’m selecting the needed part and typing a simple prompt:
mid century wooden carved relief of a sneaker
I’m replacing other parts of the image the same way:
Don’t forget that you can find the generated parts in the Invoke AI output folder on your machine. If you want a bit more consistency, you can try reusing the same seed for other inpaintings of the same object:
After all work is done, just save the merged canvas image.
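Invoke AI handles masking right on the canvas, but if you ever script the inpainting step yourself, the mask is just a grayscale image where white marks the pixels to regenerate. A minimal Pillow sketch (the rectangle coordinates and function name are my own, purely illustrative):

```python
from PIL import Image, ImageDraw

def make_inpaint_mask(size, box):
    """Build an inpainting mask: white (255) = regenerate, black (0) = keep.

    size -- (width, height) of the source image
    box  -- hypothetical (left, top, right, bottom) rectangle around the artifact
    """
    mask = Image.new("L", size, 0)                  # start fully black: keep everything
    ImageDraw.Draw(mask).rectangle(box, fill=255)   # paint the region to redo
    return mask

# e.g. mask out a weird shape near the sneaker
mask = make_inpaint_mask((1024, 1024), (300, 620, 460, 780))
```

The same white-on-black convention is what most inpainting pipelines expect as their mask input.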
Depth map generation with AUTOMATIC1111 UI
After that, you need to make sure you have the depth map script installed under the “Extensions” tab:
If you don’t, you can install it from the “Available” tab:
You can also find the GitHub repo here:
After installation, re-launch AUTOMATIC1111 UI and go to the “Depth” tab:
Here you can try different models for depth map generation. Feel free to experiment; personally, I use these three in combination:
- “res101” (GPU, BOOST checked) — gives a good, equalized result:
- “dpt_beit_large_512 (midas 3.1)” (GPU, BOOST unchecked) — nice details:
- “dpt_hybrid_384 (midas 3.0)” (GPU, BOOST checked) — good at picking out the main objects and accents, but lacks detail:
Upscaling maps
After generating the main artwork and its depth maps, I need to upscale them to 8K (the texture resolution I’m working with in this example). You can use the upscaling built into the AUTOMATIC1111 UI, or the simple Upscayl app, which has most of the popular models but lacks parameters and some flexibility:
You can find Upscayl repo and read more here:
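Upscayl and the AUTOMATIC1111 upscalers run learned super-resolution models (Real-ESRGAN and friends). While iterating, a plain Lanczos resize with Pillow can serve as a quick stand-in to check your layout at the target resolution — a small sketch, with placeholder file paths and a helper name of my own:

```python
from PIL import Image

def upscale_to(src_path, dst_path, target=8192):
    """Resize an image so its longer side reaches `target` pixels (Lanczos).

    A layout stand-in only -- AI upscalers recover detail that plain
    resampling cannot.
    """
    img = Image.open(src_path)
    scale = target / max(img.size)
    new_size = (round(img.width * scale), round(img.height * scale))
    img.resize(new_size, Image.LANCZOS).save(dst_path)
```

Once the composition works, swap the stand-in for a proper model-based upscale before baking final textures.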
Substance Sampler AI: Image to texture
To pull some small details from the initial image, I’m using Substance Sampler. It can give some nice results with its image-to-texture algorithms, but I usually avoid creating height maps here and prefer the ones I showed above.
I’m generating the Normal, Roughness, and AO maps with Substance Sampler.
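If you don’t have Sampler at hand, a normal map can also be derived straight from a height map by taking its gradients. A rough NumPy sketch of that idea (the `strength` knob is my own parameter, not a Substance setting):

```python
import numpy as np

def height_to_normal(height, strength=1.0):
    """Derive a tangent-space normal map (packed to 0-1 RGB) from a
    height map given as an H x W float array in 0-1."""
    dy, dx = np.gradient(height.astype(float))            # surface slopes per axis
    # per-pixel normal = (-dh/dx, -dh/dy, 1), scaled by strength, then normalized
    n = np.dstack([-dx * strength, -dy * strength, np.ones_like(dx)])
    n /= np.linalg.norm(n, axis=-1, keepdims=True)
    return n * 0.5 + 0.5                                  # pack [-1, 1] -> [0, 1]
```

A flat height map comes out as the uniform (0.5, 0.5, 1.0) color you see in neutral normal maps.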
Substance Designer
Substance Designer is the final tool, used to combine all the generated maps into a nice PBR texture set.
I’m combining all the depth maps into one to get a nice, balanced result:
I’m also using a bit of blur to hide some artifacts.
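In Designer this is a blend of grayscale nodes, but conceptually it’s just a weighted average followed by restretching to the full range. A NumPy sketch (the weights and variable names are illustrative, not the values from my graph):

```python
import numpy as np

def combine_depth_maps(maps, weights):
    """Weighted average of same-sized depth maps (float arrays in 0-1),
    restretched to the full 0-1 range afterwards."""
    w = np.asarray(weights, dtype=float)
    w /= w.sum()                                          # normalize weights to sum 1
    stacked = np.stack([np.asarray(m, dtype=float) for m in maps])
    blended = np.tensordot(w, stacked, axes=1)            # weighted per-pixel average
    lo, hi = blended.min(), blended.max()
    return (blended - lo) / (hi - lo) if hi > lo else np.zeros_like(blended)

# e.g. favor the equalized res101 map, add detail from the two MiDaS maps:
# combined = combine_depth_maps([res101, beit_512, hybrid_384], [0.5, 0.3, 0.2])
```

Weighting toward the equalized map keeps the overall relief balanced, while the detailed maps contribute the fine carving.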
Then I’m adding wood details and color to the initial image procedurally:
This way I can get consistent results even with different styles and colors from Midjourney and also add some missing wood texture details.
Final Render
Using this workflow, I was able to get some nice textures quickly that can be applied to a decimated plane with displacement, used in game development, or even 3D printed with woodfill filaments:
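The displacement step itself is simple to reason about: each vertex of a dense plane is pushed along its normal by the height value sampled underneath it. A toy NumPy version of that mapping (real renderers and DCCs do this at render or modifier time):

```python
import numpy as np

def displace_plane(height, amplitude=1.0):
    """Turn an H x W height map into H*W vertices of a unit plane in XY,
    displaced along +Z by `amplitude` times the height value."""
    h, w = height.shape
    xs, ys = np.meshgrid(np.linspace(0, 1, w), np.linspace(0, 1, h))
    return np.dstack([xs, ys, height * amplitude]).reshape(-1, 3)
```

The denser the plane, the more of the 8K height detail survives into actual geometry.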
This method can be used in procedural texturing as well as in sculpting. I can imagine creating some alpha maps for ZBrush, but I have not tried it yet. So I see great potential here.
Please support me with likes and a follow here:
https://www.instagram.com/mikevrpv/
Hope this little breakdown was helpful!
Let me know if you have any questions and stay curious!