Computing your own Depth & Shadow passes in CP3D

Julien Moreau-Mathis
Published in Community Play 3D
Jan 4, 2015

… And make everything hardware

What do you mean?

If you write shaders, you sometimes modify the geometry of your 3D objects in the vertex shader for the current frame: geometry modifiers such as waves or hardware skinning, for example. The problem is that, in most shadowing techniques, the depth values for the shadow map(s) are computed with a “depth material” that does not take into account the geometry manipulations done in your vertex program. It simply transforms the original positions of your objects' vertices.

The easy solution would be to modify the original geometry on the CPU side before rendering the object: a very unpopular technique today.

To counter this problem, CP3D gives you a smart solution: you can modify whatever you want in your vertex programs without getting shadowing artifacts ☺. Artifacts like:

The method consists of providing special depth & shadow materials to the rendering engine.

Note: To compute shadows with CP3D, there are 3 passes once we have our shadow maps (depth textures):
- The Shadow pass
- The Color pass
- The Blend pass (that multiplies the shadow pass with the color pass)
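To make the blend pass concrete, here is a minimal CPU-side sketch in C++ of what "multiplying the shadow pass with the color pass" means for a single pixel. The `Color` type and the `blendPass` function are hypothetical names used only for illustration; CP3D performs this step on the GPU, and its actual implementation may differ.

```cpp
#include <array>

// RGB color with components in [0, 1].
using Color = std::array<float, 3>;

// Hypothetical CPU stand-in for the blend pass: the shadow term
// (1 = fully lit, 0 = fully shadowed) scales the color-pass result.
Color blendPass(const Color& colorPass, const Color& shadowPass) {
    return { colorPass[0] * shadowPass[0],
             colorPass[1] * shadowPass[1],
             colorPass[2] * shadowPass[2] };
}
```

A white shadow pixel leaves the color untouched, while a darker one attenuates it, which is why a plain multiplication is enough to combine the two passes.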

What is the advantage?

With the easy solution (CPU side), you’ll probably hit your CPU limits very quickly if your scene contains a lot of vertices.
Example: for a large ocean plane, your algorithm may have to compute the new positions of more than 20 000 vertices every frame. A dirty for loop that will make your PC very angry. You’ll lose performance with the CPU at 100%, nothing on your GPU, and you’ll want to start a new job.
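As a rough illustration of that "dirty for loop", the following C++ sketch displaces every vertex of a grid on the CPU, using the same wave formula as the vertex shader presented in this article. The `Vertex` struct and the `makeGrid` and `animateOnCpu` functions are assumptions made for this example; they are not part of CP3D.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

struct Vertex {
    float x, y, z;
};

// Same wave formula as the article's vertex shader, evaluated on the CPU.
float waveOffset(float x, float z, float time) {
    return std::sin(x / 0.05f + time) * 20.0f
         + std::cos(z / 0.05f + time) * 20.0f;
}

// Builds a flat grid of side * side vertices in the XZ plane (hypothetical helper).
std::vector<Vertex> makeGrid(std::size_t side, float spacing) {
    std::vector<Vertex> grid;
    grid.reserve(side * side);
    for (std::size_t row = 0; row < side; ++row) {
        for (std::size_t col = 0; col < side; ++col) {
            grid.push_back({static_cast<float>(col) * spacing, 0.0f,
                            static_cast<float>(row) * spacing});
        }
    }
    return grid;
}

// Hypothetical "easy method": displace every vertex on the CPU each frame.
// `rest` holds the undisplaced positions; `out` receives the animated ones
// and would then have to be uploaded to the GPU again.
void animateOnCpu(const std::vector<Vertex>& rest, float time,
                  std::vector<Vertex>& out) {
    out.resize(rest.size());
    for (std::size_t i = 0; i < rest.size(); ++i) {
        out[i] = rest[i];
        out[i].y += waveOffset(rest[i].x, rest[i].z, time);
    }
}
```

A 142 × 142 grid already holds 20 164 vertices, so this loop evaluates two trigonometric functions per vertex on every frame, before the geometry is even sent to the GPU.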

The smarter solution is to compute all transformations on the GPU side, using the vertex shader. Your CPU can then relax while the job goes to the right expert: the GPU.

Result:

Create an awesome material

This material is a simple linkage between vertex & pixel shaders. The vertex program computes the new positions of vertices using a “wave function” to simulate ocean waves. The pixel program only draws the pixel using a diffuse texture.

Vertex shader (HLSL)

#include "Shaders/InternalHandler/Utils.hlsl.fx" // Include utils for hlsl

cbuffer cbParams : register(c0) {
    float4x4 mWorldViewProj;
    float time;
};

struct VS_INPUT
{
    float3 Position : POSITION;
    float4 TexCoord : TEXCOORD0;
};

struct VS_OUTPUT
{
    float4 Position : SV_Position;
    float4 TexCoord : TEXCOORD0;
};

VS_OUTPUT vertexMain(VS_INPUT In) {
    VS_OUTPUT OUT = (VS_OUTPUT)0;

    // THE Ultimate waves function
    float3 p = In.Position;
    p.y += (sin(((p.x / 0.05) + time)) * 20.0)
         + (cos(((p.z / 0.05) + time)) * 20.0);

    float4 hpos = mul(float4(p.xyz, 1.0), mWorldViewProj);
    OUT.Position = hpos;
    OUT.TexCoord = In.TexCoord;
    return (OUT);
}

Pixel Shader (HLSL)

#include "Shaders/InternalHandler/Utils.hlsl.fx" // Include utils for hlsl

// vertex shader output
struct VS_OUTPUT
{
    float4 Position : SV_Position;
    float4 TexCoord : TEXCOORD0;
};

CP3DTexture ColorMapSampler : registerTexture(t0);
SamplerState ColorMapSamplerST : register(s0);

float4 pixelMain(VS_OUTPUT In) : COLOR0
{
    return CP3DTex2D(ColorMapSampler, In.TexCoord.xy, ColorMapSamplerST);
}

With only this material, the shadow map (depth map) calculation will not take into account the transformations made in your vertex shader. The 3D object therefore stays static in the depth pass, because the rendering engine applies the default depth material. That’s why we have to use the sacred file named Utils.hlsl.fx.

Add Custom depth pass

Using the Utils.hlsl.fx header file, the depth pass is easy to implement. It relies on the conditional compilation of your programs.
To add the depth pass, simply add these lines to your vertex shader.

In the output structure:

struct VS_OUTPUT
{
    float4 Position : SV_Position;
    // I separate here for the example but we only need TEXCOORD0 here
#if defined(CP3D_COMPUTE_DEPTH_MATERIAL)
    float4 ClipPos : TEXCOORD0;
#else
    float4 TexCoord : TEXCOORD0;
#endif
};

In the vertex program:

[…]
    float4 hpos = mul(float4(p.xyz, 1.0), mWorldViewProj);
    OUT.Position = hpos;
#if defined(CP3D_COMPUTE_DEPTH_MATERIAL)
    OUT.ClipPos = computeDepthVertex(hpos);
#else
    OUT.TexCoord = In.TexCoord;
#endif
[…]

As you can see, we’re using the HLSL preprocessor to compile only the code that the GPU will execute for our custom depth pass.
In the vertex program, if “CP3D_COMPUTE_DEPTH_MATERIAL” is defined, we call a function named “computeDepthVertex(float4);”, passing it our final transformed position (hpos here). This function is defined in the Utils.hlsl.fx file and automatically returns a value computed from the maximum distance of the shadow map.

The pixel shader uses the same output structure.
Then, in the pixel program:

// Declare textures and samplers only if not computing depth
#ifndef CP3D_COMPUTE_DEPTH_MATERIAL
CP3DTexture ColorMapSampler : registerTexture(t0);
SamplerState ColorMapSamplerST : register(s0);
#endif

float4 pixelMain(VS_OUTPUT In) : COLOR0 {
    // Depth
#if defined(CP3D_COMPUTE_DEPTH_MATERIAL)
    return computeDepthPixel(In.ClipPos);
    // Initial
#else
    return CP3DTex2D(ColorMapSampler, In.TexCoord.xy, ColorMapSamplerST);
#endif
}

Here, the difference is that we declare our textures & samplers only if we’re not computing the depth pass, and we pass the output value ClipPos to the “computeDepthPixel(float4);” function.
The function “computeDepthPixel(float4);” is also defined in the Utils.hlsl.fx file and automatically returns the color of the pixel based on the distance.
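The article does not show the body of these two helpers, so the following C++ sketch only illustrates the general idea of "a depth value computed from the maximum distance" and "a pixel color based on the distance". It is an assumption, not the actual Utils.hlsl.fx code: `normalizedDepth`, `depthToColor` and `Color4` are hypothetical names, and the linear mapping is just one common way to encode depth.

```cpp
#include <algorithm>

struct Color4 {
    float r, g, b, a;
};

// ASSUMPTION: linear depth encoding, not CP3D's real implementation.
// Maps a distance from the light into [0, 1] using the maximum
// distance covered by the shadow map.
float normalizedDepth(float distanceFromLight, float maxShadowDistance) {
    const float d = distanceFromLight / maxShadowDistance;
    return std::min(std::max(d, 0.0f), 1.0f);
}

// ASSUMPTION: the depth is written as a gray level into the depth texture.
Color4 depthToColor(float depth01) {
    return {depth01, depth01, depth01, 1.0f};
}
```

With such an encoding, a point at a quarter of the maximum distance is stored as 0.25, and anything beyond the maximum distance saturates to 1.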

Conclusion

The method is pretty simple. If we are computing the depth pass (CP3D_COMPUTE_DEPTH_MATERIAL is defined), we declare our output value (ClipPos here) and call the appropriate functions defined in the Utils.hlsl.fx header file. Otherwise, we run the standard code of our custom material.

The result shows that the nodes receiving shadows now render them correctly, but the nodes casting shadows do not yet:

Add Custom Shadow pass

Once our shadow map(s) look amazing, we have to implement our custom shadow pass.
The cool thing with this method is that implementing the shadow pass is easier than the depth pass. We’re still using the HLSL preprocessor and a function defined in the Utils.hlsl.fx header file, and we only have to modify our vertex shader.

Note: this implementation is not needed if your objects only cast shadows. It is needed only if you want your objects to receive shadows. If you use light maps instead, your objects may not need to receive shadows at all.

To implement the shadow pass, add these lines in your vertex shader.

The input structure:

struct VS_INPUT {
    float3 Position : POSITION;
    float4 TexCoord : TEXCOORD0;
    // If the normal is already defined for your program,
    // this line is not needed
#if defined(CP3D_COMPUTE_SHADOWS_MATERIAL)
    float3 Normal : NORMAL;
#endif
};

The output structure:

struct VS_OUTPUT {
    float4 Position : SV_Position;
    // Depth pass
#if defined(CP3D_COMPUTE_DEPTH_MATERIAL)
    float4 ClipPos : TEXCOORD0;
    // Shadow pass
#elif defined(CP3D_COMPUTE_SHADOWS_MATERIAL)
    float4 ShadowMapSamplingPos : TEXCOORD0;
    float4 MVar : TEXCOORD1;
    // Initial
#else
    float4 TexCoord : TEXCOORD0;
#endif
};

In the vertex program:

[…]
    float4 hpos = mul(float4(p.xyz, 1.0), mWorldViewProj);
    OUT.Position = hpos;
    // Depth
#if defined(CP3D_COMPUTE_DEPTH_MATERIAL)
    OUT.ClipPos = computeDepthVertex(hpos);
    // Shadows
#elif defined(CP3D_COMPUTE_SHADOWS_MATERIAL)
    VS_OUTPUT_SHADOWS_MATERIAL outShadows
        = computeShadowsVertex(hpos, p, In.Normal);
    OUT.ShadowMapSamplingPos = outShadows.ShadowMapSamplingPos;
    OUT.MVar = outShadows.MVar;
    // Initial
#else
    OUT.TexCoord = In.TexCoord;
#endif
[…]

Here, the function “computeShadowsVertex(float4, float3, float3)” is different: it takes 3 parameters and returns a structure containing our output values (ShadowMapSamplingPos & MVar).
To compute shadows, the function needs the final transformed position (hpos), the 3D position before projection (p) and the vertex normal (Normal in VS_INPUT). Once you have added these lines, the rendering engine links our vertex program with an internal pixel program for the shadow pass. All you have to do is copy the function’s output values into the vertex shader’s output structure.
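The internal pixel program is not shown in the article. As a generic illustration of what a shadow pass typically does with a sampling position, the C++ sketch below projects a clip-space position into shadow-map texture coordinates and compares depths. Everything here is an assumption for teaching purposes: `Float4`, `Float2`, `clipToShadowUv` and `shadowTerm` are hypothetical names, the Y flip follows the usual Direct3D texture convention, and CP3D's real code may work differently.

```cpp
struct Float4 {
    float x, y, z, w;
};

struct Float2 {
    float u, v;
};

// Perspective divide, then remap from [-1, 1] to [0, 1] texture space.
// Y is flipped because texture V grows downwards in Direct3D.
Float2 clipToShadowUv(const Float4& clip) {
    return { clip.x / clip.w * 0.5f + 0.5f,
            -clip.y / clip.w * 0.5f + 0.5f };
}

// Classic shadow-map test: the fragment is shadowed when it lies farther
// from the light than the depth stored in the shadow map (minus a small
// bias that avoids self-shadowing). Returns 1 when lit, 0 when shadowed.
float shadowTerm(float storedDepth, float fragmentDepth, float bias) {
    return (fragmentDepth - bias > storedDepth) ? 0.0f : 1.0f;
}
```

Because the depth stored in the shadow map already includes the vertex displacement from the custom depth pass, the comparison stays consistent with the animated geometry.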

Finally, we have our custom shadow pass ☺

Final result:

What does the performance look like?

Using the ultra awesome Visual Studio integrated tools, I did a comparison between both methods: the easy method and the smart method (cf. demo in Conclusion).

Note: to test your application, use Debug -> Performance and Diagnostics

Here, we can see that the easy method uses 100% of one CPU core (reported as 25% because the demo is not multithreaded), with 35–40 FPS on an Intel Core i7 / HD4000. We can also see that the GPU isn’t used to its maximum, which means we can do better by using the GPU.

CPU & GPU usages using the easy method

With the smart method, there is a huge difference. The GPU is used at almost 100% and the CPU is, as I said, relaxed, with 70–75 FPS in debug mode.

CPU & GPU usages using the smart method

Conclusion

This method is particularly interesting for complex shaders such as hardware skinning programs: it lets you create your own depth & shadow passes by adding only a few lines to your code. In addition, it lets you keep your 3D transformations on the GPU side, so you can increase performance and keep your CPU for other things such as AI or gameplay.

The example’s Sources
