Pixelizing 3D objects

An exploration of rendering 3D objects to look like 2D pixel art sprites.

Elliot Bentine
12 min read · Sep 14, 2020

One of my fondest memories is of my brother and me playing through Breath of Fire IV on the original PlayStation. It’s a great game for many reasons, but in my opinion one of its most distinctive aspects was the art style:

A battle sequence. The scene is 3D objects, but the characters are not!
Outside of battles, the sprite/3D mix continues.

The scenes were mostly 3D objects (albeit low poly), with hand-drawn 2D sprites for the characters. This gave a really distinctive style to the game.

Breath of Fire IV used sprites for characters because they allowed a greater level of detail than could be achieved with 3D models, given the hardware of the time. Nowadays, hardware has no problem drawing meshes with millions of triangles. Even so, pixel art is still extremely popular and widely used in video games due to its crisp and distinctive style.

The farm simulator Stardew Valley received critical acclaim and won praise for its cheerful aesthetic.
Moonshell Island also has a vibrant pixel charm to it.
The upcoming Zealot mixes 3D and 2D art styles in a way that is reminiscent of Breath of Fire IV, albeit with a grubbier palette!
Songs of Conquest also mixes 2D sprites with a 3D world.

Capturing that pixel charm

I wanted to capture that ‘Breath of Fire’ aesthetic for my own video game. Although hand-drawn 2D sprites can look fantastic, there are a few disadvantages that come with using them:

  • Draw order sorting issues become apparent in more complicated scenes. Intersections between objects are also not possible, because sprites are flat. Think of two spheres side by side: one will always be drawn in front of the other, even if the geometry should intersect.
  • I’m not good at pixel art — I can make 3D models, but I am much less capable at drawing nice-looking pixel art. This is definitely a personal thing, but I am not alone!
  • Drawing sprites from different angles and animations is tedious: let’s say I finally draw my sprite and get it to look good. Now I want to view it from another angle, so I have to draw it again…and again…and again… Now imagine I want to animate it, for example to show a character running. It quickly adds up to a lot of images that must each be illustrated! For instance, a 5-frame walk cycle viewed from 8 angles at 45-degree increments already requires 5x8 = 40 different sprites.
  • Lighting and shadows — sprites can be correctly lit if the artist has also made a normal map, but this requires extra work on top of drawing the sprite in the first place. Proper shadows in a 3D scene are not possible, because the sprite lacks depth.
  • Attachments/equipment get hard: this is game specific, but comes up frequently. Imagine I have a character, and that character holds an item, let’s say a sword, which I want to be able to swap during gameplay. In what order should the character and the item be drawn? Is the sword in front of the body? Well, yes, so I should draw my character’s sprite before the sword. But the character’s hand is also in front of the sword, so I need to draw the hand after the sword. This can be solved by managing draw orders and breaking character sprites up into separate layers, but it quickly becomes tedious.

Most of these problems are much easier to solve if you use 3D objects instead of sprites. In that case, all you need to do is find a way to render 3D objects as if they were pixelized — easy, right?

This article discusses a few different approaches I tried towards achieving that effect. As you probably guessed from the article length, there are a few subtleties that make this a little complicated. This article won’t discuss other hallmarks of the pixel art style, such as reducing the color palette or adding cel-shading; those have already been addressed in good detail elsewhere. In the interests of transparency, I’ll say that I eventually developed the final method into the ProPixelizer asset (with a few added bells and whistles, such as cel-shading and object outlines).

So then, on to the main article! I’ll discuss three very different approaches below.

Attempt #1: Render to texture, then render sprites

The first method I tried was as follows:

  • The pixelized object is rendered to an off-screen render target. This render target has a very low resolution, e.g. 64 by 64, so that after rendering the object we get a low-resolution texture, which is essentially a sprite of the object.
  • The low-resolution sprite is applied to a quad in the (unpixelated) scene.
  • The final scene is rendered, with the pixelized objects hidden and replaced by their quads.

As seen in the video above, the method does give a pixelized look! There are a number of pixelization assets on the Unity store which do something similar to this technique.
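As a concrete sketch, the quad’s shader can be very simple: sample the low-resolution target with point (nearest-neighbour) filtering so the texels stay crisp when magnified. Here _SpriteTex is a hypothetical name for the off-screen render target, not the exact code I used:

// Sketch of the quad's fragment shader. _SpriteTex is the off-screen
// render target, with its filter mode set to Point (nearest neighbour).
sampler2D _SpriteTex;

float4 frag(float2 uv : TEXCOORD0) : SV_Target {
    float4 color = tex2D(_SpriteTex, uv);
    clip(color.a - 0.5); // cut out the empty area around the sprite
    return color;
}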

Compared to hand drawing 2D sprites, this method fixes the issues of lighting, of animation, and of viewing the object from different angles. It is also compatible with other methods of changing the appearance of the object, such as reducing the color palette or adding cel-shading. For instance, here is a run cycle that I implemented using this method in UE4:

However, this method was not without drawbacks! I really wanted to allow each object to move at the screen resolution, but this required having a separate off-screen render target for each moving object in the scene, which would become a serious performance overhead for large numbers of objects. It still worked reasonably well for small scenes, as seen in the next video.

This method also doesn’t really fix draw order, or give correct intersections between objects. It is also incompatible with full shadows: the pixelated objects are rendered separately from the scene, so they can’t receive shadows from it*. When the scene is rendered, the objects are replaced by flat quads, so they can’t really cast correct shadows into the scene either.

*Since I played around with this method, Unity has released the Scriptable Render Pipeline. It’s possible you could now modify shadow casting to include the original objects and exclude the quads when rendering the shadow maps, then sample these shadow maps when determining shadows on both the scene and the pixelized objects. However, I’ve moved on from this method.

Attempt #2: Baked sprites from 3D objects

My main concern with Attempt #1 was performance for large numbers of objects. All those extra render targets! Also, the lack of a proper solution to sorting and intersections bothered me.

For the next attempt, I tried baking sprites from 3D objects. This method is not a real-time method; the baking had to be done at the editor/creation stage. However, once baked the objects would be fast to render. The method would still save me the effort of hand drawing hundreds of animated sprites.

To that end, I wrote a small tool which would:

  • Load the desired object into a sample scene where the camera and low resolution Render Target were already configured.
  • Render the object from different angles. I used these renders to extract color, normal and depth information from the scene, which are saved to separate sprite sheets.

When rendering the baked sprites, I used the normals to compute lighting, and the depth information to calculate the ‘actual’ depth of the flat quad displaying the sprite. This allowed me to have ‘3D’ sprites which would properly obey intersection and sorting, as shown in the video below for two spheres:

Intersection of two sphere sprites, using a custom depth from the texture.
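The key ingredient here is writing a per-pixel depth from the fragment shader. Here is a minimal sketch of the idea; the texture names, the depth encoding and _DepthScale are illustrative rather than the exact format my tool produced:

// Sketch: give a flat quad 'real' depth by offsetting its depth-buffer
// value with a baked depth map. Names and encoding are hypothetical, and
// the sign of the offset depends on the platform's depth convention.
sampler2D _ColorTex; // baked color sprite sheet
sampler2D _DepthTex; // baked depth across the object's thickness, in [0, 1]
float _DepthScale;   // baked depth range, converted to depth-buffer units

void frag(float4 posCS : SV_POSITION, float2 uv : TEXCOORD0,
          out float4 color : SV_Target, out float depth : SV_Depth) {
    color = tex2D(_ColorTex, uv);
    clip(color.a - 0.5); // discard the empty area around the sprite

    // Offset the quad's own depth by the baked value, so the depth test
    // resolves sorting and intersections as if the 3D object were there.
    float baked = tex2D(_DepthTex, uv).r;
    depth = posCS.z + baked * _DepthScale;
}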

This method worked for sorting and intersection, and was fast! However, shadows were still not possible — the depth information held in the texture is more like a heightmap than a full 3D object, so it is not possible for the baked sprites to cast the shadows expected of the original 3D objects into the scene, or onto other sprites*. In addition, it would never work for procedural animations like ragdoll physics and cloth, because everything had to be baked in advance.

*Looking back, armed with more understanding about how shadowcasting works, I think it should be possible to get the sprites to receive the correct scene shadows — I never actually got that far on my implementation, though. Shadows between pixelized objects would still not be possible.

Attempt #3: Dithered rendering + full-screen pixelization post process

Given the shortcomings of the previous methods, I wanted to find a technique that:

  • works in realtime
  • supports shadows and shadowcasting
  • has correct intersections/depth

This final method is demonstrated in the pictures below, and is the one I eventually used for ProPixelizer. I haven’t seen this method used before, so if anyone knows of a previous example then please let me know!

A pixelized vehicle from my game. The shadow is unpixelated.
A mixed scene with pixelized objects, lighting, shadows and outlines.

This technique involves both a dithered material (which is applied to pixelized objects), and a full-screen postprocess that performs the actual pixelization.

Silhouettes

You may immediately question this. Why do I need a post process in addition to an object material? Why is it not possible to achieve the effect by just applying a pixelization material to the object?

Before we get stuck into the technical details, let me emphasize a key aspect of pixel art: the silhouette matters. A crisp and accurate silhouette is essential to how we perceive shapes in low resolution images. I can’t really say it better than this illustrated example by Franek, below:

Any good pixelization effect has to get the silhouette right!

The problem for us is that the screen pixels that should be drawn depend on whether or not the object is pixelized, and this is particularly noticeable around the silhouette. To illustrate this, consider the image below of the same stone face, pixelized by different amounts.

A stone face. Top left: pixelized into 5x5 screen pixels. Top right: 4x4 screen pixels. Bottom left: 3x3 screen pixels. Bottom right: 2x2 screen pixels. Different color palettes were also applied.

Let’s imagine the screen pixels around the top-left of each stone head.

The light grey grid shows the location of screen pixels. Left: the geometry of an unpixelated object (blue), and the pixels that it occludes (indicated by the blue line). Right: pixels occluded by the same object when pixelized into larger pixels, each occupying squares of 5x5 screen pixels.

The left image shows how the outline would look for an unpixelized object, while the right shows the pixelized object. The next image highlights the differences: black areas are screen pixels that the (unpixelized) object occupied; dark blue areas are screen pixels that the object should draw, but will not.

These differences are why we need to employ a postprocess: to correct artefacts in the silhouette that arise from screen pixels that would not otherwise be drawn.

A breakdown of the method

To render pixelized objects, we use the following approach:

  • Object shader: Render the object dithered, filling only a small number of pixels.
  • Postprocess: Fill the pixels around each dithered pixel, transforming the image from a dithered one into a pixelized one.

The dark blue squares show the dithered pixels to draw. The final look after the post process is shown in light blue.

I’ll give some concrete examples below, written in Unity’s ShaderLab syntax. Note that this isn’t the exact final shader code used in ProPixelizer; I’ve cut it back to a minimum working example.

The object shader

The object shader is relatively simple. We get the coordinate of the pixel being drawn to the screen, and then decide whether to draw or discard it. To discard, we can either use the clip command or set the alpha to 0. I took the alpha method; it is less performant, but synergizes better with ShaderGraph.

#define ROUNDING_PREC 0.999
#define PIXELSIZE 5.0

// Returns alpha_out = 1 for fragments on the dither grid, 0 otherwise.
inline void PixelClipAlpha_float(float4 posCS, float alpha_in, out float alpha_out) {
    // Snap the incoming alpha to 0 or 1, for cutout alpha testing.
    alpha_in = clamp(round(alpha_in), 0.0, 1.0);
    // Keep only screen pixels whose x and y coordinates are divisible by PIXELSIZE.
    float xfactor = step(fmod(abs(floor(posCS.x)), PIXELSIZE), ROUNDING_PREC);
    float yfactor = step(fmod(abs(floor(posCS.y)), PIXELSIZE), ROUNDING_PREC);
    alpha_out = alpha_in * xfactor * yfactor;
}

The above function can be called from a fragment shader, and will return 0.0 when a pixel should be discarded and 1.0 when it should be drawn. It essentially checks whether the screen pixel coordinate (given by the position in clip space, posCS) has x and y divisible by PIXELSIZE. I added the alpha_in input so we can still do cutout alpha testing.
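For completeness, here is a sketch of how the function might be called if you preferred the clip approach; the texture and input semantics here are illustrative, not ProPixelizer’s actual shader:

// Hypothetical usage: discard fragments that fall off the dither grid.
sampler2D _MainTex;

float4 frag(float4 posCS : SV_POSITION, float2 uv : TEXCOORD0) : SV_Target {
    float4 color = tex2D(_MainTex, uv);
    float alpha;
    PixelClipAlpha_float(posCS, color.a, alpha);
    clip(alpha - 0.5); // or output alpha and rely on alpha testing instead
    return color;
}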

The postprocess

For the postprocess, we search a 5x5 neighbourhood around each screen pixel and find the nearest pixel to the camera. We then output that color and depth:

// Copyright Elliot Bentine, 2018-
//
// A shader used to pixelise render targets.
Shader "Hidden/Pixelization" {
    Properties {
    }
    SubShader {
        Tags {
            "RenderType" = "TransparentCutout"
            "PreviewType" = "Plane"
            "RenderPipeline" = "UniversalPipeline"
        }
        Pass {
            Cull Off
            ZWrite On
            ZTest Off
            Blend Off

            HLSLPROGRAM
            #pragma vertex vert
            #pragma fragment frag
            #include "Packages/com.unity.render-pipelines.universal/ShaderLibrary/Core.hlsl"
            #include "Packages/com.unity.render-pipelines.core/ShaderLibrary/Common.hlsl"
            #include "PixelUtils.hlsl"
            #pragma shader_feature DEPTH_BUFFER_OUTPUT_ON

            UNITY_DECLARE_DEPTH_TEXTURE(_CameraDepthTexture);
            uniform sampler2D _MainTex;
            float4 _MainTex_TexelSize;

            struct v2f {
                float4 pos : SV_POSITION;
                float4 scrPos : TEXCOORD1;
            };

            struct appdata_base {
                float4 vertex : POSITION;    // The vertex position in model space.
                float3 normal : NORMAL;      // The vertex normal in model space.
                float4 texcoord : TEXCOORD0; // The first UV coordinate.
            };

            v2f vert(appdata_base v) {
                v2f o;
                o.pos = TransformObjectToHClip(v.vertex.xyz);
                o.scrPos = ComputeScreenPos(o.pos);
                return o;
            }

            void frag(v2f i, out float4 nearestColor : SV_Target, out float nearestRawDepth : SV_Depth) {
                // Shift of one pixel.
                float2 pShift = float2(_MainTex_TexelSize.x, _MainTex_TexelSize.y);

                // Start from the current pixel's own color and depth.
                float nearestDepth = 1;
                nearestRawDepth = tex2D(_CameraDepthTexture, i.scrPos.xy).r;
                nearestColor = tex2D(_MainTex, i.scrPos.xy);

                // These limits are determined by the pixel size (5x5 here).
                [unroll]
                for (int u = -2; u <= 2; u++)
                {
                    [unroll]
                    for (int v = -2; v <= 2; v++)
                    {
                        // Get coord of neighbouring pixel for sampling.
                        float2 ppos = i.scrPos.xy + float2(u * pShift.x, v * pShift.y);
                        float4 neighbour = tex2D(_MainTex, ppos);
                        float raw_depth = tex2D(_CameraDepthTexture, ppos).r;
                        float depth = Linear01Depth(raw_depth, _ZBufferParams);

                        // If the neighbouring pixel is the nearest so far, take its values.
                        bool nearer = (depth < nearestDepth);
                        nearestDepth = nearer ? depth : nearestDepth;
                        nearestRawDepth = nearer ? raw_depth : nearestRawDepth;
                        nearestColor = nearer ? neighbour : nearestColor;
                    }
                }
            }
            ENDHLSL
        }
    }
    FallBack "Diffuse"
}

This stripped-down shader will pixelize everything in the scene, but it should give you the gist. The postprocess can be performed either using a render feature (as in ProPixelizer) or using a full screen post process.

Other comments

Unlike the offscreen render target method of Attempt #1, the number of pixels that compose the object depends on its screen size. If that changes, e.g. when the object moves in a perspective projection, or when the object is scaled, then the ‘sprite’ will appear to change resolution. This is not a problem for orthographic projections, because the object’s size is independent of its position.

Performance

It’s worth making a brief remark about performance. One limitation is that the execution time of the postprocess scales as the square of the pixel size: pixelizing objects into 5x5 screen pixels already requires 25 samples per screen pixel of the scene color and depth textures.

After reading Ben Golus’ great article on outlines I wondered if the jump flood algorithm could be used to improve performance. My current understanding is that the answer is no, but I haven’t actually found time to test it yet; if I ever do, I’ll be sure to update this article with my results.

The cost of the postprocess does not depend on the number of pixelized objects, which eases the concerns I had with Attempt #1. I expect each object actually draws slightly faster if you use clip to discard, because the dithering fills an order of magnitude fewer pixels. However, that saving is probably negligible compared to the postprocess overhead.

I have a mid-range desktop PC. I bought a GTX 1060 about two years ago; it’s aged well but is definitely no longer high-end. I profiled a DX11 build running with a 1920x1080 screen resolution, and found the postprocess time to be quadratic in pixel size (as expected). For the 5x5 example in this article the time taken was 0.72ms. A selection of pixel sizes and post process times are shown in the next graph.

Performance scaling as a function of pixel size. For pixel sizes of 3x3 and larger the scaling becomes quadratic.

For moderate pixel sizes the performance is good enough for WebGL (demo here). However, performance on mobile is terrible due to the large number of samples (I can only test on my lowish-end Galaxy A10). It does work, though; the issues are only in framerate and not in rendering artefacts, so acceptable frame rates might be achievable on high-end handsets.

Other considerations for pixel art

In this article I have described the method I used for pixelizing objects in ProPixelizer, but the fun doesn’t stop there! There are a few other complications to consider:

  • You also need to pixelize the depth buffer for post-processing, and have proper mixing with transparent objects that sample the scene depth (e.g. fog, smoke).
  • You still need to snap object positions to the screen pixels to prevent pixel creep (a rough sketch of one approach follows this list).
  • I haven’t addressed outlines at all, which are really common in pixel art.
  • You might want to have the pixel size change for different objects.
  • You might want color grading and cel-shading to emulate the reduced color palettes of older consoles.
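On the second point, here is a rough sketch of one way to snap: compute the clip-space offset that moves the object’s pivot onto the macro-pixel grid, and apply the same offset to every vertex so the mesh is not distorted. It assumes an orthographic camera and Unity’s built-in _ScreenParams, and is illustrative rather than ProPixelizer’s actual implementation:

#define PIXELSIZE 5.0

// Sketch: snap an object onto the macro-pixel grid so the dither pattern
// doesn't creep across its surface as it moves. Assumes an orthographic
// projection; _ScreenParams.xy is the screen resolution in pixels.
float4 SnapVertexToGrid(float4 vertexCS, float4 pivotCS) {
    float2 screenScale = _ScreenParams.xy * 0.5;
    // Pivot position in screen-pixel units.
    float2 pivotPixels = pivotCS.xy / pivotCS.w * screenScale;
    // Offset that moves the pivot onto the macro-pixel grid.
    float2 delta = round(pivotPixels / PIXELSIZE) * PIXELSIZE - pivotPixels;
    // Apply the same offset to this vertex, converted back to clip space.
    vertexCS.xy += delta / screenScale * vertexCS.w;
    return vertexCS;
}

Here pivotCS would be the object origin transformed to clip space, e.g. TransformObjectToHClip(float3(0, 0, 0)).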

These features are all present in ProPixelizer (I have to make a living somehow!).

Conclusions

I hope you enjoyed reading this article and found it useful. Good luck with your projects, and let me know if you have any questions!
