Batching tamed: reducing batches via UI mask optimization


In this real-world batching case, we optimize UI masks in Unity to reduce the number of batches, simplifying the object hierarchy and increasing interface FPS in the process.

Let’s talk about interface optimization tasks, something we deal with quite often. In particular, we sometimes have to debug screens that have grown in complexity over time. Although these screens worked fine before, as their complexity grew, problems quietly accumulated behind the scenes and began to snowball. This accumulation continues until, finally, the issues become visible to the naked eye. At that point we face a choice: we can either redo everything and set the ship right from the start, or we can solve the problems one by one.

To illustrate one such case: at a certain point during our work on War Robots, we needed to optimize the offers screen, because, as it turned out, Unity required more than 300 batches to draw it. For comparison, a much more complex hangar screen, comprising a 3D scene, 3D and 2D interfaces, effects, and animations, was drawn in about 100 batches.

My name is Sergey Madekin, and I’m a Senior Developer at Pixonic, MY.GAMES. In this article, we’ll look at another case we tackled and solved in our work, and talk about how we managed to fix dynamic batching, simplify the hierarchy, and increase the FPS in the interface.

Understanding batches and batching

First of all, let’s understand what a batch is. A batch is a single command from the CPU containing the data and the instruction that the GPU uses to draw part of the image on the screen. In fact, one frame consists of many batches (much like layers in a graphics editor). Now, decreasing the number of batches doesn’t guarantee higher FPS, but this kind of optimization often does bring a performance increase.

In Unity, it’s possible to enable an automatic process that combines such commands: batching. If two or more consecutive commands must be drawn with the same material, the data from all of them is combined and sent as one batch.

Looking at the number of batches before we did anything

Let’s take a look at our example. To display a promotional product, a prefab of the following type is created:

As you can see, this fairly simple prefab requires quite a lot of batches to render: 26 (these statistics also include one batch from the camera that draws the background). But the picture becomes much worse when a second copy of the same prefab is added:

The number of batches doubled, which means that batching between identical entities was completely broken! This is the price we had to pay for using the standard Mask Unity component, which we needed for the diagonal stripes in the background. It didn’t work the way we needed it to, but we also didn’t know the exact reason why.

This is what that looks like within the hierarchy:

Let’s highlight the objects masks have been applied to:

  • new-back — limits the rendering of the image to the boundaries of the prefab
  • angle-glow — creates a diagonal ribbon by changing the data in its Transform component

It’s worth noting that the gradient on the ribbons is achieved using a tinted grayscale texture: in this way, we can implement any color and any simple shape without the use of additional textures.

Unity can combine the drawing of multiple elements into a single draw operation (a batch or draw call). But this is only possible if the same textures and the same material are used.

However, here the problem with using a Mask component that I mentioned before becomes clearer: it completely breaks batching. Moreover, the component itself adds two batches with special shader settings: one before and one after drawing the sprites affected by the mask. Because of this, it becomes impossible to batch sprites within the same prefab and between adjacent prefabs.

Thus, it takes ten batches to draw just four sprites. On top of that, adjacent prefabs cannot be batched together, and considering how poorly optimized the prefabs were even disregarding the masks, the number of batches in a real situation runs into the hundreds.

How we fixed our broken batching

We wanted to keep the visuals completely intact while, of course, fixing the broken batching. The standard Mask component didn’t work the way we needed, so we created our own masks for the UI. (At the same time, the solution needed to be universal and not require serious resources.)

We merged the two objects (the mask and the image) together: the idea is to draw the image with the mask already applied. Therefore, we needed to create a new material and write a shader for it in which the shape of the mask would be calculated:

CGPROGRAM
#pragma vertex vert
#pragma fragment frag

#include "UnityCG.cginc"

#pragma multi_compile __ UNITY_UI_ALPHACLIP

struct appdata_t
{
    float4 vertex   : POSITION;
    float4 color    : COLOR;
    float2 texcoord : TEXCOORD0;
};

struct v2f
{
    float4 vertex : SV_POSITION;
    fixed4 color  : COLOR;

    float2 uv : TEXCOORD0;
};

fixed4 _Color;
fixed4 _TextureSampleAdd;

v2f vert(appdata_t IN)
{
    v2f OUT;
    OUT.vertex = UnityObjectToClipPos(IN.vertex);
    OUT.color = IN.color * _Color;
    OUT.uv = IN.texcoord;
    return OUT;
}

sampler2D _MainTex;
fixed4 _MainTex_ST;
sampler2D _AlphaTex;
fixed4 _AlphaTex_ST;

fixed4 frag(v2f IN) : SV_Target
{
    float4 color = (tex2D(_MainTex, IN.uv * _MainTex_ST.xy + _MainTex_ST.zw) + _TextureSampleAdd) * IN.color;
    const float mask_alpha = (tex2D(_AlphaTex, IN.uv * _AlphaTex_ST.xy + _AlphaTex_ST.zw) + _TextureSampleAdd).a;
    color.a *= mask_alpha;
    return color;
}
ENDCG
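
The snippet above is only the CGPROGRAM block; to compile it as a .shader file it needs the usual ShaderLab wrapper around it. Here is a minimal sketch of that wrapper, modeled on Unity’s default UI shader (the shader name and the exact render-state settings are our assumption, not taken from the project):

Shader "UI/MaskedImage" // hypothetical name
{
    Properties
    {
        _MainTex ("Sprite Texture", 2D) = "white" {}
        _AlphaTex ("Mask Texture", 2D) = "white" {}
        _Color ("Tint", Color) = (1, 1, 1, 1)
    }

    SubShader
    {
        Tags { "Queue"="Transparent" "RenderType"="Transparent" "IgnoreProjector"="True" "CanUseSpriteAtlas"="True" }

        Cull Off
        Lighting Off
        ZWrite Off
        ZTest [unity_GUIZTestMode]
        Blend SrcAlpha OneMinusSrcAlpha

        Pass
        {
            // The CGPROGRAM ... ENDCG block shown above goes here.
        }
    }
}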

A shader like this can help us eliminate our batching woes, but it still has one significant problem: two different images can only share the same mask texture if the masked shape is exactly the same. In our case, the thickness of the ribbons differs, which is why we had to use different textures. And since the ribbons had to be cut diagonally, these technical textures needed a high resolution, which takes up memory. (Otherwise, anti-aliasing problems would arise and the edges would become jagged.)

Therefore, we continued looking for a solution and came up with a new way to define the shape of the mask. To describe it, we need to recall how Unity draws an image.

Image is a component that takes data from the RectTransform of the GameObject it’s attached to. The RectTransform gives us four vertices, each with a standard UV coordinate: [(0, 0), (1, 0), (0, 1), (1, 1)]. In code, we can change these coordinates, and we can also use other sets of UV coordinates: regular meshes can have up to eight UV sets, but Unity UI only supports up to four. So we thought: why don’t we use one of the other channels to define the shape of the mask? No sooner said than done.

First of all, we needed to make sure our Canvas had the additional UV channel enabled (the TexCoord1 flag under Additional Shader Channels on the Canvas component):

Now, we needed to extend the functionality of Image so that it could read data from this channel and write it into the mesh, where the shader will then read it:

using UnityEngine;
using UnityEngine.UI;

public class ImageWithCustomUV2 : Image
{
    // The four UV2 values, one per quad vertex: bottom-left, top-left, top-right, bottom-right.
    [SerializeField] private Vector2[] _uvs2;

    protected override void Start()
    {
        base.Start();
        // Make sure the Canvas actually writes the TexCoord1 channel into the UI mesh.
        if (!canvas.additionalShaderChannels.HasFlag(AdditionalCanvasShaderChannels.TexCoord1))
        {
            canvas.additionalShaderChannels |= AdditionalCanvasShaderChannels.TexCoord1;
        }
    }

    protected override void OnPopulateMesh(VertexHelper vh)
    {
        base.OnPopulateMesh(vh);
        if (_uvs2?.Length != 4)
        {
            return;
        }

        // Copy the custom UV2 values into the generated quad (uv1 maps to TEXCOORD1 in the shader).
        var vertex = new UIVertex();
        for (var i = 0; i < 4; ++i)
        {
            vh.PopulateUIVertex(ref vertex, i);
            vertex.uv1 = _uvs2[i];
            vh.SetUIVertex(vertex, i);
        }
    }
}
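
If the mask quad ever needs to change at runtime, a small setter could be added to this class. The SetUV2 method below is a hypothetical addition of ours, not part of the original component:

// Hypothetical addition (not from the original component): place inside ImageWithCustomUV2.
public void SetUV2(Vector2 bottomLeft, Vector2 topLeft, Vector2 topRight, Vector2 bottomRight)
{
    _uvs2 = new[] { bottomLeft, topLeft, topRight, bottomRight };
    SetVerticesDirty(); // forces OnPopulateMesh to run on the next canvas rebuild
}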

To simplify the coordinate settings, we wrote a custom inspector.

Here’s the code for that inspector:

using UnityEditor;
using UnityEditor.UI;
using UnityEngine;

[CustomEditor(typeof(ImageWithCustomUV2))]
public class ImageWithCustomUV2Inspector : ImageEditor
{
    private readonly string[] _options = {"Custom", "Rectangle"};
    private readonly GUIContent _blLabel = new GUIContent("Bottom left");
    private readonly GUIContent _brLabel = new GUIContent("Bottom right");
    private readonly GUIContent _tlLabel = new GUIContent("Top left");
    private readonly GUIContent _trLabel = new GUIContent("Top right");
    private bool _foldout = true;
    private int _selectedOption = -1;

    public override void OnInspectorGUI()
    {
        base.OnInspectorGUI();
        var prop = serializedObject.FindProperty("_uvs2");
        if (prop.arraySize != 4)
        {
            ResetUVs(prop);
        }

        // EditorDrawUtilities is a project helper; any foldout-drawing code works here.
        _foldout = EditorDrawUtilities.DrawFoldout(_foldout, "UV2");
        if (_foldout)
        {
            EditorGUI.indentLevel++;
            DrawUVs(prop);
            EditorGUI.indentLevel--;
        }

        serializedObject.ApplyModifiedProperties();
    }

    private void DrawUVs(SerializedProperty prop)
    {
        if (_selectedOption < 0)
        {
            CheckSelectedOption(prop);
        }

        _selectedOption = GUILayout.Toolbar(_selectedOption, _options);
        switch (_selectedOption)
        {
            case 1: // rect
                DrawRectOption(prop);
                break;
            default: // custom
                DrawCustomOption(prop);
                break;
        }
    }

    // Detect whether the stored UV2 corners already form an axis-aligned rectangle.
    private void CheckSelectedOption(SerializedProperty prop)
    {
        var bl = prop.GetArrayElementAtIndex(0).vector2Value;
        var br = prop.GetArrayElementAtIndex(3).vector2Value;
        var tl = prop.GetArrayElementAtIndex(1).vector2Value;
        var tr = prop.GetArrayElementAtIndex(2).vector2Value;
        if (bl.x == tl.x && bl.y == br.y && tr.x == br.x && tr.y == tl.y)
        {
            _selectedOption = 1;
        }
        else
        {
            _selectedOption = 0;
        }
    }

    private void DrawCustomOption(SerializedProperty prop)
    {
        var w = EditorGUIUtility.labelWidth;
        EditorGUIUtility.labelWidth = 100;
        EditorGUILayout.BeginHorizontal();
        DrawVector2Element(prop, 1, _tlLabel);
        DrawVector2Element(prop, 2, _trLabel);
        EditorGUILayout.EndHorizontal();
        EditorGUILayout.BeginHorizontal();
        DrawVector2Element(prop, 0, _blLabel);
        DrawVector2Element(prop, 3, _brLabel);
        EditorGUILayout.EndHorizontal();
        EditorGUIUtility.labelWidth = w;
    }

    private void DrawRectOption(SerializedProperty prop)
    {
        var w = EditorGUIUtility.labelWidth;
        EditorGUIUtility.labelWidth = 100;

        var bl = prop.GetArrayElementAtIndex(0).vector2Value;
        var tr = prop.GetArrayElementAtIndex(2).vector2Value;

        var min = bl;
        var max = tr;
        EditorGUILayout.BeginHorizontal();
        min = EditorGUILayout.Vector2Field("min", min);
        max = EditorGUILayout.Vector2Field("max", max);
        EditorGUILayout.EndHorizontal();

        // Rebuild all four corners when either the min or max corner changes.
        if (min != bl || max != tr)
        {
            prop.ClearArray();
            AddVector2(prop, min);
            AddVector2(prop, new Vector2(min.x, max.y));
            AddVector2(prop, max);
            AddVector2(prop, new Vector2(max.x, min.y));
        }

        EditorGUIUtility.labelWidth = w;
    }

    private void DrawVector2Element(SerializedProperty array, int index, GUIContent label)
    {
        var prop = array.GetArrayElementAtIndex(index);
        EditorGUILayout.PropertyField(prop, label);
    }

    // Default UV2 values: the full [0..1] quad in the order bottom-left, top-left, top-right, bottom-right.
    private void ResetUVs(SerializedProperty prop)
    {
        prop.ClearArray();
        AddVector2(prop, Vector2.zero);
        AddVector2(prop, Vector2.up);
        AddVector2(prop, Vector2.one);
        AddVector2(prop, Vector2.right);
    }

    private void AddVector2(SerializedProperty array, Vector2 value)
    {
        var id = array.arraySize;
        array.InsertArrayElementAtIndex(id);
        var prop = array.GetArrayElementAtIndex(id);
        prop.vector2Value = value;
    }
}

Next, in the shader code, only the UV coordinates used to sample the mask texture were changed:

CGPROGRAM
#pragma vertex vert
#pragma fragment frag

#include "UnityCG.cginc"

#pragma multi_compile __ UNITY_UI_ALPHACLIP

struct appdata_t
{
    float4 vertex    : POSITION;
    float4 color     : COLOR;
    float2 texcoord  : TEXCOORD0;
    float2 texcoord1 : TEXCOORD1;
};

struct v2f
{
    float4 vertex : SV_POSITION;
    fixed4 color  : COLOR;

    float2 uv  : TEXCOORD0;
    float2 uv1 : TEXCOORD1;
};

fixed4 _Color;
fixed4 _TextureSampleAdd;

v2f vert(appdata_t IN)
{
    v2f OUT;
    OUT.vertex = UnityObjectToClipPos(IN.vertex);
    OUT.color = IN.color * _Color;
    OUT.uv = IN.texcoord;
    OUT.uv1 = IN.texcoord1;
    return OUT;
}

sampler2D _MainTex;
fixed4 _MainTex_ST;
sampler2D _AlphaTex;
fixed4 _AlphaTex_ST;

fixed4 frag(v2f IN) : SV_Target
{
    float4 color = (tex2D(_MainTex, IN.uv * _MainTex_ST.xy + _MainTex_ST.zw) + _TextureSampleAdd) * IN.color;
    const float mask_alpha = (tex2D(_AlphaTex, IN.uv1 * _AlphaTex_ST.xy + _AlphaTex_ST.zw) + _TextureSampleAdd).a;
    color.a *= mask_alpha;
    return color;
}
ENDCG

With that, we could use a simple texture as a mask:

Note that the opaque part here is a square that takes up a quarter of the image area, surrounded by a transparent frame. A texture like this lets us set up almost any quadrangular mask shape.
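
As a worked example of how the UV2 quad selects the shape (assuming the opaque square sits in the center of the mask texture, roughly from (0.25, 0.25) to (0.75, 0.75) in UV space; the exact layout isn’t specified above): keeping the quad inside the opaque region shows the whole sprite, while shearing it makes the opaque/transparent border cross the sprite diagonally, which is how the ribbons are cut. Illustrative corner values, in the component’s bottom-left, top-left, top-right, bottom-right order:

using UnityEngine;

// Illustrative UV2 corner values only; the mask texture layout is an assumption.
// Order matches the component: bottom-left, top-left, top-right, bottom-right.
public static class MaskUv2Examples
{
    // Fully visible sprite: the sampled quad stays inside the opaque square.
    public static readonly Vector2[] FullyVisible =
    {
        new Vector2(0.30f, 0.30f),
        new Vector2(0.30f, 0.70f),
        new Vector2(0.70f, 0.70f),
        new Vector2(0.70f, 0.30f)
    };

    // Diagonal ribbon: the bottom edge is shifted sideways, so the sampled quad is sheared
    // and the opaque/transparent border crosses the sprite diagonally.
    public static readonly Vector2[] DiagonalRibbon =
    {
        new Vector2(0.55f, 0.30f),
        new Vector2(0.30f, 0.70f),
        new Vector2(0.70f, 0.70f),
        new Vector2(0.95f, 0.30f)
    };
}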

After adjusting the mask shapes, we were able to keep the visuals intact and simplify the hierarchy, without broken batching and with minimal additional data: a single 256x256 mask texture in Alpha8 format, which takes up only 64 KB of memory (256 × 256 pixels × 1 byte per pixel).
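
For completeness, here is a rough sketch of how the pieces could be wired together from code (in the project this is done once in the inspector; the component and field names here are just placeholders). The important part is that every masked image shares the same material and the same mask texture, which is what allows them to keep batching:

using UnityEngine;

// Rough wiring sketch, not the project's actual setup code: all masked images reuse
// one shared material (built on the masked UI shader) and one shared Alpha8 mask texture.
public class MaskedOfferCard : MonoBehaviour
{
    [SerializeField] private Material sharedMaskedMaterial; // material using the shader above, with _AlphaTex already assigned
    [SerializeField] private ImageWithCustomUV2 maskedImage;

    private void Awake()
    {
        // Assigning the shared material keeps every card drawing with identical state,
        // which is exactly what lets Unity batch them together.
        maskedImage.material = sharedMaskedMaterial;
    }
}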

And most importantly, batching between different prefabs has been preserved:

Batching it all up

We managed to simplify the object hierarchy in Unity and cut the number of batches by more than half: from 130+ to around 60. Additionally, the FPS on this screen increased by about 5–10%. The price we paid was a slightly more complicated component setup.

Before | After

In the end, we managed to implement our plans just as we had wanted: screen performance increased, the layout did not change, and the object hierarchy became simpler and more understandable. We also managed to create a universal tool that can be used with other screens and interface elements.

The additional resources required turned out to be minimal and, again, universal, which is nice. What wasn’t so nice? Setting up a specific mask has become a little more involved, but once you understand how the mechanism works, it isn’t complicated.
