Even ‘Flat’ isn’t Really Flat

Creating a Unified Model of On-screen Depth


Over the past two decades, interfaces have evolved about as much as the screens we view them on. Trends have come and gone, and will surely return one day, as we decide that conveying a sense of visual depth requires every tool available, in just the right amounts and combinations.

We’ve seen bold gradients on buttons come and go; we’ve seen skeuomorphic realism like brushed-metal switches and stitched faux leather, paired with conspicuously dark shadows, removed completely in a single announcement; and we’ve seen a wide variety of transparency and translucency admired and then despised.

It’s time to take a stand and put every technique and effect in its place. It’s time to decide which interface elements deserve what level of depth, and exactly which tools best achieve it.

Where perceived depth comes from

Before we get ahead of ourselves, let’s take a step back and look at what creates the impression of depth on a screen as thin as paper. It’s pretty simple really, and it comes from — you guessed it — physical effects we see everywhere around us in life.

Firstly, we perceive depth based on the shape of the object in question, and the way its appearance changes as we move — which we collectively call perspective. If it’s a cube, for example, the sides and edges further away from us appear smaller. Also, as we move around, the further away the object is from us (the greater the depth), the less it moves relative to closer objects.

Secondly, the way light interacts with surfaces affects how we perceive their depth. We’ve been conditioned to expect the light striking surfaces on a screen to come from a source above and to the left of the screen. This means we expect surfaces facing the upper and left sides of the screen to be lighter than those facing the opposite directions, since they face the light. And for a surface to face any direction at all, rather than lying ‘flat’ on the screen, one part of it must be higher than another.

Lastly, we perceive the depth of various objects based on how they appear relative to one another. If one object (A) appears to overlap another (B), we assume A is above B — rather than being below it with B having an area ‘cut out’ of it in the exact shape of A. Also, the implicit light discussed above plays a role; rather than simply coming from an upper-left direction, we imagine it to come from a source that’s a bit higher than the surface of our screen, making higher objects cast shadows on lower objects.

The use of layering to demonstrate depth in Google’s latest Design specifications @ http://www.google.com/design/spec/layout/layout-principles.html#layout-principles-dimensionality

Creating pseudo-depth

Alright, so we understand what makes objects appear to have depth. It’s time to look objectively at the tools available for giving our interfaces that appearance. We’ll start with the tools that give objects the appearance of depth without the behaviour associated with it.

Almost as long as we’ve had screens, we’ve had gradients. A gradient, in its simplest sense, is a change of colour from one side of a surface to another. Gradients can be changes in hue (from red to green), changes in saturation (from red to grey), changes in luminosity (from light to dark), or any combination of these. Luminosity gradients in particular give us a sense of depth; they’ve been used on buttons for as long as there have been buttons to click or tap.
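As a minimal sketch in CSS (the class name and colours here are my own assumptions), a subtle top-to-bottom luminosity gradient is enough to make a button read as a raised surface catching light from above:

    .button {
      /* lighter at the top, darker at the bottom: implies a surface
         angled towards a light-source above the screen */
      background: linear-gradient(to bottom, #fafafa, #d4d4d4);
      border: 1px solid #b0b0b0;
      border-radius: 4px;
      padding: 8px 16px;
    }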

Overlap is the second tool we can use, and it works without the help of any other effect. A simple example is the header and footer of a simple website; because we imagine the webpage’s body to extend from the top of the browser to the bottom, the header and footer appear to overlap that body. The same is true for the sidebars of horizontally-centred websites; because we expect the body to be horizontally centred, the sidebar appears to overlap it.
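A rough CSS sketch of that implied overlap (class names are hypothetical): a full-width header above a centred column reads as a layer sitting on top of the page body, even though nothing actually overlaps:

    .page-header {
      width: 100%;      /* spans the full viewport, so it seems to lie across the body */
      background: #2b2b2b;
    }
    .page-body {
      max-width: 640px;
      margin: 0 auto;   /* the centred column we read as the layer beneath */
      background: #ffffff;
    }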

Shadows have been discussed already, and they’re a simple tool we’ve been able to use for ages. Shadows can be cast by surfaces on other surfaces (inset shadows), or by objects on other objects (drop shadows). More subtly, though this isn’t easy to implement with today’s tools, realistic shadows are sharper closer to the object casting them: essentially, perspective applied to shadows.
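In CSS, both kinds map onto box-shadow. As a sketch (class names and values are my own), stacking two drop shadows, one sharp and tight and one soft and wide, also roughly approximates that sharper-near, softer-far behaviour of realistic shadows:

    .well {
      /* inset: the surface appears pressed below its surroundings */
      box-shadow: inset 0 2px 4px rgba(0, 0, 0, 0.3);
    }
    .card {
      /* two stacked drop shadows: sharp near the edge, soft further out */
      box-shadow:
        0 1px 2px rgba(0, 0, 0, 0.25),
        0 8px 24px rgba(0, 0, 0, 0.15);
    }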

Translucency and transparency allow one object to appear to be above another because we can see the lower object through the upper one. Strictly speaking, a transparent object lets us see what’s behind it clearly, while a translucent one lets light pass through but diffuses it, blurring whatever lies behind. For instance, if we can see objects ‘below’ a whitish surface, but only as a blur, then that surface is partly transparent and partly translucent.
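Both effects can be sketched in CSS (the class name is hypothetical, and backdrop-filter support still varies between browsers): an rgba() background supplies the transparency, while the backdrop blur supplies the translucency of a frosted-glass panel:

    .frosted-panel {
      background: rgba(255, 255, 255, 0.6);  /* partly transparent white */
      backdrop-filter: blur(10px);           /* diffuses whatever lies beneath */
    }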

We used to describe objects on a screen as ‘3D’ when they were drawn with the right shape for perspective, as explained above. Our ability to draw such shapes has mostly been limited by the software we use, and by our own skill and imagination. Making objects change position and shape as the viewer moves, however, has been far more limited, because it requires hardware that can detect our movements and software complex enough to respond to them. The more accurately we can determine how the viewer is moving, the more convincing the perspective we can create. What has become common is simpler: shifting ‘deeper’ objects by smaller amounts in one or two directions as the viewer’s position changes, which we call parallax.
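Scroll position is the easiest ‘movement’ to detect, and CSS alone can turn it into parallax. This sketch follows the well-known perspective/translateZ technique (the class names are hypothetical): layers pushed further back scroll more slowly than the foreground, so they read as deeper:

    .parallax-viewport {
      height: 100vh;
      overflow-y: scroll;
      perspective: 1px;       /* establishes the 3D viewing context */
    }
    .parallax-layer-deep {
      /* pushed back, so it scrolls more slowly;
         scale(2) restores its apparent size */
      transform: translateZ(-1px) scale(2);
    }
    .parallax-layer-front {
      transform: translateZ(0);  /* stays at screen depth */
    }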

Specifying explicit depth

Having used all these tools in our toolbox, we come to a realisation: no matter how contrasting the gradient, how dark the shadow, how translucent the surface, or how large the object, these effects that give the appearance of depth mean nothing if another object simply covers them. An object with a greater explicit ‘depth’ value (or a lesser one, depending on the context) will cover the others, and so it functions as if it were higher, whatever its appearance suggests. If we’re going to give the appearance of depth, it makes sense to give the functionality of depth at the same time.

To do this, we’re given a property called z-index. We can understand it by picturing a graph on a piece of paper, with two directions: up (called the y-axis) and right (called the x-axis). If the graph had a third dimension coming out of the paper and towards us, that would be the z-axis. If one object (A) has a higher z-index value than another (B), object A will sit closer to us than B; not just appearing to have a different depth, but acting like it too.
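Here’s a minimal sketch of that behaviour in CSS (class names and values are my own; note that z-index only applies to positioned elements). The dialog both acts higher, by covering the card, and looks higher, thanks to its shadow:

    .card {
      position: relative;
      z-index: 1;
    }
    .dialog {
      position: absolute;
      z-index: 10;                                /* covers .card regardless of source order */
      box-shadow: 0 8px 24px rgba(0, 0, 0, 0.3);  /* and looks like it does */
    }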

Pairing the concept of z-index with effects like shadows creates a strong sense of depth @ http://www.google.com/design/spec/layout/layout-principles.html#layout-principles-dimensionality

It should now make sense for us to talk about depth in two ways: in appearance, and in function. If we want to give an object a drop shadow, we need to ask: “where does it sit on the z-axis?” If we want one object covering another, we need to ask both “what is the difference between their z-index values?” and “what effects will give each the appearance of its depth?” We need to think of every object and surface in our interfaces as having a specific depth, and apply the relevant tools accordingly. We need to have a unified approach to using depth.
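One way to make that unified approach concrete, sketched here in CSS with illustrative class names and values (loosely inspired by the elevation levels in Google’s Design specifications linked above), is a shared depth scale: each step pairs a stacking value with a matching shadow, so function and appearance can never drift apart:

    .depth-1 { position: relative; z-index: 10; box-shadow: 0 1px 3px rgba(0, 0, 0, 0.2); }
    .depth-2 { position: relative; z-index: 20; box-shadow: 0 4px 8px rgba(0, 0, 0, 0.2); }
    .depth-3 { position: relative; z-index: 30; box-shadow: 0 10px 20px rgba(0, 0, 0, 0.2); }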