While playing computer games, have you ever seen images from multiple frames stitched into a single frame? Have you ever felt that a game is not as responsive as it should be, despite plenty of RAM and CPU capacity? This post has the answer to these questions.

VSync, or Vertical Sync — never heard of it, or did you?

VSync is a fascinating subject in itself. Many hardcore gamers will have heard about it. I am not one of those gamers; I saw VSync for the first time when I was reviewing App Vitals in the Google Play Console. Then I started to read about it. I have tried to explain the whole idea behind it (which is independent of Android) before jumping into the Android context.

Vishal Ratna
9 min read · Sep 10, 2020
Photo by LaserWorld LaserBeam on Unsplash

This post will discuss the fundamentals needed to understand VSync and how it is relevant to Android developers, particularly for optimizations. Someone smart once said — “Knowing beyond abstractions never does any harm.” That person is me. Ahh! I know it’s a bad joke. Let’s start.

Before we get started on this, we need to understand two things correctly.

1. Frame Rate

First, what is a frame, and what determines the frame rate? A frame is a single still image, which is then combined in a rapid slideshow with other still images, each one slightly different, to achieve the illusion of natural motion. The frame rate is how many of these images are displayed in one second.

Once these frames are drawn, they are passed to the display hardware through ports (these can also be a limiting factor, bringing down the butteriness of the UI, but we will not discuss that now). This brings us to another concept,

2. Refresh Rate

Once the images are sent to the hardware, how fast is it able to refresh the screen with the new set of frames received? So, the refresh rate is the rate at which the display hardware refreshes the screen.

As these terms are related to two different pieces of hardware, it is very likely that the two rates differ. To understand this clearly, let us examine each case.

Frame rate > Refresh Rate

Logically, it means that the GPU is producing frames at a higher rate than the display hardware can consume them. When the monitor displays anything on the screen, it reads the frame from an area of memory called the framebuffer.

When the display hardware has not finished drawing one frame and the GPU overwrites the framebuffer with an upcoming frame, we see part of the first frame and part of the second frame on the screen. This is called screen tearing.

Credits: https://appuals.com/how-to-fix-screen-tearing/

We have all faced this problem while playing games like Counter-Strike, where we move our gun and a tear like the one in the image above appears. Now you know why it happens, and that the game’s high frame rate is the culprit. The diagram below shows what happens behind the scenes.


We can see that a new frame is pushed while another frame is still in the process of being rendered on the screen.

Frame rate < Refresh Rate

Usually, you will not find this happening. But let’s assume it does: the screen is then going to show the same frame over multiple refreshes. From a visual perspective, I do not think the user is going to see any difference. But modern display systems are adaptive in nature: when they see that incoming frames arrive at a lower frequency, they reduce the refresh rate accordingly.

Now that we have understood the fundamentals, let us try to understand how VSync works and how it tries to solve the problem.

Whenever we encounter a rate-limiting problem, the obvious solutions that come to mind are backpressure, buffering, and dropping. We will examine each possibility now.

A. Can we drop frames? NO. Dropping is not an option, as we would miss what the game engine is rendering.

B. Can we use backpressure to limit the frames generated by the GPU? NO, we cannot. The frame rate should not be governed by the refresh rate. Solution discarded.

C. The last solution is buffering. So yes, VSync solves this problem using a mechanism called Double Buffering. [We will talk about it later in this post. Hold on!]

Vertical Synchronization or VSync

It is a mechanism that synchronizes the frame rate with the refresh rate of the display hardware. The diagram below shows what we want to achieve here.


It enforces a rule: the GPU will not copy any new frames to the framebuffer before the current refresh cycle is completed.

Ok, so what is represented as a “buffer” in the above diagram is actually a double buffer (sometimes there are three, called triple buffering, used to eliminate the pain points of double buffering). So, what is the purpose of the two buffers?

One buffer is used by the GPU to write new frames into. It is called the back buffer, and the framebuffer used previously is now called the front buffer. So the rule is:

“Frames will be moved from the back buffer to the front buffer only when the current refresh cycle of the display hardware is completed.” This makes sure that you see a smooth UI.
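That rule can be sketched as a toy simulation in plain Java. This is purely illustrative — in reality the swap happens in the display driver and compositor, not in app code — but it captures the ordering: the GPU only ever touches the back buffer, and the front buffer changes only on a VSync.

```java
// Illustrative simulation of the double-buffering rule: the "GPU" writes into
// a back buffer, and its contents reach the front buffer only on a vsync,
// i.e., when the display's current refresh cycle is complete.
public class DoubleBufferDemo {
    static int[] frontBuffer = {0}; // what the "display" scans out
    static int[] backBuffer  = {0}; // what the "GPU" renders into

    // GPU side: always renders into the back buffer, never the front.
    static void renderFrame(int frameId) {
        backBuffer[0] = frameId;
    }

    // Display side: on vsync, the finished back buffer moves to the front.
    static void onVsync() {
        System.arraycopy(backBuffer, 0, frontBuffer, 0, backBuffer.length);
    }

    public static void main(String[] args) {
        renderFrame(1); // GPU finishes frame 1
        // The screen still shows the old frame until the vsync arrives.
        System.out.println("before vsync, front = " + frontBuffer[0]);
        onVsync();      // refresh cycle complete: now the new frame is visible
        System.out.println("after vsync, front = " + frontBuffer[0]);
    }
}
```

Tearing is impossible here by construction: the display never reads a buffer the GPU is writing to mid-refresh.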

Now, what can go wrong? There is still a catch while playing games that demand quick responses to onscreen events, because the GPU already has two or three frames rendered and stored in the buffers beyond what you’re seeing onscreen at any given moment.

That means that while the GPU is rendering images in direct response to your actions, there is a minuscule delay (measured in milliseconds) between when you perform those actions and when they actually appear onscreen. We have all faced this, right? We do something, and the actual response to that action appears after some time. VSync should take the blame! Let’s solve this using Ping-Pong Buffering.

Ping-Pong Buffering — It’s good to know about it.

We saw that there is a minuscule amount of delay using VSync while playing immersive games. To reduce that delay, hardware designers came up with a new concept of Ping-Pong Buffering.

Ping-pong buffering doesn’t have the same input lag. Rather than a straight frame buffer that just backs up excess frames and feeds them to the monitor one at a time, this method of vertical synchronization renders multiple frames in video memory at the same time and flips between them every time your monitor requests a new frame. This kind of “page flipping” eliminates the lag of copying a frame from system memory into video memory, which means there is less input lag. It can be compared to prefetching: fetch the data in advance and show it at the right time.
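The key idea behind page flipping is that no pixel data is copied at all: the display simply switches which buffer it scans out. A toy sketch (again plain Java, illustrative only — real flips are done by the display controller):

```java
// Sketch of "page flipping": instead of copying the back buffer into the
// front buffer on every vsync, we just swap which buffer plays which role.
// The swap is an O(1) reference exchange, not a per-pixel copy.
public class PageFlipDemo {
    static int[] bufferA = new int[4];
    static int[] bufferB = new int[4];
    static int[] front = bufferA; // currently scanned out by the "display"
    static int[] back  = bufferB; // currently rendered into by the "GPU"

    static void flip() {
        int[] tmp = front; // swap the roles; no data moves anywhere
        front = back;
        back = tmp;
    }

    public static void main(String[] args) {
        back[0] = 42; // GPU renders a frame into the back buffer
        flip();       // vsync: the display now scans the freshly rendered buffer
        System.out.println("front pixel = " + front[0]);
    }
}
```

Compare this with the `System.arraycopy` in the earlier double-buffer sketch: the copy cost (and the lag it adds) disappears.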

There is one more way of VSyncing, i.e., Triple Buffering; it is a little more complex than the other two. We can skip that for now. (It is used internally in Android.)

Now that we understand the whole story behind VSync, let us start analyzing how it comes into the picture in Android and app optimization.

VSync — The Android Side of Story

Vertical synchronization was introduced in Android in Jelly Bean (4.1) as a part of Project Butter to improve UI performance and responsiveness. As per the official “about” page of Jelly Bean —

“Everything runs in lockstep against a 16 millisecond vsync heartbeat — application rendering, touch events, screen composition, and display refresh — so frames don’t get ahead or behind.”

Let’s see what problem haunted Android before 4.1, which Project Butter attempted to solve.

Before VSync was introduced, there was no synchronization between input handling, animation, and drawing. Input was handled as and when it came; when there was an animation or a change in a view, it was handled then and there. This resulted in too many CPU operations, and it was difficult to handle input events while an animation was going on.

As there was no sync between these three operations, redraws happened on input handling, on animation, and on changes in the view as well. This created a problem: it exhausted CPU cycles, and input processing was not smooth.

Post-Android 4.1 — Project Butter & Inception of VSync

VSync is an event posted periodically by the Android Linux kernel at a fixed interval, on which input handling, animation, and window drawing happen synchronously in a standard order.

VSync signals are delivered at an interval of 16.66 ms, which equates to 60 fps. Input handling, animation, and window drawing happen on the arrival of this signal. If input events arrive before the VSync signal, they are queued. After input handling, animation is handled, and then UI redraws are done.
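The 16.66 ms figure is nothing mysterious: it is simply the period of a 60 Hz refresh rate. A quick sketch of the arithmetic (plain Java, nothing Android-specific; the 90 Hz and 120 Hz rates are just examples of displays common on modern phones):

```java
// The vsync interval is the reciprocal of the refresh rate. At 60 Hz the
// app gets roughly 16.66 ms per frame; at higher refresh rates the budget
// shrinks accordingly.
public class FrameBudget {
    static double frameIntervalMs(double refreshRateHz) {
        return 1000.0 / refreshRateHz;
    }

    public static void main(String[] args) {
        System.out.printf("60 Hz  -> %.2f ms per frame%n", frameIntervalMs(60));
        System.out.printf("90 Hz  -> %.2f ms per frame%n", frameIntervalMs(90));
        System.out.printf("120 Hz -> %.2f ms per frame%n", frameIntervalMs(120));
    }
}
```

This is why the “16 ms budget” advice is tied to 60 Hz displays; on faster panels the budget is even tighter.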

Now we know where this “16 ms frame” term came from and why Android developers talk about it when looking at app optimization. This signal is received by the Choreographer class, which some of you might have heard of. We will cover it in detail in some other post if required.

Choreographer

For now, it is just an abstraction that receives timing pulses (VSync) from the lower subsystem and delivers commands to the higher subsystems to render the upcoming frames.

Each thread that has a Looper will have a separate Choreographer instance. Consequently, each HandlerThread instance will also have its own Choreographer.

VSync and app performance

We know that apps are locked to 16.66 ms timing pulses, so what if your app does something that lasts more than 16 ms? You guessed it right: it will miss VSync signals, which in turn leads to multiple performance-related issues. The Choreographer class has a callback, onVsync(), which handles the VSync signals.

As good Android citizens, our apps should finish their rendering work within 16.66 ms and let the main thread do the 3 things it is meant for:

  1. Input Handling
  2. Drawing/Redrawing UI
  3. Running smooth animations

The problems can be app-specific, but some of the operations that could lead your app to miss the timing pulses are:

Loading heavy classes eagerly at app startup, even when they are not required at that time. Use lazy initialisation there.

Encoding/decoding bitmaps on the main thread.

Loading big SharedPreferences on the main thread. Android does a good amount of caching here, but devs can still mess things up.

Allocating unnecessary objects in repetitive operations, such as in onDraw() of a View.

Inflating XML with a deep view hierarchy, which costs the main thread a lot. Use RelativeLayout or ConstraintLayout to flatten it.
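For the first point, a common Java idiom for deferring heavy initialization is the lazy holder. The class names below are made up for illustration; the pattern itself is what matters — the expensive object is built on first use, not at startup:

```java
// Illustrative lazy initialization using the holder idiom. The hypothetical
// HeavyParser stands in for any class that is expensive to construct and not
// needed on the startup path.
public class ParserProvider {
    static class HeavyParser {
        HeavyParser() {
            // Imagine expensive setup here (loading resources, warming caches).
        }
        String parse(String s) {
            return s.trim();
        }
    }

    // The JVM initializes the nested Holder class (and thus the parser)
    // lazily and thread-safely, only on the first call to get().
    private static class Holder {
        static final HeavyParser INSTANCE = new HeavyParser();
    }

    static HeavyParser get() {
        return Holder.INSTANCE;
    }

    public static void main(String[] args) {
        // Startup does no parser work; the instance appears on first use.
        System.out.println(get().parse("  hello  "));
    }
}
```

In Kotlin the same effect is usually achieved with `by lazy { ... }`.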

If you miss the VSync, the Choreographer class keeps track of the last frame that arrived while the main thread was busy with heavy work, using an integer, mFrame. It gets overwritten every time a new frame comes in. Once the main thread is free, it renders that last frame as the first thing on the next VSync. Through the mFrame integer it calculates how many frames were missed while the main thread was busy. You may have seen the message below in Logcat sometimes. Yes, it comes from the Choreographer.

Let’s see how that works. There is a separate thread for the Choreographer that receives VSync signals. A callback fires on every VSync:

onVsync() — This is called every 16 ms from the Android kernel. It posts a message to the FrameHandler.

FrameHandler — A Handler class that takes care of posting the main-thread work in its handleMessage() callback. Something like below:

mFrameInfo.markInputHandlingStart();
doCallbacks(Choreographer.CALLBACK_INPUT, frameTimeNanos, frameIntervalNanos);

mFrameInfo.markAnimationsStart();
doCallbacks(Choreographer.CALLBACK_ANIMATION, frameTimeNanos, frameIntervalNanos);
doCallbacks(Choreographer.CALLBACK_INSETS_ANIMATION, frameTimeNanos, frameIntervalNanos);

mFrameInfo.markPerformTraversalsStart();
doCallbacks(Choreographer.CALLBACK_TRAVERSAL, frameTimeNanos, frameIntervalNanos);

So, the Choreographer can track whether the previous message it sent has completed or not. This way, when a new VSync signal is received, it can check how much time it took to process the previous message, calculate the number of skipped frames, and log an error. While the previous message is being processed, the Choreographer does not post any new messages.

Log.i(TAG, "Skipped " + skippedFrames + " frames!  "
+ "The application may be doing too much work on its main thread.");
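A simplified, stand-alone sketch of that bookkeeping: the real Choreographer divides the accumulated delay (“jitter”) past the scheduled VSync time by the frame interval to get the number of missed 16.66 ms slots. This version only mimics the idea, not the actual class:

```java
// Simplified model of how skipped frames are counted: if the main thread
// only got around to handling a vsync well after it was scheduled, each
// whole frame interval of delay is one skipped frame.
public class SkippedFrames {
    static final long FRAME_INTERVAL_NANOS = 1_000_000_000L / 60; // ~16.66 ms

    static long skippedFrames(long frameTimeNanos, long nowNanos) {
        long jitterNanos = nowNanos - frameTimeNanos;
        if (jitterNanos < FRAME_INTERVAL_NANOS) {
            return 0; // the work fit inside the frame budget
        }
        return jitterNanos / FRAME_INTERVAL_NANOS;
    }

    public static void main(String[] args) {
        long vsyncTime = 0;
        long handledAt = 50_000_000; // main thread was busy for ~50 ms
        System.out.println("Skipped " + skippedFrames(vsyncTime, handledAt)
                + " frames!");
    }
}
```

A 50 ms stall spans three full 16.66 ms intervals, which is exactly the number that would show up in the “Skipped N frames!” log line.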

There are various benchmarks against which we should be profiling:

  1. Input latency — 24ms
  2. Operations on the main thread — 8ms
  3. Bitmap upload to GPU [Using big resources] — 3.2ms
  4. Uploading draw commands to GPU — 12ms

This information is available if we look at the flame chart in the Android profiler. Hovering over the data shows the amount of time it took to execute on a particular thread, and we can also navigate to the respective block of code executing there and examine the problems.

Flame chart: Android profiler.

Hope this helps.


Vishal Ratna

Senior Android Engineer @ Microsoft. Fascinated by scale- and performance-based software design. A seeker, a learner. Loves cracking algorithmic problems.