Volatile — Visibility in JVM

What is volatile? The volatile modifier is used to indicate that a variable's value can be modified by multiple threads simultaneously.

Declaring a volatile Java variable means:

  • The value of this variable will never be cached thread-locally: all reads and writes will go straight to "main memory" (practically, that is not quite right, but it is what is written almost everywhere; we will see this with example code below)
  • Access to the variable acts as though it is enclosed in a synchronized block

Volatile comes with two major concepts: visibility and ordering.

We will cover visibility in this blog. Visibility in Java can be controlled by either synchronized or volatile.
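
As a quick illustration, here is a minimal sketch (the class and method names are illustrative, not from the original post) of the two ways to get visibility for a single flag:

public class Flags {
    // Option 1: a volatile field; every read and write is individually visible
    private volatile boolean volatileFlag;

    // Option 2: synchronized access; gives visibility plus mutual exclusion
    private boolean plainFlag;

    public synchronized void setPlainFlag(boolean value) {
        plainFlag = value;
    }

    public synchronized boolean getPlainFlag() {
        return plainFlag;
    }
}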

The main differences between synchronized and volatile are:

  • a primitive variable may be declared volatile (whereas you can’t synchronize on a primitive with synchronized);
  • an access to a volatile variable never blocks;
  • because accessing a volatile variable never holds a lock, it is not suitable for cases where we want to read-update-write as an atomic operation, unless we're prepared to "miss an update" (see the sketch after this list);
  • a volatile variable that is an object reference may be null (because you’re effectively synchronizing on the reference, not the actual object).

Attempting to synchronize on a null object will throw a NullPointerException.
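
To see why volatile alone does not make read-update-write safe, consider this sketch (illustrative, not from the original post): two threads increment a volatile counter, and updates are still lost, because counter++ is really a read, an add, and a write.

public class LostUpdate {

    private static volatile int counter = 0;

    public static void main(String[] args) throws InterruptedException {
        Runnable increment = () -> {
            for (int i = 0; i < 100_000; i++) {
                counter++; // read-modify-write: not atomic, even though volatile
            }
        };
        Thread t1 = new Thread(increment);
        Thread t2 = new Thread(increment);
        t1.start();
        t2.start();
        t1.join();
        t2.join();
        System.out.println(counter); // usually prints less than 200000
    }
}

Use synchronized (or java.util.concurrent.atomic.AtomicInteger) when the increment itself must be atomic.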

Non-Atomic Treatment of double and long

A single write to a non-volatile long or double value may be treated as two separate writes: one to each 32-bit half. This can result in a situation where a thread sees the first 32 bits of a 64-bit value from one write, and the second 32 bits from another write.

Writes and reads of volatile long and double values are always atomic as per Java Spec — https://docs.oracle.com/javase/specs/jls/se8/html/jls-17.html#jls-17.7
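
A minimal sketch of the guarantee (the field name is illustrative): declaring the 64-bit field volatile rules out a torn read even on a 32-bit JVM.

public class Telemetry {

    // Without volatile, a reader on a 32-bit JVM could observe the high
    // 32 bits of one write combined with the low 32 bits of another.
    private volatile long lastTimestamp;

    public void record(long timestamp) {
        lastTimestamp = timestamp; // a single atomic write, per JLS 17.7
    }

    public long read() {
        return lastTimestamp; // a single atomic read
    }
}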

The Java volatile Visibility Guarantee

The Java volatile keyword guarantees visibility of changes to variables across threads.

In a multithreaded application where the threads operate on non-volatile variables, each thread copies variables from main memory into a CPU cache while working on them, for performance reasons. If the computer contains more than one CPU, each thread may run on a different CPU, and therefore copy the variables into the cache of a different CPU.

Imagine a situation in which two or more threads have access to a shared object which contains a counter variable declared like this:

public class SharedObject {
    public int counter = 0;
}

Imagine, too, that only Thread 1 increments the counter variable, but both Thread 1 and Thread 2 may read the counter variable from time to time.

If the counter variable is not declared volatile, there is no guarantee about when its value is written from the CPU cache back to main memory. This means that the counter variable's value in the CPU cache may not be the same as in main memory.

The problem of threads not seeing the latest value of a variable, because it has not yet been written back to main memory by another thread, is called a "visibility" problem. The updates of one thread are not visible to other threads.
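
The fix for SharedObject is simply to declare the counter volatile, so that every write is flushed to main memory and every read goes back to main memory:

public class SharedObject {
    public volatile int counter = 0;
}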

Here is an example showing how a volatile variable helps stop a loop in a multi-threaded environment. Thread 1 executes the run() method, and thread 2 calls the tellMeToStop() method to break the while loop.

public class StoppableTask extends Thread {
    private volatile boolean pleaseStop;

    public void run() {
        while (!pleaseStop) {
            // do some stuff...
        }
    }

    public void tellMeToStop() {
        pleaseStop = true;
    }
}
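
A minimal driver for this (illustrative): thread 1 is the StoppableTask itself, and the main thread plays the role of thread 2.

public class Main {
    public static void main(String[] args) throws InterruptedException {
        StoppableTask task = new StoppableTask();
        task.start();        // thread 1 spins inside run()
        Thread.sleep(1000);  // let it run for a second
        task.tellMeToStop(); // thread 2 (main) sets the volatile flag
        task.join();         // returns promptly because the write is visible
    }
}

Without volatile on pleaseStop, the JIT would be free to hoist the flag read out of the loop, and thread 1 might never stop.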

Now let's see the performance impact of a volatile variable and how it really works. Remember what I said above about main-memory access: "that is not right at all practically".

A volatile variable is often described as always visible, as if every access went to main memory. That is not quite true: the value resides in a cache line of the core, i.e. the L1/L2 cache, and the cache-coherence protocol keeps it visible to other cores. More on L1/L2 caches: https://www.extremetech.com/extreme/188776-how-l1-and-l2-cpu-caches-work-and-why-theyre-an-essential-part-of-modern-chips

Let's assume a scenario where two threads (T1 and T2) access two different variables (call them c1 and c2) without any contention; a combined sketch of the three variants follows below:

a) Without volatile

b) With volatile

c) With padded volatile
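
The three variants could look roughly like the sketch below (class names, iteration count, and padding layout are my assumptions, not the original benchmark code). Each test starts two threads, T1 bumping c1 and T2 bumping c2, and measures how long the loops take.

public class FalseSharingBenchmark {

    static class NonVolatilePair {
        long c1, c2; // plain fields: the JIT may keep them in registers
    }

    static class VolatilePair {
        volatile long c1;
        volatile long c2; // very likely lands in the same cache line as c1
    }

    static class PaddedPair {
        volatile long c1;
        long p1, p2, p3, p4, p5, p6, p7; // 56 bytes of padding
        volatile long c2; // pushed towards a different 64-byte cache line
                          // (best-effort: the JVM may reorder fields)
    }

    static final long ITERATIONS = 500_000_000L;

    static void time(String label, Runnable w1, Runnable w2) throws InterruptedException {
        Thread t1 = new Thread(w1);
        Thread t2 = new Thread(w2);
        long start = System.currentTimeMillis();
        t1.start();
        t2.start();
        t1.join();
        t2.join();
        System.out.println(label + " time in milliseconds : "
                + (System.currentTimeMillis() - start));
    }

    public static void main(String[] args) throws InterruptedException {
        NonVolatilePair np = new NonVolatilePair();
        time("NonVolatile", () -> { for (long i = 0; i < ITERATIONS; i++) np.c1++; },
                            () -> { for (long i = 0; i < ITERATIONS; i++) np.c2++; });

        VolatilePair vp = new VolatilePair();
        time("volatile", () -> { for (long i = 0; i < ITERATIONS; i++) vp.c1++; },
                         () -> { for (long i = 0; i < ITERATIONS; i++) vp.c2++; });

        PaddedPair pp = new PaddedPair();
        time("Padded pair", () -> { for (long i = 0; i < ITERATIONS; i++) pp.c1++; },
                            () -> { for (long i = 0; i < ITERATIONS; i++) pp.c2++; });
    }
}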

Cache line size (macOS): sudo sysctl -a | grep cache.linesize

Output on a 4-core machine:

NonVolatile time in milliseconds : 41
NonVolatile time in milliseconds : 48
volatile time in milliseconds : 25179
volatile time in milliseconds : 25172
Padded pair time in milliseconds : 3398
Padded pair time in milliseconds : 3401

How does padding give higher throughput?

If the same volatile variable (say c1, updated by Core X) also sits in a cache line of another core (Core Y's cache), then an update to c1 by Core X must also be reflected in Core Y's copy of that line.

Core X effectively stalls Core Y's execution (via cache-coherence traffic that invalidates Core Y's copy of the line) so that c1 is brought up to date there too.

This happens even though, in our case, Core X only works on c1 and Core Y only works on c2.

When a CPU reads from main memory, it reads a whole block (typically 64 bytes) and keeps it in its L1 cache.

Therefore, when Core X wants to update c1, it most probably also has c2 in the same cache line, and vice versa. To update c1, Core X has to interfere with every core holding that cache line, even though those cores never touch c1. This is known as false sharing.

By adding padding we make sure that c1 and c2 do not fit in the same 64-byte block.
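
Manual padding is fragile, though, because the JVM is free to reorder fields. Since JDK 8 you can ask the JVM to do the padding itself with the sun.misc.Contended annotation (for application classes this requires running with -XX:-RestrictContended); a sketch:

import sun.misc.Contended;

public class ContendedPair {
    @Contended volatile long c1; // the JVM pads this field onto its own cache line
    @Contended volatile long c2;
}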

Conclusion: Be a PadMan :) whenever you have contended volatile variables in a multi-threaded environment.

Reference :

Trainologic’s CTO, Shimi Bandiel’s session