The impact of Garbage Collection
As you probably know, Java applications run on top of a Java Virtual Machine (JVM). One important task of the JVM is Garbage Collection (GC). The garbage collector spares developers the chore of manually tracking and freeing blocks of memory on the heap.
Of course, this comes at a price. The collector adds overhead to your Java process: bookkeeping of live objects and the dedicated data structures and algorithms take their toll. GC competes with your application for CPU time.
Not only does a collector compete with your application, it can and will freeze your whole application for some time to do its job. While the collector is operating, your application must not touch parts of the heap, or undefined behaviour would ensue. This is the dreaded Stop-The-World GC pause.
I made GC sound pretty scary and unpredictable. In practice, running a JVM in production has been a pretty smooth experience for me.
tl;dr: Garbage Collection accounts for 0% of processing time.
Let us see the effect of garbage collection activity on a production machine of mine. It serves a web application to the world.
The production machine is an AWS t2.medium Linux instance. This instance type comes with 2 vCPUs and 4 GiB of RAM. The app itself runs on Apache Tomcat 8.5.x, which runs on top of the Oracle HotSpot JVM (1.8.0_121).
Upon each request the web application fetches data from a database and renders HTML. So it is HTTP to SQL and back: the typical enterprise web application, as you would call it. It is more I/O bound than CPU bound, but nevertheless every request will eventually allocate some objects. There should be plenty of garbage collection happening.
Let’s find out how many requests Tomcat handles on a given day:
$ cat localhost_access_log.2017-05-02.txt | wc -l
47823 hits on May 2. On average one hit every two seconds.
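A quick back-of-the-envelope check of that average (values taken from the log count above):

```java
public class HitRate {
    public static void main(String[] args) {
        int hits = 47_823;                  // lines in the access log for May 2
        int secondsPerDay = 24 * 60 * 60;   // 86400 seconds in a day
        // 86400 / 47823 ≈ 1.8 seconds between hits on average
        System.out.printf("one hit every %.1f seconds%n", (double) secondsPerDay / hits);
    }
}
```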
Yes, we are no Twitter, Netflix or Facebook, and I can't say right away that these are all valid requests (maybe a crawler drops by every once in a while), but this is the traffic I regularly see throughout the month, and it is definitely enough to perpetually trigger the garbage collector.
With JVM options we can tweak the JVM and the GC. We are passing them to Apache Tomcat and thereby to our web application. This is the entry in CATALINA_OPTS.
-server -Xms2048M -Xmx2048M -Xmn512M
The -server argument is superfluous. This is a 64-bit Linux server with more than 2 GiB of RAM and more than one core, so HotSpot auto-detects it as a server-class machine.
Half of the 4 GiB of RAM is assigned to the JVM for use as a memory allocation pool (heap size). The initial (-Xms) and maximum (-Xmx) heap size options are set to the same value, sparing the JVM from dynamically resizing the heap. This is in line with the usual heap-sizing guidelines for server applications.
A quarter of the assigned 2 GiB memory pool is reserved for the young generation via -Xmn. Again, this option is superfluous and should not have been set in the first place: per default, the ratio between the young and the old generation is 1:3, i.e. a quarter. So I set a value that would have been the default to begin with.
(The young generation is the part of the heap where new objects are allocated. Longer living objects are moved to the old generation.)
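To illustrate why per-request allocation is cheap for the collector, here is a hypothetical request handler (the class and method names are made up for illustration): every object it allocates becomes garbage as soon as the response is written, so it can be reclaimed in an inexpensive young-generation collection rather than being promoted to the old generation.

```java
// Hypothetical handler: all allocations here are short-lived.
// They die young and are reclaimed by a cheap young-generation GC;
// only objects surviving several collections (e.g. caches) are promoted.
public class ReportHandler {
    public String handle(int customerId) {
        StringBuilder html = new StringBuilder();                 // dies young
        html.append("<h1>Customer ").append(customerId).append("</h1>");
        return html.toString();                                   // short-lived result
    }
}
```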
Since I did not specify a garbage collector explicitly, the JVM uses the Parallel Collector (also called the Throughput Collector) by default.
jstat is a command-line utility that prints performance statistics of a running JVM. You can select statistics about class loading, the JIT compiler and garbage collection. It is small, fast and works remotely in a shell. If you have installed the Oracle JDK you already have jstat.
We will focus on the output of -gcutil, which displays a summary of garbage collection statistics as percentages.
$ sudo jstat -gcutil -t 19644
Note: you might not need sudo
Unfortunately the columns don’t fit my small terminal window.
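For reference, this is the shape of the -gcutil output. The row below is reconstructed from the figures discussed in the following paragraphs; the heap-utilization columns are left blank because I did not quote them:

```
Timestamp   S0    S1    E     O     M     CCS   YGC    YGCT     FGC  FGCT    GCT
2481904.4   …     …     …     …     …     …     4029   116.345  7    1.815   118.160
```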
The first column Timestamp is the time in seconds since the JVM was started. Compare the 2481904.4 seconds with the previous screenshot of top and you will see that this JVM has been running for almost a month.
The next columns, S0 through CCS, show utilization as a percentage of each space's capacity: the survivor spaces and eden (young generation), the old generation, the metaspace and the compressed class space. Interesting indeed, but not of interest for us.
That leaves us with the columns YGC, YGCT, FGC, FGCT and GCT.
There have been 4029 garbage collection events in the young generation (YGC), which accounted for 116.345 seconds (YGCT). So on average a garbage collection in the young generation took around 29 ms.
There have been only 7 full garbage collection events (FGC), which accounted for 1.815 seconds (FGCT). These are the dreaded Stop-The-World GC pauses everyone is afraid of. On average one took 259 ms to complete.
The last column is the total time spent in garbage collection (GCT) = YGCT + FGCT.
Let's run the numbers. The JVM was started 2481904.4 seconds ago and 118.160 seconds were spent in garbage collection.
118.160 / 2481904.4 = 0.00004760860249089368.
That is a big 0%.
In 28 days only 118 seconds were spent on garbage collection. GC accounts for 0% of application time.
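The arithmetic is easy to double-check from the jstat columns alone. A minimal sketch, using the values quoted above:

```java
public class GcNumbers {
    public static void main(String[] args) {
        // Figures taken from the jstat -gcutil output
        double ygct = 116.345, fgct = 1.815, gct = 118.160; // seconds
        int ygc = 4029, fgc = 7;                            // collection counts
        double uptime = 2_481_904.4;                        // JVM uptime in seconds

        System.out.printf("avg young GC pause: %.0f ms%n", ygct / ygc * 1000); // ~29 ms
        System.out.printf("avg full  GC pause: %.0f ms%n", fgct / fgc * 1000); // ~259 ms
        System.out.printf("GC share of uptime: %.4f%%%n",  gct / uptime * 100); // ~0.0048%
    }
}
```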
Pretty impressive ratio if you ask me. The best part: I didn’t have to do anything. The JVM just runs.
What about latency?
I have been unfair. From a throughput point of view the GC activity is negligible. Web applications are designed for humans, and we perceive delays of seconds at most. So if a thread freezes for 259 ms, that still leaves plenty of time before our users get upset.
Unless your user is a machine. Then 259 ms is a lot of wasted time.
To make matters worse the 259 ms is just an average. There have been pauses that took longer.
If you need more predictability and shorter pause times, then you must tune the JVM for latency instead of throughput. Luckily, the HotSpot JVM already ships with two collectors designed for this goal. Have a look at the CMS and G1 collectors.
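On HotSpot 8, each of these is enabled with a single flag added to CATALINA_OPTS (pick one, not both):

```
-XX:+UseConcMarkSweepGC    (CMS collector)
-XX:+UseG1GC               (G1 collector)
```

Both trade some throughput for shorter, more predictable pauses.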
However, considering our use case, the Parallel Collector is more than sufficient.
Here are two resources I would recommend anyone interested in the Oracle HotSpot JVM and Garbage Collection: