How to choose JVM and Docker container memory properties for our Java service?

Kostiantyn Ivanov
9 min read · Sep 23, 2023

What will we learn?

  • How to configure docker-compose for your Java application.
  • How to set up the needed startup flags for a Java application (including the heap limit).
  • How to use jcmd to monitor the memory consumption of our application.

Our service

A pretty simple service with CRUD operations for a “Task” model. It has the common dependencies of a modern microservice: a relational DB, a message broker and an observability server.

Tools we are going to use

Docker commands:
— docker-compose up/down — to startup/shutdown our test env.
— docker exec <container_id> — to send command into our container (mostly monitoring commands)

jcmd

A command-line utility shipped with the JDK for interacting with running Java applications. It sends diagnostic commands to a running JVM, which allows you to monitor and manage Java applications.

We will use:

jcmd <pid> VM.flags: to check what our current startup flags are (<pid> is the ID of the JVM process, which is 1 inside our container)

jcmd <pid> VM.native_memory: our main diagnostic tool for this article.
This command is used to interact with and query Native Memory Tracking (NMT) data in a Java Virtual Machine (JVM) process. Native Memory Tracking is a JVM feature that allows you to monitor and track native memory usage by the JVM. It has to be enabled at startup with the -XX:NativeMemoryTracking flag. Native memory includes memory allocated outside the Java heap, such as memory used by native libraries, direct byte buffers, and memory mapped files.
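To make the reports easier to compare, the summary can be turned into a dictionary. This is a minimal sketch, assuming the report lines look like the ones quoted in this article (a category name followed by reserved=...KB, committed=...KB). The function name parse_nmt_summary is ours and is not part of any JDK tool.

```python
import re

# Matches lines such as:
#   Total: reserved=1568175KB, committed=185479KB
#   - Java Heap (reserved=131072KB, committed=52828KB)
# Assumption: values are reported in KB, as in the outputs shown in this article.
NMT_LINE = re.compile(
    r"^-?\s*([A-Za-z][A-Za-z0-9 ]*?)\s*[:(]\s*reserved=(\d+)KB,\s*committed=(\d+)KB"
)


def parse_nmt_summary(text):
    """Return {category: {"reserved_kb": int, "committed_kb": int}}."""
    result = {}
    for line in text.splitlines():
        match = NMT_LINE.match(line.strip())
        if match:
            name, reserved, committed = match.groups()
            result[name] = {
                "reserved_kb": int(reserved),
                "committed_kb": int(committed),
            }
    return result


sample = """Total: reserved=1568175KB, committed=185479KB
- Java Heap (reserved=131072KB, committed=52828KB)
- GC (reserved=866KB, committed=618KB)
"""
summary = parse_nmt_summary(sample)
print(summary["Java Heap"]["committed_kb"])
```

Lines that do not carry a reserved/committed pair (headers, blank lines) are simply skipped.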

Garbage collectors

Choosing a GC is always a tradeoff between latency and resources. A more powerful GC may add only a very small extra latency, but its resource consumption will be noticeable as well. On the other hand, a simple GC works slower but with a smaller resource overhead.
Here are 3 examples of different GCs (from the simplest to the most complicated):
SerialGC: the most minimal one we can use. It is the slowest one as well as the one with the lowest resource utilization.

G1GC: a more powerful GC (and, by the way, the default one when the JVM considers the machine server-class, roughly two or more CPUs and about 2GB of memory or more). At the same time, it has higher resource consumption.

ZGC: highest resource consumption, lowest latency.

In the final tests we will use ZGC as the most performant one. But to explain how to choose the one that covers your needs, we will start with a comparison between these three GCs.

First start with a 512 MB Docker memory limit and no extra configuration:

We just start our application in a container with a 512 MB RAM limit and check its native memory using jcmd (diagrams are made in Google Sheets):

SerialGC

Native Memory Tracking:

Total: reserved=1568175KB, committed=185479KB // 1.5G, 185M
- Java Heap (reserved=131072KB, committed=52828KB) // 128M, 52M
- Class (reserved=1050200KB, committed=11288KB) // 1G, 11M
- Thread (reserved=32947KB, committed=2991KB) // 32M, 2M
- Code (reserved=248960KB, committed=17980KB) // 248M, 17M
- GC (reserved=866KB, committed=618KB) // 0.8M, 0.6M
- Compiler (reserved=248KB, committed=248KB) // 0.2M, 0.2M
- Internal (reserved=208KB, committed=208KB) // 0.2M, 0.2M
- Other (reserved=12KB, committed=12KB) // 0.01M, 0.01M
- Symbol (reserved=17628KB, committed=17628KB) // 17M, 17M
- Arguments (reserved=1KB, committed=1KB) // 0.001M, 0.001M
- Module (reserved=290KB, committed=290KB) // 0.2M, 0.2M
- Safepoint (reserved=8KB, committed=8KB) // 0.008M, 0.008M
- Synchronization (reserved=102KB, committed=102KB) // 0.1M, 0.1M
- Metaspace (reserved=65802KB, committed=61642KB) // 65M, 61M

G1GC

Native Memory Tracking:

Total: reserved=1635882KB, committed=255634KB //249.64MB
- Java Heap (reserved=131072KB, committed=70656KB)//69MB
- Class (reserved=1050263KB, committed=11351KB)//11MB
- Thread (reserved=46324KB, committed=3088KB)//3MB
- Code (reserved=248914KB, committed=18130KB)//17.7MB
- GC (reserved=55405KB, committed=53181KB)//52MB
- Compiler (reserved=206KB, committed=206KB)
- Internal (reserved=225KB, committed=225KB)
- Other (reserved=12KB, committed=12KB)
- Symbol (reserved=17554KB, committed=17554KB)//17MB
- Native Memory Tracking (reserved=7366KB, committed=7366KB) //7.19MB !!!
- Shared class space (reserved=12288KB, committed=12092KB) //11.8MB
- Arena Chunk (reserved=187KB, committed=187KB)
- Module (reserved=149KB, committed=149KB)
- Safepoint (reserved=8KB, committed=8KB)
- Synchronization (reserved=110KB, committed=110KB)
- Metaspace (reserved=65792KB, committed=61312KB)//60MB

ZGC:

Native Memory Tracking:

Total: reserved=16162172KB, committed=306548KB // 16G, 300M
- Java Heap (reserved=6291456KB, committed=102400KB) // 6G, 100M
- Class (reserved=1050093KB, committed=11245KB) // 1G, 10M
- Thread (reserved=40151KB, committed=2999KB) // 40M, 3M
- Code (reserved=248968KB, committed=19020KB) // 248M, 19M
- GC (reserved=8427099KB, committed=71259KB) // 8G, 70M
- Compiler (reserved=220KB, committed=220KB) // 0.2M
- Internal (reserved=317KB, committed=317KB) // 0.3M
- Other (reserved=12KB, committed=12KB) // 0.01M
- Symbol (reserved=17665KB, committed=17665KB) // 17M, 17M
- Native Memory Tracking (reserved=7504KB, committed=7504KB) // 7M, 7M
- Shared class space (reserved=12288KB, committed=11924KB) // 12M, 11M
- Arena Chunk (reserved=186KB, committed=186KB) // 0.1M
- Logging (reserved=5KB, committed=5KB) // 0.005M
- Arguments (reserved=1KB, committed=1KB) // 0.001M
- Module (reserved=290KB, committed=290KB) // 0.2M
- Safepoint (reserved=8KB, committed=8KB) // 0.008M
- Synchronization (reserved=106KB, committed=106KB) // 0.1M
- Metaspace (reserved=65803KB, committed=61387KB) // 65M, 60M

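So, with the default setup the maximum heap size is 1/4 of the container RAM for SerialGC and G1GC (128MB reserved out of 512MB), while ZGC reserves a much larger address range but commits only about 100MB. Metaspace is practically constant, since it’s the same app with the same set of classes. The other categories (GC, threads, etc.) depend on the chosen GC implementation (or are so small that we ignore them) and take from 14.1% to 27.3% of the 512MB limit.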
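The percentages can be reproduced from the committed values above. The sketch below assumes that “GC + other” means total minus heap minus Metaspace, and that the share is taken relative to the 512MB container limit. Both are our reading of the numbers, and the inputs are the rounded MB values from the comments in the reports.

```python
def overhead_share(total_mb, heap_mb, metaspace_mb, container_mb=512):
    """Return (GC + other in MB, its share of the container limit in %)."""
    overhead_mb = total_mb - heap_mb - metaspace_mb
    return overhead_mb, 100 * overhead_mb / container_mb


# Committed memory in MB: (total, heap, metaspace), rounded as in the reports above.
measurements = {
    "SerialGC": (185, 52, 61),
    "G1GC": (249.64, 69, 60),
    "ZGC": (300, 100, 60),
}

for gc_name, (total, heap, metaspace) in measurements.items():
    overhead, share = overhead_share(total, heap, metaspace)
    print(f"{gc_name}: GC + other = {overhead:.0f}MB ({share:.1f}% of the limit)")
```

With these inputs SerialGC lands at about 14.1% and ZGC at about 27.3%, which matches the range quoted above.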

Let’s assume that we are going to leave 30% of RAM free for the potential growth of the GC part (it will grow during application usage for sure). Since we run on Linux, we don’t reserve extra RAM for the OS here, but if you use Windows as a Java server (I hope you don’t), don’t forget to reserve an extra 25% of RAM for the OS.

From the proportions above we may assume that, with 500MB of RAM and 60MB of Metaspace, we get the following division:

ZGC:
Free space: 132MB
GC + other: 140MB
Heap: 168MB

COMMENT: so, with the fastest GC on this machine we will have only 168MB of heap, and a Spring Boot app may take 100MB of it right from the start (and we haven’t even used the app yet!). In this case, we should consider increasing the RAM limit.

G1GC:
Free space: 132MB
GC + other: 121MB
Heap: 187MB

COMMENT: it’s slightly more than with ZGC (and potentially we can even reduce the free RAM a bit), but the conclusion is the same.

SerialGC:
Free space: 44MB //we don’t expect memory overhead for SerialGC, so we just leave 10% of RAM
GC + other: 72MB
Heap: 324MB

COMMENT: well, this can be considered a workable heap limit for a low-loaded application, but you should understand that you are using the slowest GC. Potentially, some lightweight serverless services (AWS as an example) can be more useful and cheaper in this case. We will exclude this GC from further experiments. G1GC is pretty close to ZGC in memory consumption and works with a higher latency, so we are going to skip it as well, and all the next tests will be run using ZGC. This is not necessarily your case.
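The three divisions above follow the same arithmetic: subtract Metaspace from RAM, keep a share of the remainder free, subtract the measured “GC + other” part, and give the rest to the heap. Here is a small sketch of that calculation. The function name heap_budget and its parameters are ours, used only for illustration.

```python
def heap_budget(ram_mb, metaspace_mb, gc_other_mb, free_percent):
    """Split RAM into free space, GC + other and heap (all values in MB)."""
    usable_mb = ram_mb - metaspace_mb            # what is left after Metaspace
    free_mb = usable_mb * free_percent // 100    # reserve kept free for growth
    heap_mb = usable_mb - free_mb - gc_other_mb  # the remainder goes to the heap
    return {"free": free_mb, "gc_other": gc_other_mb, "heap": heap_mb}


# 500MB of RAM and 60MB of Metaspace, as assumed above.
print("ZGC:", heap_budget(500, 60, gc_other_mb=140, free_percent=30))
print("G1GC:", heap_budget(500, 60, gc_other_mb=121, free_percent=30))
print("SerialGC:", heap_budget(500, 60, gc_other_mb=72, free_percent=10))
```

Changing free_percent or the RAM value shows quickly how much heap each GC leaves you with.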

Here is an example of the VM flags that enable the different GCs, taken from our Dockerfile:

#UseSerialGC:
#ENTRYPOINT ["java", "-XX:+UseSerialGC","-XX:NativeMemoryTracking=summary","-jar", "application.jar"]
#UseG1GC:
#ENTRYPOINT ["java", "-XX:+UseG1GC","-XX:NativeMemoryTracking=summary","-jar", "application.jar"]
#UseZGC:
ENTRYPOINT ["java", "-XX:+UseZGC","-XX:NativeMemoryTracking=summary","-jar", "application.jar"]

Small automation:

We are going to monitor our memory consumption using jcmd, calling it via docker exec. We prepared a small Python script that calls jcmd every 5 seconds and writes the total, heap and GC committed memory (in KB) into a CSV file. Later we will be able to use this file to see how our resources were loaded during the test.
Example of the script:

import subprocess
import time
import re

# Define the command you want to execute; put your Docker container ID here
command = "docker exec 72e498ff90ee jcmd 1 VM.native_memory"
# Define regular expressions to extract the desired data
total_committed_pattern = r"Total:.*committed=(\d+)KB"
heap_committed_pattern = r"Java Heap.*committed=(\d+)KB"
gc_committed_pattern = r"GC.*committed=(\d+)KB"
try:
    while True:
        # Execute the command and capture its output
        result = subprocess.run(
            command,
            shell=True,
            stdout=subprocess.PIPE,
            stderr=subprocess.PIPE,
            universal_newlines=True
        )
        # Check if the command executed successfully
        if result.returncode == 0:
            # Parse the command output
            text = result.stdout
            # Find matches in the text using regular expressions
            total_committed = re.search(total_committed_pattern, text).group(1)
            heap_committed = re.search(heap_committed_pattern, text).group(1)
            gc_committed = re.search(gc_committed_pattern, text).group(1)
            # Specify the file path to save the output
            output_file_path = "monitor.csv"
            # Append the parsed values (in KB) as one line: total;heap;gc
            with open(output_file_path, "a") as output_file:
                output_file.write(f"{total_committed};{heap_committed};{gc_committed}\n")
            print(f"Command output saved to {output_file_path}")
        else:
            print("Error executing the command:")
            print(result.stderr)
        # Wait for the specified interval before running the command again
        time.sleep(5)
except KeyboardInterrupt:
    print("Loop terminated by user (Ctrl+C)")
except Exception as e:
    print(f"An error occurred: {str(e)}")
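Once monitor.csv is collected, the min/max values per category can be computed with a few lines of Python. This is only a sketch. It assumes the total;heap;gc line format (in KB) written by the script above, and it assumes that “Other” means total minus heap minus GC. The function name summarize is ours.

```python
import csv


def summarize(lines):
    """Return {category: (min_mb, max_mb)} for total, heap, gc and other."""
    columns = {"total": [], "heap": [], "gc": [], "other": []}
    for row in csv.reader(lines, delimiter=";"):
        if len(row) != 3:
            continue  # skip blank or malformed lines
        total_kb, heap_kb, gc_kb = (int(value) for value in row)
        columns["total"].append(total_kb / 1024)
        columns["heap"].append(heap_kb / 1024)
        columns["gc"].append(gc_kb / 1024)
        # Assumption: everything that is neither heap nor GC counts as "other".
        columns["other"].append((total_kb - heap_kb - gc_kb) / 1024)
    return {
        name: (min(values), max(values))
        for name, values in columns.items()
        if values
    }


# Two made-up samples, only to show the call; real data comes from monitor.csv.
sample_lines = ["306548;102400;71259", "409600;204800;81920"]
for name, (low, high) in summarize(sample_lines).items():
    print(f"{name}: min={low:.2f}MB max={high:.2f}MB")
```

For a real run, pass the open file instead of the sample list, for example `with open("monitor.csv", newline="") as f: summarize(f)`.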

Increasing the RAM limit and MaxHeapSize

So, what RAM limit do we want to have before we start to stress test our application?
We would recommend starting from the following numbers:
We are going to give 70% of RAM to the heap (we found this share to be a reasonable price to pay).

So, the rest of the RAM should be enough for Metaspace, GC, and the other memory consumption categories. Let’s try to calculate it.
Current setup:
512MB minus 70% leaves 153.6MB, which is not enough to cover the other categories;
+256MB:
768MB minus 70% leaves 230.4MB, which is enough. So let’s start our experiments from this.
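The same check can be written down as a few lines of Python. As the required non-heap amount we assume roughly 200MB, which is the ZGC “GC + other” part (140MB) plus Metaspace (60MB) from the first measurement. That number is our reading of the earlier results, not something measured separately.

```python
def non_heap_budget_mb(container_mb, heap_percent=70):
    """RAM left for Metaspace, GC and other categories after the heap share."""
    return container_mb * (100 - heap_percent) / 100


# Assumption: ZGC "GC + other" (140MB) plus Metaspace (60MB) from the first run.
required_mb = 140 + 60

for limit_mb in (512, 768):
    budget_mb = non_heap_budget_mb(limit_mb)
    verdict = "enough" if budget_mb >= required_mb else "not enough"
    print(f"{limit_mb}MB limit: {budget_mb:.1f}MB left for non-heap, {verdict}")
```

If your own measurements show a different non-heap footprint, replace required_mb and rerun the check for the limits you are considering.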

So our new setup is:

Docker-compose

mem_limit: 768m

Dockerfile (java -jar section):

-XX:MaxHeapSize=606076928 //538MB
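The flag takes the heap size in bytes, so it is easy to get the conversion wrong. Below is a small sketch that derives the flag value from the container limit, assuming the 70% rule from above and binary megabytes (1MB = 1024 × 1024 bytes). The helper name max_heap_flag is ours.

```python
MIB = 1024 * 1024  # bytes in one binary megabyte


def max_heap_flag(container_mib, heap_percent=70):
    """Build a -XX:MaxHeapSize flag for the given share of the container limit."""
    heap_mib = round(container_mib * heap_percent / 100)
    return f"-XX:MaxHeapSize={heap_mib * MIB}"


print(max_heap_flag(768))
# For comparison: how many binary megabytes the value from the Dockerfile line is.
print(606076928 / MIB)
```

Note that 606076928 bytes is exactly 578 binary megabytes, while 70% of 768MB is about 538MB, so it is worth double-checking which of the two values your own Dockerfile really uses.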

Bootstrap

We have a consumption spike on our heap during the bootstrap. Max/min consumption look like this:

Total max: 702.29MB
Total min: 286.41MB
Heap max: 456MB
Heap min: 56MB
GC max: 113MB
GC min: 98.51MB
Other max: 133.24MB
Other min: 131.90MB

Well, we have enough resources to start our application =). Let’s try to add some load. Here and in the next experiments we will use the following load profile:

2.6M requests per month (1 per sec)
read/write requests ratio: 1:1
max items to get: 10k

We will use Postman collections running with a delay, in two separate threads (one for POST, another one for GET). If you need to generate load from more threads, you can use JMeter.

Under load

Total max: 835.1MB
Total min: 341.34MB
Heap max: 538MB
Heap min: 78MB
GC max: 115.3MB
GC min: 102.2MB
Other max: 188.26MB
Other min: 163.13MB

Notes:
- We can see that our heap size was enough to handle this load, but on the spikes the total memory consumption (835.1MB) was higher than the container limit (768MB), meaning the rest of the memory was taken from swap, which hurts our performance. This can be a signal to increase the memory limit for our container.
- Our GC memory consumption is pretty stable.

The next container limit will be:

mem_limit: 1g

Bootstrap

Total max: 691.3MB
Total min: 291.93MB
Heap max: 446MB
Heap min: 60MB
GC max: 112.94MB
GC min: 99.79MB
Other max: 132.89MB
Other min: 131.98MB

Under load

Total max: 826.22MB
Total min: 284.25MB
Heap max: 528MB
Heap min: 52MB
GC max: 115.01MB
GC min: 100.24MB
Other max: 193.94MB
Other min: 132MB

Notes:
- 1g of RAM and 538MB of heap are enough to handle our load. We also have around 20% of free memory, and our heap limit was almost reached, so for more accurate tuning we may continue to increase the max heap size and repeat the experiment.
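The “around 20%” figure is simply the gap between the container limit and the peak total consumption. A minimal sketch of that check for both limits we tried, using the “Total max” values under load from the tables above (the function name headroom_percent is ours):

```python
def headroom_percent(limit_mb, peak_total_mb):
    """Free share of the container limit at peak; negative means over the limit."""
    return 100 * (limit_mb - peak_total_mb) / limit_mb


# (container limit in MB, "Total max" under load in MB)
runs = {"768m": (768, 835.1), "1g": (1024, 826.22)}

for name, (limit_mb, peak_mb) in runs.items():
    print(f"mem_limit {name}: headroom {headroom_percent(limit_mb, peak_mb):.1f}%")
```

A negative value, as in the 768m run, is the signal that the limit is too low for this load.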

The same experiment may be repeated with G1GC. If increased latency is not an issue for your case but reduced memory consumption matters, you can switch to G1GC.

Summary

We learned how to calculate memory limits for our containerised Java application and discovered a few useful tools and approaches for monitoring (there are a lot of them besides these; please let me know if you want to see articles about some of them). It’s not a complete guide to every concept, but it may give you a clue about the ideas that should be behind your resource limit decisions.

Links

Repository with a service code, monitoring script and docker files: https://github.com/sIvanovKonstantyn/frameworks-comparation/tree/javamem

The article from the same series about CPU limitations: https://medium.com/@svosh2/how-to-choose-jvm-and-docker-container-properties-for-our-java-service-a04bb9e2c855
