Spark memory allocation for driver and executor — beginner-friendly

shubham badaya
Jun 7, 2024


I am often asked to debug Spark applications, and it can be quite hard to explain why someone hits an out-of-memory error, because it results from a variety of factors.

In this article, we will discuss Spark memory allocation for both the driver and executor, and how understanding these allocations can help in avoiding out-of-memory errors.

Driver Memory Breakdown

Assume we submit a Spark application to a YARN cluster. The YARN resource manager will allocate a container for the Spark application master (AM) and start the driver inside the AM.

The driver will start with the memory you requested. There are 2 components to the driver memory, and the total memory requested for the driver is the sum of both:

1. spark.driver.memory: usually set by the user. Keep it small, e.g. 1 GB to 4 GB.

2. spark.driver.memoryOverhead: defaults to max(10% of spark.driver.memory, 384 MB). It is used by the container process and any other non-JVM processes.

YARN therefore allocates the requested driver memory plus the larger of 10% of that memory and 384 MB as container overhead.

So if we request spark.driver.memory = 4 GB, the YARN resource manager will launch a container of 4 GB + 0.4 GB = 4.4 GB.
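As a quick sanity check, the arithmetic YARN applies can be sketched in a few lines of Python. The 4 GB figure is just the example value above, and the 10% factor is Spark's default overhead factor:

```python
# Rough sketch of the driver container sizing, using the example values above.
DRIVER_MEMORY_MB = 4 * 1024      # spark.driver.memory = 4g
OVERHEAD_FACTOR = 0.10           # default overhead factor
MIN_OVERHEAD_MB = 384            # floor applied by Spark

overhead_mb = max(int(DRIVER_MEMORY_MB * OVERHEAD_FACTOR), MIN_OVERHEAD_MB)
container_mb = DRIVER_MEMORY_MB + overhead_mb

print(f"overhead  : {overhead_mb} MB")   # 409 MB, i.e. roughly the 0.4 GB above
print(f"container : {container_mb} MB")  # ~4.4 GB requested from YARN
```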

Now we have 3 limits:

1. The Spark driver JVM cannot use more than 4 GB.

2. The non-JVM processes cannot use more than 0.4 GB.

3. The overall container cannot use more than 4.4 GB.

If any of these limits is exceeded, we will get an out-of-memory error for the driver.

Executor Memory Breakdown

Now the driver has started, and it will request executors from the YARN resource manager. Each executor container's memory is the sum of the following 4 components:

1. spark.executor.memory: the JVM heap for the executor.

2. spark.executor.memoryOverhead: defaults to max(10% of spark.executor.memory, 384 MB), used by non-JVM processes.

3. spark.memory.offHeap.size: off-heap memory, counted only when off-heap is enabled (0 by default).

4. spark.executor.pyspark.memory: a dedicated budget for PySpark's Python workers (unset by default).

If we request spark.executor.memory = 8 GB (with off-heap and PySpark memory left unset), we again have 3 limits, just as for the driver:

  1. 8 GB for the JVM process
  2. 0.8 GB for non-JVM processes
  3. a container physical limit of 8.8 GB

If any of these limits is exceeded, we will get an out-of-memory error for the executor.
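The same sketch works for the executor, under the assumption that off-heap and PySpark memory stay at their defaults of zero:

```python
# Rough sketch of the executor container sizing, using the example values above.
EXECUTOR_MEMORY_MB = 8 * 1024    # spark.executor.memory = 8g
OVERHEAD_FACTOR = 0.10           # default overhead factor
MIN_OVERHEAD_MB = 384
OFFHEAP_MB = 0                   # spark.memory.offHeap.size (disabled here)
PYSPARK_MB = 0                   # spark.executor.pyspark.memory (unset here)

overhead_mb = max(int(EXECUTOR_MEMORY_MB * OVERHEAD_FACTOR), MIN_OVERHEAD_MB)
container_mb = EXECUTOR_MEMORY_MB + overhead_mb + OFFHEAP_MB + PYSPARK_MB

print(f"container : {container_mb} MB")  # ~8.8 GB requested per executor
```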

There are 2 more things we need to consider.

  1. What is the physical memory limit at the worker node?

This is usually set by the YARN admin and can be checked on the cluster configuration page or via the properties below (a small sketch for reading them follows the list). We cannot request an executor container larger than this physical limit:

  • yarn.scheduler.maximum-allocation-mb
  • yarn.nodemanager.resource.memory-mb
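One way to read these values from a running PySpark session on the cluster is through the Hadoop configuration object. This sketch relies on PySpark's internal _jsc handle, so treat it as a convenience rather than a stable API:

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("yarn-limits-check").getOrCreate()

# hadoopConfiguration() exposes the cluster's Hadoop/YARN settings; the values
# come back as strings (in MB) or None if the property is not set locally.
hadoop_conf = spark.sparkContext._jsc.hadoopConfiguration()
print(hadoop_conf.get("yarn.scheduler.maximum-allocation-mb"))
print(hadoop_conf.get("yarn.nodemanager.resource.memory-mb"))
```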

2. What is the PySpark executor memory?

We don’t need to worry about PySpark memory if we write code in Scala or Java. However, if we are working in PySpark, we do: the Python workers are non-JVM processes, so by default they consume the overhead memory.

If you hit such memory errors, either increase the memory overhead (spark.executor.memoryOverhead) or give PySpark its own budget via spark.executor.pyspark.memory.
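A minimal sketch of both options, assuming the settings are applied before the session is created (in practice they are often passed to spark-submit instead). The sizes are illustrative, not recommendations:

```python
from pyspark.sql import SparkSession

spark = (
    SparkSession.builder
    .appName("pyspark-memory-demo")                 # hypothetical app name
    # Option 1: enlarge the shared non-JVM budget.
    .config("spark.executor.memoryOverhead", "2g")
    # Option 2: give the Python workers their own capped budget; YARN adds
    # this on top of the executor container request.
    .config("spark.executor.pyspark.memory", "1g")
    .getOrCreate()
)
```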

In Summary,

A Spark container’s total memory is divided into two sections:

  1. Heap Memory: This is the main memory for the JVM process running inside the container. It’s called driver memory if the container holds the Spark driver, and executor memory if it holds a Spark executor.
  2. Overhead Memory: This is separate memory managed by the operating system within the container. It’s used for various tasks, including network buffers for data shuffling and reading data from storage.

Both parts are crucial for your Spark application to run smoothly. Insufficient overhead memory is often overlooked and can lead to out-of-memory (OOM) exceptions, because the overhead backs network operations such as shuffle buffers and reading data from external sources like an RDBMS.
