EXPEDIA GROUP TECHNOLOGY — SOFTWARE

Part 4: How to Migrate Existing Apache Spark Jobs to Cost Efficient Executor Configurations

Steps to follow when converting existing jobs to cost efficient config

Brad Caffey
Expedia Group Technology

--

An image representing a migration service key on a keyboard
Photo by kenary830 on Shutterstock

If you like this blog, please check out my companion blogs about how to use persist and coalesce effectively to lower job cost and improve performance.

There are a number of things to keep in mind as you tune your Spark jobs. The following sections cover the most important ones.

Which jobs should be tuned first?

With many jobs to tune, you may be wondering which ones to prioritize first. Jobs that have only one or two cores per executor make great candidates for conversion. Jobs with 3,000 or more Spark core minutes are also good candidates. I define a Spark core minute as…

Executor count * cores per executor * run time (in minutes) = Spark core minutes
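
As a quick illustration, here is that calculation sketched in Python (the job numbers below are made up):

    # Spark core minutes = executor count * cores per executor * run time (in minutes)
    def spark_core_minutes(executor_count, cores_per_executor, run_time_minutes):
        return executor_count * cores_per_executor * run_time_minutes

    # Made-up job: 50 executors with 2 cores each, running for 45 minutes
    print(spark_core_minutes(50, 2, 45))  # 4500 -> over 3000, a good tuning candidate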

Make sure you are comparing apples to apples

When converting an existing job to an efficient executor configuration, you will need to change your executor count whenever your executor core count changes. If you change your executor cores from 2 to 5, for instance, then you also need to change your executor count to maintain the same number of Spark cores. For example…

Old configuration (100 Spark cores): num-executors=50 executor-cores=2

New configuration (100 Spark cores): num-executors=20 executor-cores=5

The reason to do this is so that you have the same processing power (i.e. Spark cores) with the new config as with the old config.
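
If you prefer to compute the new executor count rather than work it out by hand, here is a small Python sketch of the same arithmetic, using the example configuration above:

    import math

    def new_executor_count(old_executors, old_cores, new_cores):
        # Keep the total number of Spark cores the same when cores per executor change
        return math.ceil(old_executors * old_cores / new_cores)

    # The example above: 50 executors x 2 cores converted to 5 cores per executor
    print(new_executor_count(50, 2, 5))  # 20 executors, still 100 Spark cores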

Run times may slow down

When converting jobs to a cost efficient executor, you may find that your process sometimes slows down with the new config. If this happens, do not worry. Your new configuration is running cheaper, and there are additional tuning steps you can take to improve performance in a cost efficient manner.

Here is an example of a job I converted from its original inefficient configuration to an efficient configuration, along with the steps I took afterward to improve the run time at low cost. Note that the initial step, which makes execution cheaper by utilizing the same number of cores on fewer nodes, also makes the run time slower. I then increased the number of nodes while keeping all cores fully utilized. Ultimately this leads to a quicker run time and a lower cost!

Tuning stages: change config to use all cores, reducing node count; increase nodes keeping them fully utilized until optimal

Maybe slower but cheaper is better

The above is a good example of how you can reclaim performance while using a cost efficient executor. Sometimes, though, a dramatic improvement in run time can't be had when switching to a cheaper but slower cost efficient configuration. If you are in a highly cost conscious environment, you might want to revisit your SLAs with supporting teams to see if longer run times are acceptable for the sake of reduced cloud spending.

One team at Expedia Group™ switched from delivering data every 4 hours to every 6 hours in an effort to reduce costs. Another team had a dedicated but very costly daily process for one product; reducing its frequency from daily runs to just twice a week produced huge savings for the company.

Dynamic Allocation

This is a quick recommendation: disable dynamic allocation when running tests on newly converted jobs, because dynamic allocation behaves differently depending on the strength of your executors. You can re-enable it at the end of the tuning process.
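
For reference, here is a sketch of how you might pin the configuration down for a test run in PySpark; the app name and executor numbers are placeholders, and the same settings can be passed to spark-submit with --conf flags:

    from pyspark.sql import SparkSession

    # Keep the executor count fixed while testing the new configuration by
    # turning dynamic allocation off; re-enable it once tuning is finished.
    spark = (
        SparkSession.builder
        .appName("converted-job-test")                      # placeholder name
        .config("spark.dynamicAllocation.enabled", "false")
        .config("spark.executor.instances", "20")           # fixed executor count under test
        .config("spark.executor.cores", "5")
        .getOrCreate()
    )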

Sometimes over-utilization works

There are cases where over-utilizing your node CPUs actually improves performance. These situations typically involve jobs with a lot of CPU wait time. For example, when loading data into Cassandra or Elasticsearch, your job will spend a lot of time waiting on IO and will therefore benefit if you over-utilize your node CPUs by allocating more Spark cores. There are two choices available when you need to over-allocate:

  1. Increase the core count in your executor
  2. Decrease your memory size so that an additional executor fits on your node.

I recommend increasing your core count until you reach 5; once that limit is reached, decrease executor memory to an amount that allows an additional executor to fit on your node.
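
To make the two options concrete, here is a sketch of the two configurations in PySpark; the core and memory sizes are illustrative only and will depend on your node size and YARN overhead:

    from pyspark import SparkConf

    # Option 1: raise cores per executor (up to 5) to schedule more Spark cores
    # than the node has physical CPUs. Memory stays the same.
    option_1 = (
        SparkConf()
        .set("spark.executor.cores", "5")      # was 4 in this illustration
        .set("spark.executor.memory", "34g")   # illustrative size, unchanged
    )

    # Option 2: already at 5 cores, so shrink executor memory instead until one
    # more executor fits on the node, adding 5 more Spark cores for IO-heavy work.
    option_2 = (
        SparkConf()
        .set("spark.executor.cores", "5")
        .set("spark.executor.memory", "25g")   # illustrative, small enough for an extra executor
    )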

Switch to a larger EC2 instance type

In some jobs I tuned, I found that the nodes on the cluster had too tight a ratio between memory and CPUs, which hampered performance and cost for those jobs. For instance, the majority of nodes I've seen data engineering teams use at Expedia Group have had 8GB per node CPU (128GB/16 CPUs or 64GB/8 CPUs). However, in some cases I found teams using nodes with only 4GB per node CPU (64GB/16 CPUs or 32GB/8 CPUs), and some of their jobs benefited from running on nodes with a larger memory-to-CPU ratio.

While the nodes with tighter ratios may be 25% cheaper, that cost savings is squandered if you are not utilizing all of the available node CPUs. If you are not utilizing all of your node CPUs, consider running on a cluster with a looser memory-to-CPU ratio. Likewise, if CPU utilization and memory utilization are both at 100%, memory typically becomes the bottleneck, and switching to a looser node will usually make the job run faster and cheaper.
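
Here is a back-of-the-envelope sketch of why idle CPUs erase that discount; the 5-core/34GB executor size is an illustrative assumption, and real capacity will be slightly lower once you subtract OS and YARN overhead:

    # Illustrative cost efficient executor: 5 cores and roughly 34GB of memory
    EXEC_CORES, EXEC_MEM_GB = 5, 34

    def executors_per_node(node_cpus, node_mem_gb):
        # The binding constraint is whichever resource runs out first
        return min(node_cpus // EXEC_CORES, node_mem_gb // EXEC_MEM_GB)

    # Tight node (4GB per CPU): memory runs out first and most CPUs sit idle
    print(executors_per_node(16, 64))    # 1 executor -> only 5 of 16 CPUs used
    # Loose node (8GB per CPU): CPUs become the limit instead
    print(executors_per_node(16, 128))   # 3 executors -> 15 of 16 CPUs used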

How to tell if your configuration is working properly

But how do you know whether your new config is working for your job? Believe it or not, your new config could have problems processing your data even when your job completes successfully. Sometimes, after switching to an efficient executor config, memory issues will occur in your executors. When this happens, your executors are not processing data efficiently and need further tuning. You can tell whether your executors are working effectively by checking for failed tasks during your Spark run.

The screenshot below was taken from a Spark job that completed successfully. If the data engineer managing this job didn't check the Spark log, they would not have known about the failed tasks that made the job run longer. When you have failed tasks, you need to research what is causing them so you can keep them from happening in the future.

Screenshot of a Spark job that completed successfully
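
If you would rather check for failed tasks programmatically than eyeball the Spark UI, here is a sketch using Spark's monitoring REST API; the host, port, and application id are placeholders, and the history server exposes the same endpoints:

    import requests

    # Placeholders: point at the live Spark UI (port 4040 while the job runs)
    # or at the Spark history server (port 18080 by default) after it finishes.
    BASE_URL = "http://localhost:4040/api/v1"
    APP_ID = "application_1234_5678"  # hypothetical application id

    stages = requests.get(f"{BASE_URL}/applications/{APP_ID}/stages").json()
    failed_tasks = sum(stage.get("numFailedTasks", 0) for stage in stages)

    if failed_tasks:
        print(f"{failed_tasks} failed tasks: the job finished, but it needs further tuning")
    else:
        print("No failed tasks detected")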

The next part of this series will cover how to research failed tasks and which common memory issues to watch out for when tuning for cloud spending efficiency.

Series contents

Learn more about technology at Expedia Group
