Apache Spark — Cluster with sufficient resources does not accept new jobs

Karthik Jayaraman
1 min read · Jul 28, 2019


This blog explains how to handle an issue that I have run into every so often — a Spark cluster that has more than sufficient executors, memory and CPU but refuses to start running any new job submitted to it.

Instead, any attempt to submit a new job with spark-submit hangs while the warning below is logged repeatedly.

WARN YarnScheduler: Initial job has not accepted any resources; check your cluster UI to ensure that workers are registered and have sufficient resources

The issue appears to be related to the dynamic resource allocation setting in Spark and can be fixed by one of two methods.
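Before applying either fix, it can help to confirm which settings are actually in effect. A quick sketch, assuming a live PySpark session where spark is the active SparkSession (the second argument to get() is the default returned when the key is unset):

conf = spark.sparkContext.getConf()
print(conf.get("spark.dynamicAllocation.enabled", "false"))
print(conf.get("spark.shuffle.service.enabled", "false"))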

Turn off Dynamic Resource Allocation

from pyspark import SparkConf

conf = (SparkConf()
        .setAppName("my_job_name")
        # Disable the external shuffle service and dynamic allocation
        .set("spark.shuffle.service.enabled", "false")
        .set("spark.dynamicAllocation.enabled", "false"))

or pass the equivalent setting on the spark-submit command line:

spark-submit --conf spark.dynamicAllocation.enabled=false script.py

Specify the driver and executor settings explicitly

Alternatively, request a fixed number of executors with explicit memory and core settings, so the job does not depend on dynamic allocation to acquire its resources.

spark-submit --master yarn --deploy-mode client \
--driver-memory 5g --executor-memory 5g \
--num-executors 3 --executor-cores 2 script.py
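The same executor sizing can also be set in code rather than on the command line — a sketch mirroring the SparkConf example above (the app name is a placeholder):

from pyspark import SparkConf

conf = (SparkConf()
        .setAppName("my_job_name")
        # Fixed executor count and sizing, equivalent to the
        # --num-executors / --executor-memory / --executor-cores flags
        .set("spark.executor.instances", "3")
        .set("spark.executor.memory", "5g")
        .set("spark.executor.cores", "2"))

Note that driver memory is deliberately left out: in client deploy mode the driver JVM has already started by the time SparkConf is read, so --driver-memory should be passed to spark-submit instead.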

These two methods have consistently allowed me to get past the issue where a cluster with seemingly sufficient resources refuses to accept a new Spark job.
