All Spark executors in AWS Glue died, but the job status is still RUNNING

Photo by Julius Drost on Unsplash


All Spark Executors died

AWS Glue lost all Spark Executors

No alerts from the CloudWatch Alarm

AWS Glue job status doesn’t reflect the actual state of the job

No meaningful information in the CloudWatch logs
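One way to catch this mismatch is to compare the job run state with the number of live executors. The sketch below is a minimal illustration, not something AWS Glue provides: it assumes you already have the run state as a string (for example "RUNNING") and an executor list shaped like the output of the Spark REST API executors endpoint, where each entry has an "id" and an "isActive" field and the driver has the id "driver". The function names are hypothetical.

```python
from typing import Any, Iterable, Mapping


def count_live_executors(executors: Iterable[Mapping[str, Any]]) -> int:
    """Count active executors, ignoring the driver entry.

    Assumes each entry has "id" and "isActive" keys, as in the
    Spark REST API executor listing.
    """
    return sum(
        1
        for executor in executors
        if executor.get("id") != "driver" and executor.get("isActive", False)
    )


def looks_stalled(job_run_state: str, executors: Iterable[Mapping[str, Any]]) -> bool:
    """Return True when the job reports RUNNING but no executor is alive."""
    return job_run_state == "RUNNING" and count_live_executors(executors) == 0


if __name__ == "__main__":
    # Hypothetical sample data: the driver is alive, both executors are dead.
    sample = [
        {"id": "driver", "isActive": True},
        {"id": "1", "isActive": False},
        {"id": "2", "isActive": False},
    ]
    print(looks_stalled("RUNNING", sample))
```

A check like this could run on a schedule and publish its own metric or notification, since the job status alone never leaves RUNNING in this situation.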

Spark History Server

The Spark History Server showed the failed stage
Details of the failed stage

The issues I’ve faced in AWS Glue so far

  • Executors died with a “Disk Space issue” although there was plenty of disk space.
  • A job failed because the executors could not connect to MSK (a timeout issue), even though MSK was not busy. It was idle.
  • A job failed because the KafkaConsumerPool in Spark reached its soft max size.


  • You still have to deal with low-level issues in Spark or the Spark cluster. This means you need decent knowledge of Spark clusters as well.
  • You have to deal with those low-level issues without full control over the Spark configuration, because AWS Glue is a managed service and limits which settings you can change.




Gatsby Lee | Data Engineer | City Farmer | Philosopher
