Meenakshi Sundaram Sekar
4 min read · Apr 20, 2018


Anatomy of a Spark Application — in a nutshell

A Spark application contains several components, all of which exist whether you are running Spark on a single machine or across a cluster of hundreds or thousands of nodes.

The components of a Spark application are the Driver, the Master, the Cluster Manager, and the Executors. I will explain each of these components in the following sections.

All of the Spark components, including the driver, master, and executor processes, run in Java virtual machines (JVMs). A JVM is a cross-platform runtime engine that can execute instructions compiled into Java bytecode. Scala, which Spark is written in, compiles into bytecode and runs on JVMs.

Spark Driver:

The life of a Spark application starts and ends with the Spark driver. The driver is the process that clients use to submit Spark programs. The driver is also responsible for planning the application, executing the Spark program, and returning the status and results to the client.
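This lifecycle can be seen in a minimal driver program. The sketch below uses Spark's Scala API; the application name and the `local[*]` master URL are placeholder choices for running on a single machine:

```scala
import org.apache.spark.{SparkConf, SparkContext}

// A minimal driver program: the application's life begins when main()
// starts and ends when the driver calls sc.stop() (or exits).
object MinimalDriver {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(
      new SparkConf()
        .setAppName("MinimalDriver") // hypothetical application name
        .setMaster("local[*]"))      // single-machine mode for this sketch

    // The driver plans this job, ships tasks to the executors,
    // and brings the result back to the client.
    val result = sc.parallelize(1 to 100).map(_ * 2).sum()
    println(s"sum = $result")

    sc.stop() // ends the application
  }
}
```

On a real cluster the same program would be packaged as a jar and handed to the driver with `spark-submit`, with the master URL pointing at the cluster rather than `local[*]`.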

The Spark Context:

The SparkContext is the application instance created by the Spark driver for each individual Spark program when it is first submitted by the user.

The SparkContext exists throughout the entirety of a Spark application; the driver uses it to connect to the Spark Master and the Spark Workers. It is usually referred to as the variable…
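In code, the context is the handle the driver holds for the rest of the application. A minimal sketch of creating one with Spark's Scala API (the app name and master URL here are placeholders); by convention the variable is named `sc`:

```scala
import org.apache.spark.{SparkConf, SparkContext}

// The driver creates exactly one SparkContext per application.
val conf = new SparkConf()
  .setAppName("MyApp")   // hypothetical application name
  .setMaster("local[*]") // a cluster would use e.g. a spark://host:port URL

val sc = new SparkContext(conf)

// All work (creating RDDs, submitting jobs) flows through `sc`
// for the life of the application...

sc.stop() // releases the connection to the master and workers
```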


Senior AWS Cloud ETL Developer — Distributed data processing — Spark,Mapreduce and Hadoop ecosystems.