Apache Spark 1.6.0 setup on Mac OS X Yosemite

Are you wondering why there is yet another post on installing Apache Spark on Mac OS X?

If so, you probably had no luck following an earlier tutorial :P Or, if this is the first time you are seeing a tutorial like this, congrats, you have come to the right place.

My other reason for writing this is simply that I wanted to publish a post on Medium, because it has been quite a while since my last one.

OK, let's get back to the core topic.

You have probably already Googled for installation steps.

You did this, right?

Of the results, the first link is not really an installation guide. The second one is, but it uses a very old version of Spark. The others are basically lists of steps with no pictures.

We love screenshots, don't we? Because we know the saying “A picture says a thousand words.”

So, first things first.

Downloads & Installations

  • Brew, a package manager for OS X.

Run the command below to install it.

ruby -e "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/master/install)"

  • Java

Many Macs already have Java, so I am skipping this part. If running java -version does not print a version, install a JDK before continuing.

  • Scala

We can install Scala using brew.

brew install scala

At the time of writing this tutorial, this installs Scala 2.11.7.

  • PyCharm. I am a fan of this IDE for writing Python applications. (I will write one more tutorial on how to run Python scripts on the Spark environment directly from the IDE.)

The community editions of the IDEs are free. If you are a student, JetBrains unlocks many more features; you just have to request them using your school email ID.
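
Before moving on, it can help to confirm that the command-line tools from the list above are actually on your PATH. The snippet below is only a minimal sketch and not part of the original setup; check_tool is a made-up helper name used for illustration.

```shell
#!/bin/sh
# check_tool is a hypothetical helper (not part of brew or Spark):
# it reports whether a command can be found on the PATH.
check_tool() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "$1: found"
    else
        echo "$1: missing"
        return 1
    fi
}

# Check each prerequisite from the list above.
# "|| true" keeps the loop going even when one tool is missing.
for tool in brew java scala; do
    check_tool "$tool" || true
done
```

If any line says "missing", revisit the matching bullet above before building Spark.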


Setting up Spark

I always like to keep things organized, so let's move the downloaded Spark folder to /usr/local/spark

tar -xvzf spark-1.6.0.tgz # extracts the contents of the archive
mv spark-1.6.0 /usr/local/spark # moves the extracted folder to /usr/local/spark (prefix with sudo if permission is denied)
cd /usr/local/spark

Now we have to build the Spark source that was downloaded, or else we cannot run our programs or the examples.

This might take a while (actually, longer than you might think).

build/sbt clean assembly # builds Spark

Testing Spark installation

If you already have hands-on experience with Spark, write your own word count program in the shell. Otherwise, Spark provides a few examples to help us get used to the environment.

Shells:

./bin/spark-shell # Scala shell
OR
./bin/pyspark # Python shell

Pi example:

./bin/run-example org.apache.spark.examples.SparkPi
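
The Pi example prints a lot of log lines around its result, so here is a small sketch for showing only the answer. It assumes that the example writes a line starting with "Pi is roughly" to standard output and that Spark's log messages go to standard error; pi_line is a made-up helper name, and the snippet expects to be run from the Spark folder.

```shell
#!/bin/sh
# pi_line is a hypothetical helper: it keeps only the result line
# from the SparkPi output that is piped into it.
pi_line() {
    grep "Pi is roughly"
}

# Run the example only when we are inside the Spark folder;
# otherwise print a hint instead of failing.
if [ -x ./bin/run-example ]; then
    # 2>/dev/null hides the log output (assumed to go to standard error).
    ./bin/run-example org.apache.spark.examples.SparkPi 2>/dev/null | pi_line
else
    echo "run this from the Spark folder (/usr/local/spark in this post)"
fi
```

If nothing is printed, drop the 2>/dev/null part and read the full output to see what went wrong with the build.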

All the best in building some world-class big data apps.
