Apache Spark 1.6.0 setup on Mac OS X Yosemite
Are you wondering why there is yet another post on installing Apache Spark on Mac OS X?
If so, chances are the previous tutorials you followed did not work out :P And if this is the first tutorial of its kind you are seeing, congrats, you have come to the right place.
Honestly, part of my motivation for writing this is simply that it has been quite a while since I published a post on Medium.
OK, let's get back to the core topic.
You have probably already Googled for installation steps.
The first link is not really an installation guide; the second is one, but it uses a very old version of Spark; and the rest are bare lists of steps with no screenshots.
We all love screenshots, don't we? Because we know the saying: "A picture is worth a thousand words."
So, first things first.
Downloads & Installations
- Homebrew (brew), a package manager for OS X.
Run the command below to install it.
ruby -e "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/master/install)"
Macs typically come with Java already installed, so I am skipping that part.
Using brew, we can install Scala.
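If you want to double-check, you can print the installed Java version; the exact version string will vary from machine to machine.

```shell
java -version   # prints the installed Java version, e.g. 1.7 or 1.8
```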
brew install scala
At the time of writing this tutorial, the Scala version installed by brew is 2.11.7.
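You can verify the install with the command below; the version you see may differ from mine.

```shell
scala -version   # e.g. "Scala code runner version 2.11.7 ..."
```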
- PyCharm, I am a fan of this IDE for writing Python applications. (I will write one more tutorial on how to directly run the Python scripts from IDE on the Spark environment.)
- IntelliJ IDEA, if you want to use the Scala/Java API of Spark.
The community editions of both IDEs are free. If you are a student, JetBrains unlocks many more features; you just have to request a license using your school email ID.
Setting up Spark
I always like to keep things organized, so let's extract the download and move the Spark folder to /usr/local/spark.
tar -xvzf spark-1.6.0.tgz # to extract the contents of the archive
mv spark-1.6.0 /usr/local/spark # move the extracted folder from Downloads to /usr/local/spark
Now we have to build Spark from the downloaded source; otherwise we cannot run any programs or examples.
This might take a while (actually, longer than you might expect).
build/sbt clean assembly # builds the spark environment
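Once the build finishes, a quick sanity check is to run one of the examples bundled with Spark; SparkPi estimates the value of pi, and its final output line should look roughly like the comment below.

```shell
./bin/run-example SparkPi 10   # should end with a line like "Pi is roughly 3.14..."
```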
Testing Spark installation
If you already have hands-on experience with Spark, write your own word count program in the shell; otherwise, Spark ships with a few examples to help you get used to the environment.
./bin/spark-shell # Scala shell
./bin/pyspark # Python shell
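If you want to try the word count but are new to the API, the core logic is just a flatMap over lines followed by a per-word reduction. Here is a minimal sketch of that logic in plain Python (no Spark needed, and the sample text is my own), so you can see what the pipeline computes before writing it in the shell.

```python
from collections import Counter

def word_count(lines):
    """Count word occurrences, mirroring Spark's flatMap + reduceByKey pattern."""
    # flatMap step: split every line into individual words
    words = [word for line in lines for word in line.split()]
    # reduceByKey step: sum up the occurrences of each word
    return Counter(words)

sample = ["to be or not to be", "that is the question"]
counts = word_count(sample)
print(counts["to"], counts["be"], counts["question"])  # 2 2 1
```

In pyspark, the equivalent would read a file with sc.textFile, then chain flatMap, map to (word, 1) pairs, and reduceByKey to sum the counts.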
All the best with building some world-class Big Data apps!