Install Spark on Mac (PySpark)

Michael Galarnyk
Jan 3, 2017 · 2 min read

The video above demonstrates one way to install Spark (PySpark) on a Mac. The instructions below walk through the same installation process. If you have any questions, leave a comment here or on YouTube (please subscribe if you can)!

Prerequisites: Anaconda. If you already have Anaconda installed, skip to step 2.

  1. Download and install Anaconda. If you need help, please see this tutorial.
  2. Go to the Apache Spark website (link)

a) Choose a Spark release

b) Choose a package type

c) Choose a download type: (Direct Download)

d) Download Spark

2. Make sure you have Java installed on your machine.
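One quick way to check for Java from the terminal is sketched below. The helper name check_java is made up for this example; it only runs java -version and reports whether that command succeeded.

```shell
# Hypothetical helper (not part of Spark): reports whether a working Java
# runtime is on the PATH by checking the exit status of `java -version`.
check_java() {
  if java -version >/dev/null 2>&1; then
    echo "java found"
  else
    echo "java missing"
  fi
}

check_java
```

If it prints "java missing", install a Java runtime before continuing.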

3. Go to your home directory using the command below.

cd ~

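4. Move the downloaded .tgz file to your home directory if it is not already there, then extract it with the following command. Adjust the file name to match the release you downloaded.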

tar -zxvf spark-1.6.0-bin-hadoop2.6.tgz

5. Use the following command to check that you have a .bash_profile. It is a hidden file, so it only shows up with the -a flag.

ls -a
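If .bash_profile does not show up in the listing, you can create an empty one. Here is a small sketch; ensure_profile is a made-up helper name, and the path defaults to ~/.bash_profile unless you pass a different one.

```shell
# Hypothetical helper: create an empty profile file only if none exists.
# The path defaults to ~/.bash_profile; an existing file is left untouched.
ensure_profile() {
  profile="${1:-$HOME/.bash_profile}"
  if [ -f "$profile" ]; then
    echo "exists"
  else
    touch "$profile"
    echo "created"
  fi
}
```

Running ensure_profile with no arguments checks ~/.bash_profile and prints either "exists" or "created".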

6. Next, we will edit our .bash_profile so we can open a Spark notebook from any directory.

nano .bash_profile

7. Don’t remove anything in your .bash_profile. Only add the following

Notes: The PYSPARK_DRIVER_PYTHON and PYSPARK_DRIVER_PYTHON_OPTS environment variables are used to launch the PySpark shell in Jupyter Notebook. The --master parameter sets the master node address. Here we launch Spark locally on 2 cores for local testing.
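Based on the notes above, the lines added to .bash_profile might look something like the sketch below. The SPARK_PATH variable name and the snotebook alias name are assumptions for illustration, and the folder name assumes the Spark 1.6.0 archive from step 4 was extracted in your home directory.

```shell
# Assumed install location: the folder extracted in step 4. Adjust the name
# if you downloaded a different Spark release.
export SPARK_PATH="$HOME/spark-1.6.0-bin-hadoop2.6"

# Make the pyspark launcher start Jupyter Notebook as the driver front end.
export PYSPARK_DRIVER_PYTHON="jupyter"
export PYSPARK_DRIVER_PYTHON_OPTS="notebook"

# Hypothetical alias name; starts PySpark locally on 2 cores from any directory.
alias snotebook='"$SPARK_PATH"/bin/pyspark --master "local[2]"'
```

With lines like these in place (and after reloading the profile), typing snotebook in any directory should open a notebook backed by a local Spark session.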

8. Type the following into your terminal

source .bash_profile

Please let me know if you have any questions. You can also test your PySpark installation here!
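Before testing anything inside a notebook, it can help to confirm that the Spark folder is where you expect it. The sketch below uses a made-up helper, check_spark_dir, and assumes the folder name from step 4 as its default.

```shell
# Hypothetical helper: checks that a Spark folder contains an executable
# bin/pyspark launcher. The default path is the folder from step 4.
check_spark_dir() {
  spark_dir="${1:-$HOME/spark-1.6.0-bin-hadoop2.6}"
  if [ -x "$spark_dir/bin/pyspark" ]; then
    echo "pyspark launcher found"
  else
    echo "pyspark launcher not found in $spark_dir"
  fi
}
```

If the launcher is found, running bin/spark-submit --version from that same folder should print the Spark version banner.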

Common issues: Unable to load native-hadoop library for your platform… using builtin-java classes where applicable

The native Hadoop library is supported on Linux-type platforms only (it does not work with Cygwin or Mac OS X), so we need a workaround if this warning prevents PySpark from working.

  1. Download the Hadoop binary (link, basically another file) and put it in your home directory

(you can choose a different hadoop version if you like and change the next steps accordingly)

2. Extract the archive in your home directory using the following command.

tar -zxvf hadoop-2.8.0.tar.gz

3. Now add export HADOOP_HOME=~/hadoop-2.8.0 to your .bash_profile. Open a new terminal and try again.
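If you would rather not open an editor for a single line, the sketch below appends it from the terminal. add_profile_line is a made-up helper name; it skips the append when the exact line is already in the file, so running it twice does not create duplicates.

```shell
# Hypothetical helper: append a line to a profile file unless that exact
# line is already present. The path defaults to ~/.bash_profile.
add_profile_line() {
  line="$1"
  profile="${2:-$HOME/.bash_profile}"
  touch "$profile"
  grep -qxF -- "$line" "$profile" || printf '%s\n' "$line" >> "$profile"
}
```

For this step the call would be add_profile_line 'export HADOOP_HOME=~/hadoop-2.8.0', with the version changed to match the Hadoop archive you downloaded.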

Concluding Remarks

Please let me know if you have any questions! I am happy to answer them in the comments section below, on the YouTube video page, or through Twitter.
