Installing Apache Spark (PySpark): The missing “quick start” guide for Windows

Lauren Oldja
4 min read · Jan 28, 2018

So you saw the latest Stack Overflow chart of popularity of new languages, and — deciding maybe there’s something to this “big data” trend after all — you feel it’s time to get familiar with Apache Spark.

Sure, you could get up and running with a few keystrokes on UNIX/MacOS, but what if all you have at home is an old Windows laptop? I tried following the installation instructions from the O’Reilly book Learning Spark (which, like many wonderful tech reference materials, may be available for free from your local library), but the chapter is a bit sparse on details for Windows users and just didn’t work “out of the box” for me. Instead, the following is based on the official Quick Start Guide, trial and error, and lots of Googling.

This guide assumes the following:

You’re on a Windows 8.1 Pro system. Other Windows versions of a similar age will probably behave much the same. Windows 10 users might want to check out the Windows Subsystem for Linux instead (let me know in the comments how that goes!).
