Set up a local Spark cluster step by step in 10 minutes
Set up a local Spark cluster with one master node and one worker node on Ubuntu, from scratch and for free.
This is an action list for installing an open-source Spark master (or driver) and worker on local Ubuntu machines, completely free (in contrast to paid services such as Databricks).
The following setup runs on a home intranet: one physical Linux (Ubuntu) machine (a Jetson Nano) and one WSL2 (Ubuntu) instance inside Windows 10.
Step 1. Prepare environment
Make sure you have Java installed
sudo apt install openjdk-8-jdk
Check that Java installed correctly
java -version
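Spark also needs to locate the JDK at runtime. As a sketch (not part of the original steps), JAVA_HOME can be derived from the resolved path of the `java` binary, which varies by platform, e.g. arm64 on the Jetson Nano vs amd64 under WSL2:

```shell
# Derive JAVA_HOME from the real path of the `java` binary.
# On Ubuntu with openjdk-8 this typically resolves to something like
# /usr/lib/jvm/java-8-openjdk-amd64 (or -arm64 on the Jetson Nano).
JAVA_BIN=$(readlink -f "$(command -v java)")
export JAVA_HOME=${JAVA_BIN%/bin/java}
echo "JAVA_HOME=$JAVA_HOME"
```

Add the `export` line to `~/.bashrc` if you want it to persist across shells.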
If you are going to use PySpark, install Python as well
sudo apt install python3
Check that Python installed correctly
python3 --version
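As a small sketch (my own addition, not from the original), the check above can be wrapped in a guard that fails loudly when `python3` is missing, which is handy when scripting the setup for both machines:

```shell
# Fail early if python3 is missing; otherwise report the detected version.
if command -v python3 >/dev/null 2>&1; then
    echo "python3 found: $(python3 --version 2>&1)"
else
    echo "python3 not found; run: sudo apt install python3" >&2
    exit 1
fi
```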
Step 2. Download and install Spark in the Driver machine
From the Spark download page, select your version; I chose the newest. Run the download in any directory:
curl -O…
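The download command above is truncated, so here is a sketch of the general shape. The version number, Hadoop profile, and mirror URL below are assumptions for illustration; substitute whatever the download page actually gives you:

```shell
# Build the tarball name and URL from the version shown on the download page.
# The values below are placeholders -- use the ones the page gives you.
SPARK_VERSION=3.5.1
HADOOP_PROFILE=hadoop3
TARBALL="spark-${SPARK_VERSION}-bin-${HADOOP_PROFILE}.tgz"
URL="https://dlcdn.apache.org/spark/spark-${SPARK_VERSION}/${TARBALL}"
echo "Would download: ${URL}"

# Then fetch and unpack (uncomment to actually run; the archive is large):
# curl -O "$URL"
# tar -xzf "$TARBALL"
```

Unpacking creates a `spark-<version>-bin-<profile>/` directory; the same archive is used on both the master and the worker machine.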