Setting up Airflow on WSL: Breathe Easy with Data Workflows (No Docker Required!)

Akash Gupta
Plumbers Of Data Science
4 min read · Apr 17, 2023

So, you want to use Airflow on your Windows machine but don’t want to go through the hassle of setting up a separate Linux server? Fear not, my friend! With WSL, you can easily install and configure Airflow in just a few simple steps.

Prerequisites

Before we begin, make sure you have WSL set up with a Linux distribution such as Ubuntu, with Python 3 and pip3 installed inside it.

If you’re unsure whether Python and pip are installed, you can check by running the following commands in your WSL terminal:

python3 --version
pip3 --version

If either of these commands returns an error, you’ll need to install the missing prerequisite before continuing.
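If you would rather get a single summary than read two separate outputs, the check above can be sketched as a small loop. The helper name missing_tools is made up for this example; it relies only on the shell built-in command -v.

```shell
# Hypothetical helper: print the name of every command that is NOT on the PATH.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || echo "$tool"
    done
}

missing="$(missing_tools python3 pip3)"
if [ -z "$missing" ]; then
    echo "All prerequisites found."
else
    # $missing is left unquoted on purpose so the names print on one line.
    echo "Missing prerequisites:" $missing
fi
```

On Ubuntu-based WSL distributions, anything reported as missing can usually be installed with sudo apt install python3 python3-pip.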

Step 1: Get your fingers ready

First things first, make sure you have all your fingers ready to type because we’re about to do some serious command-line action.

Step 2: Update and upgrade

Before we get started, let’s make sure your system is up-to-date. Open up your terminal or WSL shell and run the following command:

sudo apt update && sudo apt upgrade

This will ensure that your system is up-to-date and ready to install Airflow.

Step 3: Create Airflow Home

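Next, we need to create a directory where Airflow can store its files. For example, let’s create a directory called AirflowHome in the /c/Users/username directory. Note that WSL mounts the Windows C: drive at /mnt/c by default, so the /c/... path only exists once the automount change from Step 6 is in effect; until then, use /mnt/c/Users/username instead. You can create this directory by running the following command: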

mkdir /c/Users/username/AirflowHome

Don’t forget to replace username with your actual Windows username.
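A slightly more forgiving variant of the command above, as a sketch: mkdir -p creates any missing parent directories and does not complain if the directory already exists, so it is safe to run more than once. The helper name make_airflow_home is made up for this example.

```shell
# Hypothetical helper: create the Airflow home directory if it is not there yet.
make_airflow_home() {
    # -p creates missing parent directories and succeeds if the directory exists.
    mkdir -p "$1" && echo "Airflow home ready: $1"
}

# Usage (replace username with your actual Windows username):
#   make_airflow_home /c/Users/username/AirflowHome
```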

Step 4: Install Airflow

To install Airflow, simply run the following command in your terminal:

pip3 install apache-airflow

This installs the latest release of Airflow available on PyPI, along with its dependencies. Note that the commands in the rest of this guide assume Airflow 2.x.
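Because an unpinned install picks up whatever release is newest, a pinned install can be more predictable. The Airflow project publishes a constraints file per Airflow version and Python version for this purpose. Here is a sketch that builds the constraints URL and prints the resulting pip command; the helper name constraints_url and the versions 2.5.3 and 3.10 are assumptions chosen for illustration, so swap in your own.

```shell
# Hypothetical helper: build the URL of the constraints file for a given
# Airflow version and Python minor version (for example 2.5.3 and 3.10).
constraints_url() {
    echo "https://raw.githubusercontent.com/apache/airflow/constraints-$1/constraints-$2.txt"
}

# Assumed versions for illustration; adjust both to your setup.
AIRFLOW_VERSION="2.5.3"
PYTHON_VERSION="3.10"

# Print the pinned install command instead of running it, so you can review it first.
echo "pip3 install \"apache-airflow==${AIRFLOW_VERSION}\" --constraint \"$(constraints_url "$AIRFLOW_VERSION" "$PYTHON_VERSION")\""
```

Printing the command rather than executing it is a deliberate choice here: you can check the versions, then copy the line into your terminal.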

Step 5: Set the AIRFLOW_HOME environment variable

Now that we have our AirflowHome directory, let's set an environment variable called AIRFLOW_HOME to point to this directory. To do this, open up your terminal and type:

nano ~/.bashrc

This will open up a text editor where you can define environment variables. Add the following line at the end of the file:

export AIRFLOW_HOME=/c/Users/username/AirflowHome

Don’t forget to replace username with your actual Windows username.

Save the changes by pressing Ctrl+X, then Y to confirm, and finally Enter.
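If you prefer not to edit the file by hand, the same line can be appended from the command line. The sketch below only adds the line when it is not already present, so running it twice does not leave duplicates in ~/.bashrc. The helper name append_once is made up for this example.

```shell
# Hypothetical helper: append a line to a file only if that exact line is absent.
append_once() {
    line="$1"
    file="$2"
    touch "$file"
    # -x matches whole lines, -F treats the pattern as a fixed string.
    grep -qxF -e "$line" "$file" || printf '%s\n' "$line" >> "$file"
}

# Usage (replace username with your actual Windows username):
#   append_once 'export AIRFLOW_HOME=/c/Users/username/AirflowHome' ~/.bashrc
#   source ~/.bashrc   # reload the file in the current shell
```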

Step 6: Make directories great again

By default, WSL mounts your Windows drives under /mnt (for example /mnt/c), which can be a bit annoying. So, let's make things easier by mounting them directly under / instead. Open /etc/wsl.conf:

sudo nano /etc/wsl.conf

Then add the following lines:

[automount] 
root = /
options = "metadata"

Save the changes by pressing Ctrl+X, then Y to confirm, and finally Enter.

Step 7: Close and open

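Close your terminal and open it again so that the new AIRFLOW_HOME variable is loaded. Changes to /etc/wsl.conf only take effect after the WSL instance restarts, so also run wsl --shutdown from Windows PowerShell or Command Prompt before reopening your WSL shell.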
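To confirm where your Windows drive actually ended up after the restart, a quick check like the following may help. It is a sketch that simply looks for the Users folder in both locations; the helper name first_existing_dir is made up for this example.

```shell
# Hypothetical helper: print the first argument that is an existing directory.
first_existing_dir() {
    for dir in "$@"; do
        if [ -d "$dir" ]; then
            echo "$dir"
            return 0
        fi
    done
    return 1
}

# /c is where the drive appears after the automount change; /mnt/c is the WSL default.
if win_root="$(first_existing_dir /c/Users /mnt/c/Users)"; then
    echo "Windows user folders are mounted at: $win_root"
else
    echo "No Windows drive mount found (is this shell running inside WSL?)"
fi
```

If it still reports /mnt/c/Users, the new wsl.conf settings have most likely not been picked up yet.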

Step 8: Install missing packages

If you run into any missing package errors, just use pip3 to install them:

pip3 install [package-name]

Step 9: You made it!

Congrats! You’ve successfully installed Airflow on WSL. To make sure everything is working, run the following command:

airflow info

If you see something like Apache Airflow [2.x.x], you’re good to go!
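If you get "airflow: command not found" instead, one common cause is that pip installed the airflow script into ~/.local/bin (where per-user installs put command-line scripts) and that directory is not on your PATH. The sketch below checks for this; the helper name path_contains is made up for this example.

```shell
# Hypothetical helper: succeed if the given directory is listed in PATH.
path_contains() {
    case ":$PATH:" in
        *":$1:"*) return 0 ;;
        *) return 1 ;;
    esac
}

# pip's per-user installs place command-line scripts in ~/.local/bin.
if path_contains "$HOME/.local/bin"; then
    echo "$HOME/.local/bin is on your PATH."
else
    echo "Consider adding this line to ~/.bashrc:"
    echo 'export PATH="$HOME/.local/bin:$PATH"'
fi
```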

Step 10: Initialize Airflow database

Before we can use Airflow, we need to initialize its database. Run the following command in your terminal:

airflow db init

This will create the necessary tables and structures for Airflow to store and manage its data. On Airflow 2.7 and later, airflow db init is deprecated in favor of airflow db migrate.
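With the database in place, the Airflow 2.x web UI will still ask for a login, so you will most likely want to create a user first. Here is a sketch using the airflow users create command; the helper name create_admin_user and all account details are placeholders. No password is passed on the command line, so Airflow prompts for one.

```shell
# Hypothetical helper: create an Admin account for the web UI.
# The password is not passed here, so Airflow prompts for it interactively.
create_admin_user() {
    if ! command -v airflow >/dev/null 2>&1; then
        echo "airflow not found on PATH" >&2
        return 1
    fi
    airflow users create \
        --username "$1" \
        --firstname "$2" \
        --lastname "$3" \
        --role Admin \
        --email "$4"
}

# Usage (all four values are placeholders):
#   create_admin_user admin Ada Lovelace admin@example.com
```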

Step 11: Start Airflow

Finally, we’re ready to start Airflow. Open up your terminal and run the following command:

airflow webserver

This will start the Airflow web server and make it available at http://localhost:8080. Open up your web browser and navigate to this URL to access the Airflow web interface.

Congratulations, you’ve successfully set up Airflow on your WSL environment! Now, you can use Airflow to manage your data workflows and pipelines.

But wait, there’s more!

Bonus: Scheduler and Command Line Interface (CLI)

Now that you have Airflow up and running, let’s take a quick look at some additional features. You can start the Airflow scheduler by running the following command in a new terminal:

airflow scheduler

The scheduler is responsible for triggering your DAGs and running your tasks according to their schedule. With the scheduler running, you can use the Airflow CLI to interact with your DAGs and tasks. For example, you can list your DAGs by running:

airflow dags list

This will display a list of all the DAGs you’ve defined in your dags_folder.

Step 12: Get creative

Now that you’ve set up Airflow, it’s time to get creative! Use Airflow’s Python API or web UI to create and manage your workflows and tasks. And don’t forget to have fun with it!
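To have something to look at right away, here is a sketch that drops a one-task example DAG into a dags folder. Everything about the DAG (the file name hello_wsl.py, the dag_id, the task) is invented for illustration, and the schedule=None argument assumes Airflow 2.4 or later. The helper name write_example_dag is also made up.

```shell
# Hypothetical helper: write a one-task example DAG into the given dags folder.
write_example_dag() {
    mkdir -p "$1"
    # The quoted 'EOF' keeps the shell from expanding anything in the Python code.
    cat > "$1/hello_wsl.py" <<'EOF'
from datetime import datetime

from airflow import DAG
from airflow.operators.bash import BashOperator

# schedule=None means the DAG only runs when triggered manually (Airflow 2.4+).
with DAG(
    dag_id="hello_wsl",
    start_date=datetime(2023, 1, 1),
    schedule=None,
    catchup=False,
) as dag:
    BashOperator(task_id="say_hello", bash_command="echo 'Hello from WSL'")
EOF
    echo "Wrote $1/hello_wsl.py"
}

# Usage (the default dags folder is $AIRFLOW_HOME/dags):
#   write_example_dag "$AIRFLOW_HOME/dags"
```

Once the scheduler has picked up the file, hello_wsl should show up in the web UI, where you can unpause and trigger it.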

Conclusion

Setting up Airflow on WSL doesn’t have to be a tedious task. With just a few simple commands, you can easily install and configure Airflow on your Windows machine. So, grab a coffee (or your favorite beverage), get your fingers ready, and start creating some awesome workflows with Airflow!

Akash Gupta
Plumbers Of Data Science

Data Engineering with a Sense of Humor: O bug, come back tomorrow!