Install Airflow on Windows without Docker or Virtual Box in 5 mins
Apache Airflow is an open-source platform for orchestrating complex data workflows. It offers a powerful toolset for managing, scheduling, and monitoring workflows in a distributed environment. If you are a data engineer, scientist, or analyst, chances are you have heard of Airflow and its benefits. While installing Airflow on a Linux or macOS machine is relatively straightforward, it can be challenging to set up on Windows. Fortunately, with the Windows Subsystem for Linux (WSL), you can run Airflow on a Windows machine without Docker or VirtualBox. In this blog post, we will guide you through installing Airflow locally on Windows using WSL. Let’s get started!
We will use WSL to install the Ubuntu distribution on our machine. Follow the steps below to install WSL, Ubuntu, and Airflow.
Step 1:- Search for Turn Windows Features On/Off
Click on Search Bar → Search for Turn Windows Features On/Off and open it.
Step 2:- Check the Windows Subsystem for Linux
Make sure Windows Subsystem for Linux is checked.
Step 3:- Installing WSL
Open CMD (Command Prompt/Terminal) → Run
wsl --update
Step 4:- Install Ubuntu Distribution
→ Run
wsl --install -d ubuntu
If you get one of the errors below in this step, or any other installation-related error:
Error 1:-
Please click on the below link to solve this Issue
Error 2:-
Please click on the below link to solve this Issue
https://medium.com/@routr5953/fix-ubuntu-on-wsl-that-failed-to-boot-after-reinstalling-d8041450ab71
Step 5:- Configure Ubuntu
Once the installation finishes, a new prompt opens for the Ubuntu app.
→ Enter a username for your Ubuntu machine
→ Enter a password
You are now logged in.
Step 6:- Accessing Root User
→ Run
sudo su
We have accessed the Root user
Step 7:- Update and Install the packages
→ Run
apt-get update
→ Run
sudo apt install python3-pip
Press ‘Y’ to continue the installation
Next, install virtualenv so that Airflow and its packages live in an isolated virtual environment.
→ Run
pip install virtualenv
Step 8:- Change User from root to your user
→ Run
su "username"
Example :- su rajesh
Step 9:- Create a virtual environment
→ Run
virtualenv airflow_env
You should be able to see one folder with the name ‘airflow_env’
→ ( to display files and folders)
ls
Step 10:- Create a folder called ‘airflow’
This will be used to store our airflow project files
→ Run
mkdir airflow
→ ( To display all files and folders)
ls
Step 11:- Now activate your virtual env
→ Run
source airflow_env/bin/activate
The prompt now starts with (airflow_env), which means our virtual env “airflow_env” is activated.
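If you are not sure whether the activation worked, a quick check like the one below can help. This is only a minimal sketch: it relies on the VIRTUAL_ENV variable that the activate script exports, and the function name in_venv is made up for this example.

```shell
# Hypothetical helper: succeeds only when a virtual environment is active.
# The activate script exports VIRTUAL_ENV with the environment's path.
in_venv() {
  [ -n "${VIRTUAL_ENV:-}" ]
}

if in_venv; then
  echo "Active virtual env: ${VIRTUAL_ENV}"
else
  echo "No virtual env is active; run: source airflow_env/bin/activate"
fi
```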
Step 12:- Installing airflow
Here we pin Airflow to a specific version instead of taking whatever is newest. If you need a different Airflow version, use the command:-
- pip install apache-airflow==<version> (for example, 2.2.3; note there is no space after ==)
→ Run
pip install apache-airflow==2.5.1
(We are installing airflow 2.5.1 )
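The Airflow project recommends installing with a constraints file, so that pip picks dependency versions that were tested with that Airflow release. The sketch below shows one way to do it. The URL pattern follows the official installation docs, while the helper names constraint_url and install_airflow are made up for this example. Keep in mind that a constraints file exists only for the Python versions that the chosen Airflow release supports.

```shell
# Build the constraints-file URL for a given Airflow and Python version.
# $1: Airflow version (e.g. 2.5.1), $2: Python major.minor (e.g. 3.10)
constraint_url() {
  echo "https://raw.githubusercontent.com/apache/airflow/constraints-$1/constraints-$2.txt"
}

# Hypothetical wrapper: install a pinned Airflow inside the active virtual env.
install_airflow() {
  airflow_version="$1"
  # Detect the interpreter's major.minor version, e.g. 3.10
  python_version="$(python -c 'import sys; print("%d.%d" % sys.version_info[:2])')"
  pip install "apache-airflow==${airflow_version}" \
    --constraint "$(constraint_url "${airflow_version}" "${python_version}")"
}
```

With the virtual env activated, running install_airflow 2.5.1 would install the same version as above, but with pinned dependencies.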
Step 13:- Configure Airflow Files
Run the below Commands to configure the Airflow Database and its project files.
a) Set the AIRFLOW_HOME environment variable with a folder name for our Airflow Project.
export AIRFLOW_HOME=~/airflow
Change to “airflow” project directory
cd airflow
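The export above only lasts for the current shell session, so a new Ubuntu terminal will not have AIRFLOW_HOME set. If you want it to persist, something like the sketch below could append the line to your ~/.bashrc exactly once. The function name persist_airflow_home is an assumption for this example.

```shell
# Hypothetical helper: append the export line to a shell startup file,
# but only if that exact line is not already there.
# $1: startup file to edit (defaults to ~/.bashrc)
persist_airflow_home() {
  rc_file="${1:-$HOME/.bashrc}"
  line='export AIRFLOW_HOME=~/airflow'
  touch "$rc_file"
  # -q: quiet, -x: match the whole line, -F: treat the pattern as plain text
  grep -qxF "$line" "$rc_file" || echo "$line" >> "$rc_file"
}
```

Call persist_airflow_home once; new terminals will then pick up the variable automatically.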
b) Initialize Airflow Database
airflow db init # Airflow 2.5.1 (installed above) uses db init; on Airflow 2.7 and later, use: airflow db migrate
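Which command initializes the database depends on the installed version: airflow db migrate was added in Airflow 2.7.0, while older releases such as 2.5.1 use airflow db init. The sketch below picks the command from a version string. The function name airflow_db_command is made up for this example.

```shell
# Hypothetical helper: print the database setup command for an Airflow version.
# $1: version string such as "2.5.1"
airflow_db_command() {
  major="${1%%.*}"          # text before the first dot, e.g. 2
  minor="${1#*.}"           # drop the major part, e.g. 5.1
  minor="${minor%%.*}"      # keep only the minor number, e.g. 5
  if [ "$major" -gt 2 ] || { [ "$major" -eq 2 ] && [ "$minor" -ge 7 ]; }; then
    echo "airflow db migrate"
  else
    echo "airflow db init"
  fi
}
```

For example, airflow_db_command "$(airflow version)" prints the command that matches your installation, assuming airflow version prints only the version number.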
c) Create a User for our Airflow UI with Admin Role
airflow users create --username <username> --firstname <firstname> \
--lastname <lastname> --role Admin --password <password> --email <email>
Ex:-
airflow users create --username Rajesh --firstname Rajesh --lastname Rout \
--role Admin --password Airflow --email routr5953@gmail.com
d) Create dags and plugins folders in the same directory, which will be used to keep our DAG and plugin files.
mkdir dags plugins
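The mkdir above only works as intended when you are already inside the Airflow project directory. As a small sketch, the helper below creates the same two folders under AIRFLOW_HOME no matter where you run it, and does not fail if they already exist. The function name make_airflow_dirs is an assumption for this example.

```shell
# Hypothetical helper: create the dags and plugins folders under a base directory.
# $1: base directory (defaults to $AIRFLOW_HOME, or ~/airflow if that is unset)
make_airflow_dirs() {
  base="${1:-${AIRFLOW_HOME:-$HOME/airflow}}"
  # -p: create parent directories as needed, no error if they already exist
  mkdir -p "$base/dags" "$base/plugins"
}
```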
Step 14:- Run the Airflow Webserver and Scheduler
To start the Airflow webserver and scheduler, run the commands below:
- Airflow Webserver:-
nohup airflow webserver -p 8080 >> airflow_webserver.out &
- Airflow Scheduler:-
nohup airflow scheduler >> airflow_scheduler.out &
We have used the nohup utility, which is a command on Linux systems that keeps processes running even after exiting the shell or terminal. You can remove the nohup command if you don't need it.
Without the nohup command :
- Webserver:-
airflow webserver -p 8080
- Scheduler:-
airflow scheduler
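The webserver can take a little while to start. To check from the terminal that it is answering, you could poll its /health endpoint, as in the sketch below. It assumes curl is installed and that the webserver runs on port 8080 as above; the function name wait_for_webserver is made up for this example.

```shell
# Hypothetical helper: poll a URL until it answers or the attempts run out.
# $1: URL to check (defaults to the webserver health endpoint)
# $2: number of attempts (defaults to 30)
wait_for_webserver() {
  url="${1:-http://localhost:8080/health}"
  attempts="${2:-30}"
  i=1
  while [ "$i" -le "$attempts" ]; do
    # -f: fail on HTTP errors, -s: quiet, -S: still report errors
    if curl -fsS --max-time 5 "$url" >/dev/null 2>&1; then
      echo "Webserver is up: $url"
      return 0
    fi
    i=$((i + 1))
    sleep 2
  done
  echo "Webserver did not answer at $url" >&2
  return 1
}
```

Run wait_for_webserver after starting the webserver. If it reports a failure, have a look at airflow_webserver.out for the error.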
Hurray! Open http://localhost:8080 in your browser and log in with the user created in Step 13. You should now see the Airflow UI with some example DAGs created by Airflow.
I hope this blog helped you install Airflow on your system. If you are still stuck somewhere, email me at routr5953@gmail.com and I will get back to you.
In the next blog, I will integrate Visual Studio with our WSL, which will help you code more efficiently without going through this boring WSL interface.