From SQLite to PostgreSQL: A Lighthearted Guide to Upgrading Your Airflow Database
Welcome, friends, to the wild and wacky world of Airflow and Postgres! Today, we’re going to take a journey together and learn how to set up Postgres as the backend database for Airflow.
But first, let’s talk about why you might want to switch from SQLite to Postgres. Maybe you’re tired of SQLite’s limitations and want something more robust. Maybe you just want to be cool like all the other kids. Whatever your reason, we’re here to help.
So, without further ado, let’s get started!
Step 1:
Take a deep breath and believe in yourself. You can do this. Are you ready?
Step 2:
Open your Airflow configuration file and locate the executor
setting.
executor = LocalExecutor
Don’t forget to save your changes!
Step 3:
If you haven’t already, create a new database and user in PostgresDB for Airflow. Run the following commands in your PostgresDB query editor:
CREATE DATABASE airflow_db;
CREATE USER airflow_user WITH PASSWORD 'airflow_pass';
GRANT ALL PRIVILEGES ON DATABASE airflow_db TO airflow_user;
Step 4:
Update the sql_alchemy_conn
setting in your airflow.cfg file to point to your PostgresDB database:
sql_alchemy_conn = sql_alchemy_conn = postgresql+psycopg2://airflow_user:airflow_pass@localhost:5432/airflow_db
Note: If you created a new Postgres account for Airflow, the default search_path for the new user is: “$user”, public. No change is needed. However, if you use a current Postgres user with a custom search_path, search_path can be changed by the command:
ALTER USER airflow_user SET search_path = public;
Step 5:
If needed, update your PostgresDB pg_hba.conf to add the airflow user to the database access control list. Then, reload the database configuration to load your changes.
- Open the pg_hba.conf file for editing. The location of the file may vary depending on your operating system and Postgres installation, but it is typically found in the
data
directory of your Postgres installation. - Locate the section of the file that controls access for local users. This section may be labeled “Local connections” or “IPv4 local connections”, depending on the version of Postgres you are using.
- Add a new line to the section that allows access to the
airflow_user
account. The format of the line will depend on the version of Postgres you are using, but it should look something like this:
host all airflow_user 127.0.0.1/32 md5
- Save the pg_hba.conf file and exit the editor.
- Restart the Postgres server to reload the configuration. The method for restarting the server will depend on your operating system and Postgres installation, but it is typically done using a command like
sudo service postgresql restart
.
sudo service postgresql restart
Step 6:
Now it’s time to initialize your Airflow database. This step may take a minute, so sit back and relax. Run the following command:
airflow init db
Step 7:
Create a new admin user to access your new database. Let’s personalize this step by creating a user with your own name! Use the following command to create a user with your information:
airflow users create --username admin --firstname FirstName --lastname LastName --role Admin --email dummy@xyz.com --password admin
Step 8:
Are you ready to see your new database in action? Start the Airflow webserver with this command:
airflow webserver
Step 9:
Start the Airflow scheduler. This is where the magic happens! Run the following command:
airflow scheduler
Step 10:
Open your web browser and go to http://localhost:8080/. This is the Airflow UI, where you can see all of your tasks and workflows. Log in using the credentials you just created, and voila! Your Airflow database is up and running on PostgresDB.
And there you have it! You’ve successfully migrated your Airflow database to PostgresDB and can now easily manage your tasks and workflows with the Airflow UI.