The following tutorial will walk you through the following tasks:
- Download PostgreSQL and pgAdmin 4 to your own PC or environment.
- Connect to PostgreSQL server
- Download and load the Pagila database set (based on the famous Sakila training database).
- Provide resources and a link to a Github repository detailing these steps. The data and the question sets will be included.
As I quickly worked through many different methods for teaching myself SQL I began to notice that very few of them truly embody the real-world feel of using the database querying language in the wild. I also felt a longing to understand the deeper relationships between SQL and whichever database management system (DBMS) it was being used in. This lead me to the thought that certainly there is a dataset out there somewhere that will allow me to create a real-world environment to practice in. To my joy, there was. Sakila was created to train people in MySQL and shortly afterward Pagila was created to train new users in PostgreSQL. I will go over later where I found and how I used the Pagila data. You can also find a link to it under the movie film image below.
The Pagila Database is a simulated set of tables that would mimic the operations and data of a chain of video stores. Think of Blockbuster Video circa 1990’s. At first glance the use case of a video store may not seem very relevant in a post-Blockbuster world yet I assure you that this dataset still does relatively well mimicking many retail environments or environments where there is revenue, customers, and products.
1. Let’s Get Started: Downloading PostgreSQL
Download PostgreSQL . You can do this a number of ways (all of which are featured in the link). I chose to use the interactive installer by EDB but there may be some options that are easier and in fact better. If I took the time to do this all over again I would likely choose the postgres.app instead. The postgres.app install not only allows for a simple download and install but may even have a more intuitive GUI than pgAdmin 4. nonetheless, pgAdmin works just fine and if you’re into command line usage I believe all of these implementations also allow SQL on CLI.
- I’m using Mac so I downloaded the installer from EDB. Installer will save an image to your computer that will open up a folder.
- Click on the folder and an installer will pop up. Follow the prompts.
During the installation a prompt will come up asking you to select your locale. This is where you set up cultural preferences including language. It is important to select the right locale. If you’re in the U.S. for instance you will select en_US.UTF-8.
- The installer will then show you a summary of your install configurations.
- Click Next to install.
2. Setting up Postgre SQL Server
There is a really nice tutorial walking you through Postgre SQL Server setup located here: Connect to a PostgreSQL Server Database. OR you can follow my quick and less thorough instructions below.
Two Ways to Connect on Mac (App-based or Terminal-based)
Terminal Based Server Connection (Using psql)
One way to connect, especially if you love using command line, is to use the psql SQL shell.
- For terminal based you will need to open up the psql app that should have been downloaded as part of the PostgreSQL package in the previous steps. This will open up a SQL Shell. On mac I just type “psql” in the search bar or you can go to the application folder and find it there.
- Then you will enter Server name, Database name, Port, your username and your password. You can also press the Enter key and default values will be filled in. The default values are displayed in brackets next to each prompt.
- Lastly, you can interact directly with the PostgreSQL Database Server by writing a SQL statement. The below statement shows all of your databases in PostgreSQL. There is usually just a template database in there to start.
Application Based Server Connection (Using pgAdmin 4)
You can also use pgAdmin 4 web-app. This allows you to connect and interact with PostgreSQL database server via a user-friendly GUI.
- Launch the pgAdmin application. Again, I just used the search bar in my mac but you can also find it via command line and in the applications folder.
- One tutorial I had seen said that pgAdmin will open in a web-browser but when I opened it pgAdmin actually opened in its own app on the Mac dock.
3. Right click on “Servers.” and select Create > Server
4. Enter server name (whatever you want it to be)
5. Click the connection tab and enter host and password. This is the username and password you created in the PostgreSQL installation process.
6. Expand the servers cluster and then the databases node and you will see a default database called postgres already installed. Ignore the Pagila database in the image below. I had already downloaded it so it shows an unconnected database there.
Once you’ve navigated to the desired database you can click on it and then click on the query tool. You can do this two ways. Clicking on the “Tools” drop down and then the “Query Tool” option or by clicking on the icon that looks like databases with a “play” logo.
7. Enter a query into the editor and click the execute button
3. Download and Load The Pagila Database
There are two good githubs that I found which house the Pagila Database. I used the Pagila Database by Devrim Gündüz. It was the first one I found that had the database and clear instructions on what it was and how it works. If you are into using containerized environments he also has a tutorial for implementing the database in Docker. In this process I am assuming you already have a github account and know how to use git. If you do not then you can create a github account here and familiarize yourself with github tutorials.
copy the https or ssh link from the repository to clone it
You can see in the image that the repository address in the top left of the screen is now my repository. From your forked repository you will clone the repository down to your PC. Click on code and copy the https or SSH url.
Start a new command line terminal. You can do this by typing terminal in the search bar on Mac or In the Finder you can open the /Applications/Utilities folder, then double-click Terminal.
Once terminal is open you can navigate over to the folder/directory where you would like to store the Pagila directory. Learn how to navigate through command line here: Navigate Through Command Line. You will then copy the link into the command line within the desired directory and a new directory will be created that houses all of the pagila database files that were housed in github.
At this point you will now have a Pagila directory that houses the Pagila databases on your own computer. We will not pull the Pagila files into PostgreSQL server using pgAdmin 4. Before doing this I strongly recommend reading the README.md file in Devrim Gündüz’s Pagila github repository.
Create a New Database in pgAdmin 4
At this point a new database should show up under the Databases heading in the left broswer. Click on the new database (in my case it is test_pagila), then click on the query tool, go into the new query editor and find the file icon in the top left of the editor and then click on that to search for your Pagila SQL files.
Now your database has been created. You can run queries and use this for whatever purposes you now desire. Happy querying!
4. Resources for Using Pagila
There are many ways to train with The Pagila Database. You can spin up your own queries and use it any way you would use a real-world database or you can use question sets that quiz you on your abilities to query using realistic data and data-structures. This repository from Robert Treat is extremely helpful. It contains Pagila setup info and files as well as many links to Pagila tutorials to various tasks such as building applications from Pagila data, using REST API, performing email validation, creating reports and building rails to legacy apps.
I have also scoured the internet to find all kinds of questions (with solutions in most cases) pertaining to Pagila and Sakila and compiled them into a group of Jupyter Notebooks. You can use these questions to test your abilities as well. Find them here:
I also created an Entity Relationship Diagram (ERD) using Pagila. You can find my tutorial for this here: Creating a SQL Entity Relationship Diagram (ERD) and the actual ERD image for your reference below:
Also, I garnered much of my information to perform this task from these sites:
- Devrim Gündüz Github Devrim is a major contributor to PostgreSQL. This is the database file I used in this tutorial.
- The Pagila Project Page on Github I found this one after I already started creating this tutorial but this is possibly the creator of Pagila. I’m not sure. But the creator of this project seems to be a major open-source contributor to a variety of projects.
- Connect to a PostgreSQL Database This is the resource I used to connect to the PostgreSQL database. This site Postgresqltutorial.com is also a helpful resource for other PostgreSQL information.
- PostgresSQL Official Site This is where I downloaded PostgresSQL.
- Getting Started with PostgreSQL using Amazon RDS, CloudFormation, pgAdmin, and Python This is a resource I didn’t use but something I want to use at some point. This tutorial combines PostgreSQL with Amazon AWS services to create a dual purpose learning experience.
I hope you enjoyed this tutorial and it helped you get started with Postgres and Pagila. If you have any questions or spot anything that needs clarification please message me on Medium. You can also find much of this information on Github: SQL_Training_with_Pagila by James Opacich