Run the Parch & Posey DB Locally in Postgres

gauravinthevalley
8 min read · Jan 31, 2020

Currently, I’m enrolled in the Bertelsmann Tech Scholarship Challenge Course — Data Track Nanodegree Program at Udacity. I’ve detailed my thoughts about this course so far here. One of the topics taught in this program is SQL: they start you out with basic SQL and have you writing complex queries in no time. If you already know some SQL (but haven’t written any of late), this course is a good refresher; if you don’t, it’s a great place to start learning SQL, among other things like Statistics and Python.

This course primarily uses the database (DB) of a fictional company called Parch & Posey to teach SQL, which also means you will be using this DB to work through most of the quizzes. All the quizzes are interactive: they provide you with a workspace in which you can run your queries and view all the tables and their columns in the DB. However, if, like me, you also wish to run the Parch & Posey DB locally on your machine, I’ve outlined the process I went through to do so below.

Download and Install Postgres

First off, go here to download Postgres. Make sure you download the pre-built binaries for your operating system. For Mac users, I recommend using Homebrew to download and install Postgres. For Linux users, the package manager bundled with your distribution can likely install Postgres without much fuss; for Ubuntu 18.04, for instance, this is a pretty good guide.
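
For reference, installation on the two platforms mentioned might look something like this (a sketch; exact package names and service commands vary by Postgres version and distro):

```shell
# macOS, via Homebrew
brew install postgresql
brew services start postgresql

# Ubuntu 18.04, via apt (as in the guide linked above)
sudo apt update
sudo apt install postgresql postgresql-contrib
```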

Play with Postgres

Again, I’d like to point you to the excellent DigitalOcean tutorial mentioned above. Log in to your local Postgres installation via the terminal and create a user (this is an important step, as you will need it later), a DB, etc., all without leaving the comfort of your command line 😃 Play with Postgres for a bit before downloading and installing a GUI tool to access it.
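
If you’d rather not switch back to the tutorial, the user and DB creation steps boil down to SQL like the following, run inside psql. The role name, password, and DB name here are placeholders of my own, so substitute your own values:

```sql
-- Launch psql first, e.g. `sudo -u postgres psql` on Linux.
CREATE ROLE myuser WITH LOGIN PASSWORD 'change_me';  -- placeholder credentials
ALTER ROLE myuser CREATEDB;
CREATE DATABASE mydb OWNER myuser;                   -- placeholder DB name
```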

Postgres GUI

This part is purely optional, of course. If you prefer your command line to anything else, there’s obviously no need to follow these steps. Otherwise, I’d recommend the fantastic pgAdmin. They bill themselves thus:

“pgAdmin is the most popular and feature rich Open Source administration and development platform for PostgreSQL, the most advanced Open Source database in the world.”

Go here to download pgAdmin for your operating system.

Run pgAdmin

Once you install pgAdmin and run it, this is what you should get:

Starting pgAdmin4 server on a Mac
pgAdmin — Starting Up

After a few seconds, pgAdmin ought to load up in your default browser like so:

pgAdmin — Enter Password

Here, set the master password for pgAdmin and remember it (or save it in the browser) for later.

pgAdmin Interface

Once you’ve entered the master password and logged in to the pgAdmin interface, you should see a screen like so:

pgAdmin — Interface

Now is when you connect to your locally installed Postgres DB server.

Connect to Localhost

To connect to the Postgres DB server installed on your localhost, you need to create a server, like so:

pgAdmin — Create Server

After that, you get a pop-up/modal/dialog like the one below:

pgAdmin Create Server pop-up
pgAdmin — Create Server General

In the “Name” field, you can enter any name you desire; I put “localhost”. Ignore the error displayed toward the bottom of the pop-up for now. Go to the “Connection” tab and fill in the “Host”, “Username”, and “Password” fields like I have (your “Username” will, of course, be different):

pgAdmin Create Server Connection
pgAdmin — Create Server Connection

“Host” has to be “localhost” or the loopback address “127.0.0.1”. Once you enter one of these values, the error from before will disappear. If you followed the DigitalOcean tutorial mentioned above, enter the password you created for the superuser.

If you created another user (who is not a superuser), that should be fine as well. Once you hit “Save”, you will be presented with all the DB functionality that the user you entered has access to on your localhost.

Time to go grab a cup of coffee 🍮 What say?

Create the Parch & Posey DB

Still with me? Great! Let’s move on along.

If you’re connected to your localhost, proceed to create your database by right-clicking on Databases:

pgAdmin Create Database
pgAdmin — Right-Click to Create Database

Okay, you’re now presented with the following pop-up:

pgAdmin — Create Database
pgAdmin — Create Database

Hit the “Save” button, and you now have the mighty parchandposey DB created on your localhost. Albeit a bit empty. Let’s create the five tables and import all the data, shall we?
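
Incidentally, if you’d rather skip the dialog, the same thing can be done with a single SQL statement from psql:

```sql
CREATE DATABASE parchandposey;
```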

Create Region Table

There may be various ways of creating the Region table. I decided to do it the old-school way, using the SQL CREATE TABLE statement. First though, let’s go to the Query Tool:

pgadmin Query Tool
pgAdmin — Right-Click on parchandposey

As you can see in the image above, right-click on parchandposey and click on “Query Tool” to bring it up, then enter the SQL to create the Region table like so:

pgAdmin create Region table
pgAdmin — Create Region Table
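
Since the statement above only appears as a screenshot, here it is as text. The column types are my reconstruction, inferred from the other CREATE TABLE statements further below (sales_reps references public.region (id), and the name columns elsewhere are varchar(255)):

```sql
CREATE TABLE public.region
(
    id integer NOT NULL,
    name character varying(255) COLLATE pg_catalog."default" NOT NULL,
    CONSTRAINT region_pkey PRIMARY KEY (id)
)
```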

Press Play! Or hit the F5 key on your keyboard to execute the query you just wrote:

pgAdmin — “Play”/Execute Button Toward the Right

See your Region Table in pgAdmin like so:

pgAdmin View Tables 1
pgAdmin — Schemas -> Tables -> Region

Currently, the region table is empty. Let’s fill it up with data.

You can download the data from the course itself: from your workspace, if you SELECT * FROM region, you can download the CSV.

Download CSV from workspace
Download CSV from Workspace

Or you can go to this GitHub repo and download all the CSVs. Your call.

Okay, once you’ve downloaded the CSVs, navigate in the terminal to the directory where you downloaded them. I’m sure this can be done via the pgAdmin interface as well, but I couldn’t get it to work; I got an error during table-data import (my Mac said the CSV file does not exist for some reason). If you know a way, please let me know in the comments below.

Once in the terminal and inside the psql CLI, connect to your DB with \c parchandposey. Verify that the Region table is there with \d.

Okay, now you can copy the data from the CSV into your Region table with \copy region FROM 'region.csv' DELIMITER ',' CSV HEADER;

I’m assuming that you named your CSV file “region.csv” here.

Check that you have the data with SELECT * FROM region;

psql — Steps to Import Region Data

Aaand, voila! There you have it: 4 records is all the Region table holds.

Above, I named my DB parchandposey1 because I already had a parchandposey on my local machine.

Go back to pgAdmin to check the Region data:

pgAdmin Run a query
pgAdmin — Check Region Data

To run the query above, you know the deal: right-click on your DB name, go to “Query Tool”, enter your query, and press Play. You’ll see the same 4 records.

Create SalesReps Table

Similar to the steps above, you can now create your SalesReps Table. Here’s the CREATE TABLE statement for it:

CREATE TABLE public.sales_reps
(
    id integer NOT NULL,
    name character varying(255) COLLATE pg_catalog."default" NOT NULL,
    region_id integer NOT NULL,
    CONSTRAINT sales_reps_pkey PRIMARY KEY (id),
    CONSTRAINT region_id FOREIGN KEY (region_id)
        REFERENCES public.region (id) MATCH SIMPLE
        ON UPDATE NO ACTION
        ON DELETE NO ACTION
)

You can import the data in the exact same way that you imported Region Table Data above. After import, you can verify that the SalesReps Table has 50 records.
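
For example, assuming you named the CSV “sales_reps.csv”, the import and the record-count check from psql would be:

```sql
\copy sales_reps FROM 'sales_reps.csv' DELIMITER ',' CSV HEADER;
SELECT COUNT(*) FROM sales_reps;  -- expect 50
```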

Create Accounts Table

Here’s the CREATE TABLE statement for Accounts Table:

CREATE TABLE public.accounts
(
    id integer NOT NULL,
    name character varying(255) COLLATE pg_catalog."default" NOT NULL,
    website character varying(255) COLLATE pg_catalog."default" NOT NULL,
    lat character varying(100) COLLATE pg_catalog."default",
    "long" character varying(100) COLLATE pg_catalog."default",
    primary_poc character varying(255) COLLATE pg_catalog."default",
    sales_rep_id integer NOT NULL,
    CONSTRAINT accounts_pkey PRIMARY KEY (id),
    CONSTRAINT sales_rep_id FOREIGN KEY (sales_rep_id)
        REFERENCES public.sales_reps (id) MATCH SIMPLE
        ON UPDATE NO ACTION
        ON DELETE NO ACTION
)

You can import the data in the exact same way that you imported Region Table Data above. After import, you can verify that the Accounts Table has 351 records.

Create Orders Table

Here’s the CREATE TABLE statement for Orders Table:

CREATE TABLE public.orders
(
    id integer NOT NULL,
    account_id integer NOT NULL,
    occurred_at timestamp with time zone,
    standard_qty integer,
    gloss_qty integer,
    poster_qty integer,
    total integer,
    standard_amt_usd numeric,
    gloss_amt_usd numeric,
    poster_amt_usd numeric,
    total_amt_usd numeric,
    CONSTRAINT orders_pkey PRIMARY KEY (id),
    CONSTRAINT account_id FOREIGN KEY (account_id)
        REFERENCES public.accounts (id) MATCH SIMPLE
        ON UPDATE NO ACTION
        ON DELETE NO ACTION
)

You can import the data in the exact same way that you imported Region Table Data above. After import, you can verify that the Orders Table has 6912 records.

Create WebEvents Table

Here’s the CREATE TABLE statement for the WebEvents Table:

CREATE TABLE public.web_events
(
    id integer NOT NULL,
    account_id integer NOT NULL,
    occurred_at timestamp without time zone,
    channel character varying(255) COLLATE pg_catalog."default",
    CONSTRAINT web_events_pkey PRIMARY KEY (id),
    CONSTRAINT account_id FOREIGN KEY (account_id)
        REFERENCES public.accounts (id) MATCH SIMPLE
        ON UPDATE NO ACTION
        ON DELETE NO ACTION
)

You can import the data in the exact same way that you imported Region Table Data above. After import, you can verify that the WebEvents Table has 9073 records.
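
Once everything is imported, a single query can sanity-check all five tables at once against the record counts above (4, 50, 351, 6912, and 9073):

```sql
SELECT 'region' AS table_name, COUNT(*) FROM region
UNION ALL SELECT 'sales_reps', COUNT(*) FROM sales_reps
UNION ALL SELECT 'accounts', COUNT(*) FROM accounts
UNION ALL SELECT 'orders', COUNT(*) FROM orders
UNION ALL SELECT 'web_events', COUNT(*) FROM web_events;
```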

And that’s pretty much it.

Happy querying!
