Getting Started With SQL By Answering Business Related Questions Using an SQLite Database and GUI

Taylor Bickell
The Startup
Published in
11 min readSep 2, 2020

--

Solving problems is one of the best ways to learn and reinforce new concepts. In this case, we’ll be applying the most common clauses of the Structured Query Language (SQL), using a relational database (i.e. SQLite), and a graphical user interface (GUI) such as TablePlus to solve business-related questions.

Before we get started, there are a couple of housekeeping items we need to address.

  1. Download the sample SQLite database.
  2. Install TablePlus or another compatible GUI.

Download a sample SQLite database

The database we’ll be using is a sample from the SQLite tutorial website.

This is a visual schema of the sample database.

By taking a closer look at the schema, a blueprint of the construction of a database, we are able to gain a visual sense of what a database contains. SQLite is an example of a relational database, and we can see relations by observing the connections formed between the keys of different tables. I’ll explain this in a little more detail later on, however for now, the most important thing is to grasp some visual understanding of what the database is comprised of (i.e. Tables that hold varying columns of different data types and relationships between tables.).

The download of the sample database I’ll be using can be found here. About halfway through, you should find a link for downloading the sample SQLite database. Click it when you’ve found it.

After downloading the database, you’ll notice that it’s in a ZIP format. If you don’t already have a preferred software for unzipping a file, 7-zip is a free service recommended by the SQLite Tutorial website.

Once you’ve unzipped the file, it’s helpful to move it to a place that’s easy to access. For example, I chose to move the unzipped file to my desktop.

Install TablePlus or another compatible GUI

From here we need the assistance of a program that will help us open and connect to the SQLite database. I’ve chosen to use TablePlus, a modern GUI tool for relational databases. TablePlus is available for download on both Windows and Mac devices, and offers a free tier with plenty of functionality for our purposes. After downloading and opening TablePlus on your device, a window with the below information should appear.

(Note: There may be some slight design differences between the Mac and Windows TablePlus applications. I happen to be using a Windows device.)

If you’re using TablePlus for the first time, your window should look just like the above. You’ll see that there aren’t any preexisting connections which is to be expected. That’s all about to change though. 😃

We now have our two primary housekeeping items complete.

✔ Downloaded SQLite database.

✔ Installed compatible GUI (i.e. TablePlus).

Connect the SQLite database using TablePlus

Given that the database is SQLite, we’ll want to select the SQLite option when creating a new connection within TablePlus.

Select SQLite when creating a new connection in TablePlus.

It’s now time to pick a name and choose the location your database file is stored in. A password is optional, however I do recommend testing the connection by clicking “Test”. After verifying that the connection is okay, select “Save”.

Once you’ve selected the type of database you’re using, it’s time to name and create a connection to the database.

Looking at the window below, we see that the database has been successfully saved.

Double click the newly created database to connect to it. The below window should appear within TablePlus.

On the left hand side, all of the tables within the database are displayed. You may recognize some of these from the visual schema.

Relational Tables and Common SQL Clauses

Before introducing questions, I want to circle back on something I briefly mentioned earlier. We know that the database is comprised of tables and that each table contains varying types of data. For example, take a closer look at the structure of the “artists” table. What columns and data types do you see? It has two columns, ArtistId and Name, both of which have two different data types. ArtistId is an integer data type while Name is a string of up to 120 various characters.

This is the structure of the “artists” table.

Earlier, I also mentioned that the tables within the database have relationships that can be seen throughout the visual schema. To illustrate, I’ve zoomed in on a relationship between the “artists” and “albums” tables.

The relationship between the “artists” and “albums” tables is an example of a one-to-many relationship, as each artist can produce 0, 1 or more albums.

The “artists” table contains what is called a primary key (That’s what the gold key next to the column name signifies.). A primary key is used to uniquely identify each row in a table and can be thought of as the street address of each observation or row. As we look more closely at the relationship between these two tables, we see that the primary key in the “artists” table, ArtistId, is related to ArtistId in the “albums” table. ArtistId in the “albums” table represents what we call a foreign key which simply means that it is a primary key in another table. In this case, it’s the primary key in the “artists” table. Table structure, data types and relationships are all things to keep in mind when thinking about how to use the contents of a database to approach solving questions.

One last thing to note are some of the most common clauses used to write SQL queries. Depending on how new you are in your SQL journey, you may or may not have seen some of these before. Regardless, through continued practice you’ll come to discover how user friendly and powerful SQL is. It’s the use of clauses like the ones below that enable us to interact with and extract insights from data in meaningful ways.

SELECT
FROM
JOIN
WHERE
GROUP BY
ORDER BY

As we answer different questions, we’ll start to grasp and see more specifically what each of these clauses do.

Let’s get started!

Assuming that you still have TablePlus open and running, select the button highlighted in green to open the SQL Editor. The editor is where we will be writing our SQL queries.

Once you’ve opened the SQL Editor, your window should look like what’s below.

Introducing the SELECT and FROM Clauses

The SELECT clause determines what columns will be returned by a query, and it is also the only clause that can be used without the use of another clause. This means that it can be used by itself. An example of a query that demonstrates this is:

SELECT
"Lauren" as name,
24 as age,
1*5 as favorite_number

(Note: I used 1*5 to represent a favorite number for the purpose of demonstrating that SQL can perform arithmetic operations. In addition, we can alias a column name by using as.)

Enter the query into the editor and click “Run Current”.

The output of the query is highlighted in purple.

We can see that the names of the columns we aliased in the query in addition to the specific values we included for each column are represented in the ouput. Again, we did all of this using just the SELECT clause.

If we now wanted to view all columns from a particular table in the sample database, we would need to use both the SELECT and FROM clauses. We know that SELECT dictates what columns we’ll see in our output, but what does the FROM clause do? You may have intuitively guessed it. The FROM clause allows us to specify the particular tables we’re looking to output data from.

Here, we’ll use SELECT and FROM to view all columns from the “albums” table within the database.

SELECT *
FROM albums

(Note: The asterisk (*) denotes all columns.)

From the output, we see that the “albums” table consists of three columns, AlbumId, Title and ArtistId.

The results set displays all columns included in the “albums” table and consists of 347 rows.

Use SQL to Answer Business Related Questions

Although not necessary, I suggest opening the visual schema in another tab. Having the visual schema easily accessible will not only serve as a useful reference but also help deepen familiarity and understanding of the database composition. Click here to open the schema.

Now that we have some initial knowledge to lean on and somewhat of an intuitive sense of the database and SQL, we’re officially ready to tackle some questions! 🙌

Question 1. What is the full name of each customer?

This is a great time to look at the schema and think about what table contains data that’s most relevant to answering the question. Given that we’re interested in customer data, the “customers” table is a great place to start.

Looking at the columns of the “customers” table, the information we need is found in the CustomerID, FirstName and LastName columns.

Query:

SELECT
CustomerId,
FirstName,
LastName
FROM customers

After running the query, we now not only have the full names of customers but also know that we have 59 customers in total (based on the number of rows the query returned).

We can verify the number of unique customers by using an aggregate function, in this case COUNT. It’s important to specify “distinct” to ensure that only unique values are returned. From the output of the query below, we can confirm that there are 59 unique customers.

This query confirms that there are 59 unique customers.

Question 2. What are the phone numbers and email addresses of customers who live in the USA?

A good portion of the previous query can be used to answer this. In addition to the columns we selected above, we’ll want to include Phone, Email and Country. We now know what columns and the specific table we need to reference, however, how do we return results that only include customers in the USA?

By using the WHERE clause, we’re able to filter the country from which customers live. The WHERE clause filters the data by row through the use of logic.

Query:

SELECT
CustomerID,
FirstName,
LastName,
Phone,
Email,
Country
FROM customers
WHERE Country = "USA"

The results of this query include the full names, phone numbers and email addresses of customers who live in the USA.

There are 13 customers who live in the USA.

Question 4. How many customers are located in each country?

From the question above, we know that the “customers” table contains the country from which a customer is located. By selecting Country and counting the number of unique customers (Just like we did as an extension of question 1, however this time using a different name for the column.), we can use the GROUP BY clause to group rows that have the same value. A way to think about what GROUP BY is doing here is to view each row as a summary of customers who live in a given country.

Question 5. What three countries have the most customers?

Using the same query above, we can include the use of the ORDER BY clause to organize the total customers in each country. We want this to be in descending order so that the largest values are listed first. From here, using LIMIT will enable us to reduce the number of rows returned. We’re looking for the top three countries. Therefore, the limit is 3.

The top three countries are the USA, Canada and Brazil.

(Note: To answer the questions above, we only needed to reference one table in the database. These next questions will require that we reference more than one table. To accomplish this, we’ll be implementing the final common clause from the list mentioned earlier.)

Question 6. What is the name of the artist of each album?

After reading this question and referencing the schema, there are two tables we’ll need to answer this question. The “artists” table, and the “albums” table.

Let’s break down this question into parts to most clearly lay out a way for thinking through it.

Again, we know that the album information we need is in the “albums” table. To see the number of albums we’re trying to find artists for run the query below.

There are 347 albums in total.

Next, to get artist names, we’ll need information from the “artists” table. We can confirm this by looking at the schema. Notice that there is a column within the “artists” table entitled Name.

You may recall that I highlighted the relationship between the “artists” and “albums” tables, and we learned that both tables include an ArtistId column. This is a critical piece for being able to perform the last common clause, JOIN.

(Note: With continued learning of SQL, you’ll discover that there are different types of joins. For the purpose of this article however, I’ll just be covering the type of join that addresses the specific question.)

Thinking back to question 6, we know that there are 347 albums that we need artist names for. We also know that we can get artist names by accessing the Name column in the “artists” table. To combine the information from these two tables, we’ll apply what’s called an INNER JOIN. An INNER JOIN returns data that is shared or matching between tables. To successfully run an INNER JOIN, given that we’re now working with more than one table, we’ll need to reference the specific table and the columns from that table we want to include. Take a look below.

The output of the query includes each album with an artist name.

Question 7. How many albums has each artist produced?

For the last question, we’ll implement a little bit of what we’ve learned from each previous question. Because we identified that the “artists” and “albums” table share a one-to-many relationship (an artist can produce 0, 1 or more albums), we want to ensure that we return the results of all artists regardless of whether or not they have produced an album. This is what makes a LEFT JOIN a good choice for solving our final question.

We now have a total number of albums that each artist has produced.

Bonus Question. What are the names of the Top 10 highest producing artists (based on number of albums)?

(Note: I’ve chosen not to include a query, however, I can assure you that all the information you need to solve this question you’ve applied by working through this article. I have included the correct output you should get after querying for reference.)

These are the Top 10 artists and the number of albums each has produced.

Primary Accomplishments 👏

· Use an SQLite database.

· Learn to use a GUI like TablePlus.

· Answer business related questions with SQL.

Thank you for reading this article! Should you have any questions or comments, please leave a response. I’d also love to connect! The best place to find me is on LinkedIn or Twitter. :)

--

--