databases like im 5: relational databases! (article 1)

ai like im 5

8 min readMar 19, 2024

before i start, a couple things

databases are incredibly important for anyone in the data science/ai/ml space and i think they are hard to break into.
so i think that sql, nosql, and a lot of the other data/database tools deserve their own multiple part series.
this is purely about data storage and retrieval, there is not a true relation to machine learning, or deep learning so feel free to skip over it.

anyways,

welcome to the series on

sql?
sequel?
squeal?
squirrel?

yeah… i give up on pronouncing.

sql is a very important skill in the world.

so even if we can’t agree on the correct pronunciation, we still use it a lot and should master it.

note: before we dive in, be prepared for a lot of database buzz words. i am convince these are designed just to confuse you and make people sound smarter than they actual are.
a great video about this.

because the best way to learn databases is to build a database,

let’s build a database:

food waste is a major problem in america.

when i worked as a wedding/banquet server, we would serve food every weekend.
and at the end of the night, we were often allowed to eat and take food home.
after a while, we got sick of eating the same stuff and didn’t even want to eat it.
and as a result, some nights, we would have to throw food away, a bunch of it.
to someone else, that is a incredible meal, but to us who have it had 1000 times, we were good.

million dollar idea: what if there was an app that allowed restaurant workers to trade food with each other

allowing people to try all different types of food.
allowing people to meet up and make connections.
and even the option of using a delivery app, like uber eats, to make help with the logistics.

so you go and build that app and realize that you are going to have to handle data such as:

user data → names, profiles, pictures, locations, and more
food → tracking types of food, what is available, descriptions, how long it’s been available for
transactions → details of what was traded, when it happened, and more

and if it successful, there is going to be a lot of data…

a weally weally big amount of data

for context, facebook has about a million gigabytes of data, do you think they store everything in

excel files
on local computers
and had people manually clicking and dragging everything

no, they use databases!

so if you want your app to be successful and your users to not feel like it’s a calculator running on potatoes

(even though you can run some cool stuff on a calculator running on potatoes)

you are going to want a database

databases are data storage systems optimized up the wazoo allowing for efficient retrieval, storage, and more
they are incredibly secure, scalable, and super easy to integrate!
these databases will only store necessary and instantaneous data, meaning that we have others solution for archived and historical data we will talk about in the future.

there are lots of database options pertaining to different use cases and more but we will maintain focus on the most common and best type:

relational databases

the most popular form of databases.
based on relation algebra, it is centered on a very important idea, relations.
in plain english: relational database are intertwined and connected together.

tables

at the heart of a relational database is a table
a table is just a optimized dataset/spreadsheet, also made of rows and columns.
you will sometimes see these called entities
and the the columns called fields
but no need for buzz words or corporate jargon

for our app, we need to create a basic table to store user information

let’s take some very basic information about our users

a primary key is essentially a unique identifier and index
although it can be any column, it must be non-null(every corresponding row must have a value) and a unique value
but for simplicity and many other reasons, we will often create another column for the primary key.

let’s create a table for food:

each user is going to have some type of food
should we store all the user information again or is there a better way to do things…

not rocket science!

foreign keys are like our friend borat, it is a foreigner connecting
worlds together.

for example, if we wanted to know the name of who has tom yum soup

we would search the food table and look for that dish name
find that corresponding userid
go back to users table and look at the user information
finally finding the name for that id!

there’s no need to re-store the name of who has tom yum soup!

and now we we create a table for transactions

we have to be super careful when creating keys like this though
having good design is incredibly important bringing me to my next point

schemas

relational databases are sensitive, meaning that making a lot of changes to the structure of tables, their features, and their keys is probably not the best idea.
before we go out and blind build a very complex database like this, we should probably draw it out and get a thought out plan of how to design a database.

we call these thought out plans schemas!

schemas are the logic behind databases explaining x and y
there are many different types of schemas in the data world but we don’t have to dive too deep into them for now
some are visual and some are a bit more complex
but just get a basic though idea of how to construct a diagram that shows each table (entity), it’s columns (fields), connections (keys), what is lacks, and more

some great questions to ask based on this

is there redundant information?
- we call the process of evaluating this normalizing
- it is a deep topic!
are the keys correct and well thought out?
what data types should we use for each column/field, should they contain null (empty values), and is the information sensitive or a special type of data?
what are we missing from the tables?

and most importantly, how scalable is the data?

i know it sounds weird, but we should design our database, thinking 5 to 10 to even 25 years down the line
because of sql’s sensitivity, scaling requires intricate database design from the beginning.

relational databases can be like dominoes, if one thing goes down, the whole system could too.

if you don’t start with a great base, stuff like this can happen.

so for this reason, an important thing to grasp is:

types of relationships

it’s a really good idea to constrain our table relationships
or in plain english: we should specify that exact relationship, we when we create a table.

there are going to be 3 main types of these relationships

one to one
one to many/many to one
many to many.

don’t worry if it doesn’t make sense yet, it will click as things go on.
i chose a not totally clear example, because there is not always a 100% correct choice for constraining
this will hold true with all lot of other things in the database world.
sometime there is a clear answer but each has it’s own pros and cons so most of the time it depends on the situation at hand!

but in my opinion, one to many makes the most sense!

building a better database

let’s fix up our user information table

password data needs security. we should encrypt the passwords with hashing (nerd encryption)
adding in phone numbers, active status, restaurant, and last login data would be a good idea.
and we should store the address/location of our users.

more of that relational database beauty!

transaction table

our transaction table is pretty good
but what about that deliver feature, we should probably store some information about that!
we’re going to need a couple more tables to handle that!

although you can’t see the full power of the key system yet, hopefully you can see this type of design makes everything so much simpler and scalable.

we can directly access the user’s address

food items

our food items table is pretty good, but let’s add some nutritional info, and a image type.

we have a great foundation and now we can design:

a refined schema!

relationships in this schema!

users to addresses: one-to-one

each user has one address

2. users to food items: one-to-many

a user can list multiple food items

3. transactions to users (buyers and sellers): many-to-one (twice)

each transaction has one buyer
each transaction has one seller
but each seller and buyer can have multiple transactions

4. transactions to food items: one-to-one

each transaction references one food item

5. transactions to delivery (optional): one-to-one

each transaction may be linked to one delivery

6. delivery to user address: one-to-one

each delivery is linked to a user’s address

7. delivery to couriers: many-to-one

each delivery can be linked to a courier
a courier can do multiple deliveries

8. delivery to transactions: one-to-one

each delivery is linked to a one transaction

what would our database look like without these 8 relationships!

anyways,

learning databases is hard, because most of the stuff is incredibly simple. but the only way to really learn it, is to encounter it. so why not just build an entire database and fill it with fake data and fake scenarios! that’s what i will do.

anyways, here is an absolutely insane video:

have a amazing day and

databases like im 5: relational databases! (article 1)

before i start, a couple things

anyways,

sql is a very important skill in the world.

let’s build a database:

and if it successful, there is going to be a lot of data…

a weally weally big amount of data

relational databases

tables

let’s create a table for food:

and now we we create a table for transactions

schemas

types of relationships

building a better database

let’s fix up our user information table

transaction table

food items

a refined schema!

relationships in this schema!

anyways,

godspeed!

Written by ai like im 5