Geek Culture
Published in

Geek Culture

Data for Dummies: 2 Tips to Save Your Database

Everyone deals with data whether they like it or not. In many cases, people hate their data. Why?

  1. Their dataset isn’t structured correctly.
  2. It gets messier as more data comes in.
  3. Hard to find the right data when you need it.

Don’t worry. These problems can be reduced with the right database structure. But I know what you’re thinking.

I’m not a data nerd! How can I learn database structure!?

Even more “no worries”! You don’t have to be a data scientist to make a good and scalable database structure. In fact, you can build a great database even in Excel! Everyone knows Excel, so I’ll use it to teach the concepts in this article.

I’m going to give you two tips to save your database from further ruin, and I hope that they help you fall in love with your data!

A Well-structured Database is a Happy Database :)

Database Structure for Dummies — Tips

Tip #1: Use Child Rows, Not Columns

What can cause a lot of trouble if you don’t structure it early on is Parent -> Child data pieces. For example, if you have a Client (Parent data), they might also have deals or attend events (Child data) with you that are additive (can always have more data pieces).

Parent / Child Data Examples:

  1. Members (parent) & their member event attendance (child).
  2. Customers (parent) & their purchases (child).
  3. Businesses (parent) and their employees (child).
  4. Employees (parent) and their individual workdays (child).

In these cases, here’s a good tip to structure the parent/child data.

Create one datasheet for your Parent data and an entirely separate one (with a unique id matching key) for your Child data.

Good Parent/Child Data Structure Example

Main parent dataset of clients
Child dataset of Deals associated with clients.

Why? Well, to show why let’s talk about a bad way to set up this data structure. I.e. columns for child data points.

Bad Parent/Child Data Structure Example

Bad Bad Bad Parent/Child Structure.

Don’t worry if you’re currently structuring your data like the bad example here. MANY companies do, but let me convince you why you shouldn’t use this structure.

Think of data points such as webinar attendance. Let’s say you do 50 webinars in a year that people can attend. It would be a database NIGHTMARE to create additional columns for each new webinar data point for your clients. It would make your dataset get messy really quick with hundreds of columns. REALLY QUICK.

So how to avoid that 1,000 column-long nightmare dataset?

Simple → Use 2 datasheets: one parent, one matched-with-id child.

Why Use Parent / Child Separated Datasets?

  1. No matter how many more deals (child data points) your clients have, you can always add more. With certain databases, you have column limitations. In this case, there are no limitations on adding more rows.
  2. No need to make additional columns for new data points (events, deals, employees…)!
  3. Searching for individual child data like event attendance for your clients and parent data pieces becomes MUCH easier.
  4. For data nerds, it’s better for SQL queries!
  5. It’s a much cleaner database with much less hassle :)

So please, don’t use columns for new child data points. Use a different dataset, join it with the parent dataset, and save your data!

Tip #2: Unique ID Codes Will Save You!

Here’s a Problem: businesses change names, addresses, phone #s, and just about everything about themselves within 5–10 years. However, a lot of databases (especially in Excel) use those business names & changing points to track them.

Also, what if a name is spelled one way here and another way there? What if an email address changes? Capitalization? If your text doesn’t match exactly, you won’t find your data, and then oh goodness, just imagine the data chaos!

In fact, this is a super common problem, and many datasets suffer from it.

So how to solve this and save your data? Simple.

Use unique ids to track your clients in your internal database and reports.

Use unique ids to track your clients.

For external client reports, they need their names on the papers of course, but for internal reports, unique ids keep your data clean and immune from name changes and the like. It also helps match up your parent/child data with ease (see Tip #1).

Unique ids are especially great when you deal with a lot of clients and data because, statistically speaking, at least one or two of your clients is probably going to be changing something during a year. Also, some clients might have very similar names that databases can get tripped up on.

So use unique ids, track your data points, and save your data! :)

Final Thoughts

If you’re reading this trying to structure your database now, GOOD! Use these 2 tips to set up your database to be in good health.

If you’re reading this with a messed-up database, look at ways to mimic these tips so that your data can become cleaner. :)

TL/DR

Save your database by using separate datasets for parent/child data, and use unique ids to avoid name-change nightmares! Think about using data consultants to help improve your structure.

--

--

Get the Medium app

A button that says 'Download on the App Store', and if clicked it will lead you to the iOS App store
A button that says 'Get it on, Google Play', and if clicked it will lead you to the Google Play store
Nick Canfield

Nick Canfield

Founder of NomadYolo.com. 📈Data geek. 💻Coder. 💼Entrepreneur. 🌊🌴Digital Nomad.