Normalized vs. Denormalized Databases
When I first started working with SQL, everything was in one table. Admittedly, the table looked about like this:
Most databases are going to have a liiiittle more data to them than four columns and five rows. Thus, the debate between normalized and denormalized databases has been raging for centuries (I’m talking in developer time). So what’s the difference?
Normalized databases involve multiple tables. Like data is organized into one table and other related data is put into a different table. You get to each piece of data through relationships to each table, mainly join tables. The good thing is, normalization reduces redundancy and maintains data integrity. Everything is organized into nice little tables where all the data that should stay together, does.
But, much like the downside of Rails, normalized databases can cause queries to slow down, especially when dealing with a shit ton (technical term) of data. This is where denormalized databases come in. While multiple joins are satisfying and fun, they can get complex quickly. When everything is in the same table, the query doesn’t have to jump from table to table to find what you’re looking for, it’s already right there. A downside is there will be redundant data — a small sacrifice to make to gain speed.
The best part about all this is you don’t have to choose one or the other; your database doesn’t have to be normalized to the nth degree or completely denormalized into one table. If your queries are running too slowly and it’s because of the complexity of the query, you can denormalize the database just enough to get rid of too many joins. Alternately, if your database is a giant mess of data and you need some organization, a little bit of organizing some data into different tables and getting rid of redundancy is in order. Databases can give you the best of both worlds, with a little TLC.