Database Normalization

Aditya Desai
3 min readMar 13, 2022

--

In this blog i will try to explain what is Database Normalization. Database normalization is the process of structuring a database, usually a relational database, in accordance with a series of so-called normal forms in order to reduce data redundancy and improve data integrity. It was first proposed by Edgar F. Codd as part of his relational model.

Normalization entails organizing the columns (attributes) and tables (relations) of a database to ensure that their dependencies are properly enforced by database integrity constraints. It is accomplished by applying some formal rules either by a process of synthesis (creating a new database design) or decomposition (improving an existing database design).

Database Normal Forms

Here is a list of Normal Forms:

  • 1NF (First Normal Form)
  • 2NF (Second Normal Form)
  • 3NF (Third Normal Form)
  • BCNF (Boyce-Codd Normal Form)
  • 4NF (Fourth Normal Form)
  • 5NF (Fifth Normal Form)
  • 6NF (Sixth Normal Form)

However, in most practical applications, normalization achieves its best in 3rd Normal Form.

First Normal Form (1NF)

For a table to be in the First Normal Form,

  • It should only have single(atomic) valued attributes/columns.

Here is our table, with some sample data added to it.

The above table is not in 1NF since the subject column consist of multiple values.

Now to solve this problem It’s very simple, because all we have to do is break the values into atomic values.

Here is our updated table and it now satisfies the First Normal Form.

Second Normal Form (2NF)

For a table to be in the Second Normal Form,

  1. It should be in the First Normal form.
  2. And, it should not have Partial Dependency.

What is Partial Dependency?

Partial Dependency is when an attribute in a table depends on only a part of the primary key and not on the whole key.

To remove Partial dependency, we can divide the table, remove the attribute which is causing partial dependency, and move it to some other table where it fits in well.

Third Normal Form (3NF)

A table is said to be in the Third Normal Form when,

  1. It is in the Second Normal form.
  2. And, it doesn’t have Transitive Dependency.

Transitive Dependency is when a non-prime attribute depends on other non-prime attributes rather than depending upon the prime attributes or primary key.

To remove Transitive dependency, Again the solution is very simple. Take out the columns which creates transitive dependency.

Advantage of removing Transitive Dependency

The advantage of removing transitive dependency is,

  • Amount of data duplication is reduced.
  • Data integrity achieved.

Boyce and Codd Normal Form (BCNF)

For a table to satisfy the Boyce-Codd Normal Form, it should satisfy the following two conditions:

  1. It should be in the Third Normal Form.
  2. And, for any dependency A → B, A should be a super key.

In simple words, it means, that for a dependency A → B, A cannot be a non-prime attribute, if B is a prime attribute.

--

--