Normalization in DBMS

Harsh
NiXiS Institute
Published in
5 min readApr 13, 2020

Normalization is the progress of collecting data in database to avoid data redundancy , Anomalies i.e Insertion , deletion and Update Anomaly .

Let’s first discuss about Anomalies :-

There are three types of Anomalies that occurs when database table is not normalized . Let’s discuss all those :

Example of Anomaly Table
  1. Insertion Anomalies :- Suppose a new employee joins the company and who is under training . The company does not insert their name in the table if emp_dept does not allows nulls , because any department is not assigned .
  2. Update Anomalies :- In the above table , there are two employees with same name(Mansih) in the company with different department . If we want to update the address of Mansih , then we have to update the same in two rows or the data will become inconsistent. If somehow, the correct address gets updated in one department but not in other , then as per the database, Mansih would be having two different addresses, which is not correct and would lead to inconsistent data.
  3. Delete Anomalies :- Suppose, if at a point of time the company closes the department D890 , then deleting the rows that are having emp_dept as D890 would also delete the information of employee Ankit since she is assigned only to this department.

To overcome these anomalies we need to normalize the data.

Let’s discuss about Normalization :-

There are four types of Normalization which is used commonly :-

  1. First Normal Form(1NF)
  2. Second Normal Form(2NF)
  3. Third Normal Form(3NF)
  4. Boyce & Codd Normal Form(3.5NF)

First Normal Form(1NF) :-

As per the rules , attribute(column) of a table should contain only atomic values . It cannot contain multiple values .

Let’s Take an Example :-

In the above table , company stores two phone number of Ram in same column . This violates the rule of 1NF .

To make above table in 1NF , the data should be like :-

Second Normal Form(2NF) :-

As per rules , The table said to be 2NF if :-

  1. The table should be in 1NF .
  2. ^ No non-prime attribute is dependent on any subset of candidate key of table .

^ An Attribute is not a part of any candidate key , is called Non-prime Attribute .

Example: Suppose a school wants to store the data of teachers and the subjects they teach. They create a table that looks like this: Since a teacher can teach more than one subjects, the table can have multiple rows for a same teacher.

The table is in 1NF , but not in 2NF because teacher_age(non-prime attribute) is dependent on teacher_id (candidate key) . So it violates the rule of 2NF .

To make above table in 2NF , we break the table in two separate table :-

Third Normal Form(3NF) :-

As per rules , the table said to be 3NF if :-

  1. The table should in 2NF .
  2. Transitive functional dependency of non-prime attribute on any super key should be removed .

functional dependency :- X →Y , X is a super key .

Example: Suppose a company wants to store the complete address of each employee, they create a table named employee_details that looks like this:

Here, emp_state, emp_city & emp_district dependent on emp_zip. And, emp_zip is dependent on emp_id that makes non-prime attributes (emp_state, emp_city & emp_district) transitively dependent on super key (emp_id). This violates the rule of 3NF.

To make this table 3NF , we have to break the table into two tables to remove the transitive dependency:

Boyce Codd normal form (BCNF) :-

  • It is advance version of 3NF .
  • So it is also known as 3.5NF .
  • It is stricter than 3NF .

As per rules , the table is said to be BCNF if:-

1 . The table should be in 3NF .

2. The table should be functional dependent i.e X → Y , X is always a super key .

Example: Suppose there is a company wherein employees work in more than one department. They store the data like this:

Functional dependencies in the table above:
emp_id -> emp_nationality
emp_dept -> {dept_type, dept_no_of_emp}

Candidate key: {emp_id, emp_dept}

The table is not in BCNF as neither emp_id nor emp_dept alone are keys.

To make the table comply with BCNF we can break the table in three tables like this :-

Candidate keys:
For first table: emp_id
For second table: emp_dept
For third table: {emp_id, emp_dept}

This is now in BCNF as in both the functional dependencies left side part is a key.

--

--