Database Normalization

Ayush Yadav
3 min readMar 13, 2022

--

Normalization is a process of removing the redundancy from our table because redundancy may cause insert, delete or update anomalies.

But what is anomalies?

Database anomaly is normally the flaw in databases which occurs because of poor planning and storing everything in a flat database.

Insert anomaly :-

If a tuple is inserted in referencing table and referencing attribute value is not present in referenced table, it will not allow inserting in referencing relation.

Delete anomaly :-

If a tuple is deleted from referenced table and referenced attribute value is used by referencing table, it will not allow deleting the tuple from referenced table.

Update anomaly :-

If a tuple is updated from referenced table and referenced attribute value is used by referencing table, it will not allow updating the tuple from referenced table.

Types of Normalization :-

1NF(First Normal Form) :-

Each attribute of a table must have single values only then we can say that the table is in 1NF.

ID   Name   Courses
------------------
1 A c1, c2
2 E c3
3 M c2, c3

the above table is not in 1NF because it contain multiple values in single attribute.

ID   Name   Course
------------------
1 A c1
1 A c2
2 E c3
3 M c2
3 M c3

Now, the above table is in 1NF.

2NF(Second Normal Form) :-

In the second normal form, all non-key attributes are fully functional dependent on the primary key and It should be in 1NF.

CustomerID   StoreID  Location
-------------------------------
1 1 Delhi
1 3 Mumbai
2 1 Delhi
3 2 Bangalore
4 3 Mumbai

CustomerID and StoreID are candidate keys, and Location is a non-prime attribute.

However, if we notice that Candidate Key(CustomerID, StoreID) are not determining the location, only StoreID which is a part of Candidate Key is determining the location, that means our table is partially independent and it does not belong to 2NF.

For our table to be in 2NF, we must divide it.

Table Name :- ACustomerID   StoreID 
---------------------
1 1
1 3
2 1
3 2
4 3
Table Name :- BStoreID Location
-------------------
1 Delhi
2 Bangalore
3 Mumbai

Since StoreID is the candidate key which determines the location in table B and CustomerID and StoreID are the candidate keys which do not determine anything in table A, our table is in 2NF.

3NF(Third Normal Form):-

A relation is in 3NF when it is in 2NF and there is no transitive dependency or a relation is in 3NF, when it is in 2NF and all non-key attributes directly depend on candidate key.

ID     Name           State              City
-----------------------------------------------
1 Ayush Madhya Pradesh Indore
2 Aman Haryana Ambala
3 Akshat Punjab Mohali
4 Amit Madhya Pradesh Indore
5 Riya Haryana Ambala

ID is a Candidate Key.

Name, State and City are Non prime attribute.

As we can see, ID determines the State, which means candidate key determines the non-prime attribute, but State itself determines the City, which means non-prime attribute determines the non-prime attribute. By using that ID, the city is also determined.

In simple word,

ID -> State and State -> City

So, ID -> City

which shows the transitive dependency in our table.

Table Name:- AID     Name           State          
-----------------------------------
1 Ayush Madhya Pradesh
2 Aman Haryana
3 Akshat Punjab
4 Amit Madhya Pradesh
5 Riya Haryana

Table Name:- B
State City
------------------------------
Madhya Pradesh Indore
Punjab Mohali
Haryana Ambala

Table A has Id as a Candidate key, and Table B has State as a Candidate key determining the Non-prime attribute.

BCNF(Boyce and Codd Normal Form) :-

Also known as a special form of 3NF as BCNF puts a restriction on 3NF. In BCNF each functional dependency should be derived from Candidate Key or Super Key.

--

--