DB Normalization: 1NF, 2NF, 3NF, BCNF….

Shishir Mohire
3 min readNov 1, 2023

--

Image Depicting how normalization can make your data more cleaner and faster

What is Normalization:

It is the process of structuring your database tables in order to minimize data redundancy, make updating tables easy, and improve data integrity.

Minimizing data redundancy makes sure that the same data is not being stored twice or is not repeated at multiple places unnecessarily. Improving Data integrity makes sure that data is consistent, reliable, and trustworthy.

To convert the simple table to normalized form, one of the tech-bro from 1970 came up with a few steps. Each of these steps converts the DB into separate normal forms called 1NF, 2NF, 3NF, BCNF, 4NF, and 5NF. In this article, we will try to explore what these normal forms are and what they stand for. But before that first, let's understand a few core DB concepts.

Types of Keys:

Super Key

It is a set of attributes that can be used to uniquely identify each tuple of a table. So basically if the closure of the set of attributes determines all the attributes of the table then that set of attributes is the Super Key of R. ( For those who dont know, the closure of the set of attributes ‘X’ indicates the set of attributes ‘Y’ which can be identified using ‘X’.

Candidate Key

It is a minimal set of attributes that can be used to identify each tuple uniquely. A super key is a candidate key if none of its proper subsets is a super key.

Primary Key

It is one of the candidate keys selected as a primary means to identify a tuple. Primary key values should never be null. There should be one primary key per table.

Foreign Key

It is a column or group of columns in a relational database table that refers to the primary key in another table or the same table. The concept of referential integrity is derived from foreign keys. Referential integrity along with other integrities makes sure data integrity is maintained well. Referential integrity ensures that data referenced using foreign keys actually exists and is consistent.

Types of Attributes:

Prime attributes

Prime attributes are attributes that are part of any of the candidate keys.

Non-Prime attributes

Non-prime attributes are attributes that are not part of any of the candidate keys.

What is functional Dependency?

Functional dependency indicates the dependence on certain attributes of of table on some other attributes of a table. Ex. All the attributes of the table should ideally be functional dependencies of the primary Key. Below can be termed as some of the types of functional dependencies.

Partial dependency:

When a non-prime attribute is dependent only on some part of the candidate key i.e. dependent on one of the proper subsets of the candidate key then it is called partial dependency.

Full dependency:

When a non-prime attribute is dependent on the entire candidate key then it is called full dependency.

Transitive dependency:

A functional dependency from a non-prime attribute to a non-prime attribute is called a transitive dependency.

Now that we have an understanding of the basic concepts of DB and tables. Let's understand different types of normal forms.

Types of Normal Forms

1NF (1st Normal Form)

To bring the table in 1NF, each cell should contain indivisible atomic values. Should not contain any duplicate rows. It should have a primary key and each column should be unique. If all these conditions are met then the table is in first normal form.

2NF(2nd Normal Form)

To bring the table to 2NF it should be in 1NF first and it should not have any partial dependency. You should decompose your table into 2 separate tables in case you have partial dependency. Create the table for the attribute on which partial dependency exists. Make that attribute the primary key of the newly formed table.

3NF(3rd Normal Form)

To bring the table to 3NF it should be in 2NF first and it should not contain any transitive dependency. So basically for every functional dependency (a to b) in the 3NF form table, either ‘b’ should be a prime key or ‘a’ should be a super key.

BCNF

To bring the table into BCNF, all the functional dependencies ( a to b ) a should be a super key.

There is a lot to explore and understand in database normalization. Please do like, share, and follow if you find this article useful.

--

--

Shishir Mohire

Software developer, JP Morgan Chase, Ex-Goldman Sachs.