Normalization In DBMS

Neeraj Tiwari
4 min readSep 12, 2021

--

This post explains database normalization terminology for beginners. A basic understanding of this terminology is helpful when discussing the design of a relational database.

Normalization is the process of organizing data in a database. This includes creating tables and establishing relationships between those tables according to rules designed both to protect the data and to make the database more flexible by eliminating redundancy and inconsistent dependency.

Redundant data wastes disk space and creates maintenance problems. If data that exists in more than one place must be changed, the data must be changed in the same way in all locations. A customer address change is much easier to implement if that data is stored only in the Customers table and nowhere else in the database.

What is inconsistency in DBMS?

Inconsistency is generally compounded by data redundancy. However, it is different from data redundancy and its adjacent anomalies in that it typically refers to problems with the content of a database rather than its design and structure.

What is redundancy in DBMS?

Data redundancy in the database means that some data fields are repeated in the database. This data repetition may occur either if a field is repeated in two or more tables or if the field is repeated within the table. Data can appear multiple times in a database for a variety of reasons.

For example, a table may have the same standard appearing several times if that student has got inserted.

Types of Normal Form

There are 6 types of Normal Forms we need to achieve a Normalization portfolio:-

  • First Normal Form
  • Second Normal Form
  • Third Normal Form
  • Boyce-Codd Normal Form
  • Fourth Normal Form
  • Fifth Normal Form

Here we discuss only 4 Normal Forms.

1NF (First Normal Form) Rules:-

  • Eliminate repeating groups in individual tables.
  • Create a separate table for each set of related data.
  • Identify each set of related data with a primary key.
  • A relation is in 1NF if it contains an atomic value.

Example:-
Relation STUDENT in table 1 is not in 1NF because of the multi-valued attribute STUD_PHONE. Its decomposition into 1NF has been shown in table 2. So that the multivalued attributes are removed.

2NF (Second Normal Form) Rules:-

  • A relation will be in 2NF if it is in 1NF.
  • All non-key attributes are fully functional dependent on the primary key.
  • Create separate tables for sets of values that apply to multiple records.
  • Relate these tables with a foreign key.

Example:-

In the figure we can see, the non-prime attributes i.e. StudentName and ProjectName should be functionally dependent on part of a candidate key, to be Partial Dependent.

We have introduced two new tables called <StudentInfo> and <ProjectInfo>. The StudentName can be determined by StudentID, which makes the relation Partial Dependent. The ProjectName can be determined by ProjectID, which makes the relation Partial Dependent. Therefore, the <StudentProject> relation violates the 2NF in Normalization and is considered a bad database design.

3NF (Third Normal Form) Rules:-

  • A relation will be in 3NF if it is in 2NF.
  • Has no transition dependency exists.

Third Normal Form is used to reduce the data duplication when combining the data from different tables. If there is no transitive dependency for non-prime attributes, then the relation must be in third normal form.

Example:-

The above table is not in 3NF because it has a transitive functional dependency −

Movie_ID -> Listing_ID
Listing_ID -> Listing_Type

Therefore, Movie_ID -> Listing_Type i.e. transitive functional dependency.

To form it in 3NF, you need to split the tables into two parts called <Movie> and <Listing> and remove the transitive functional dependency.

Boyce-Codd Normal Form Rules:-

  • Every functional dependency X -> Y, X is the super key of the table.
  • Table should be in Third Normal Form, and for every functional dependency X -> Y, X must be a super key.

Boyce–Codd normal form (or BCNF or 3.5NF) is a normal form used in database normalization. It is a slightly stronger version of the third normal form (3NF).

Summary:-

  • Database designing is critical to the successful implementation of a database management system that meets the data requirements of an enterprise system.
  • Most database systems are normalized databases up to the third normal form in DBMS.
  • A primary key uniquely identifies all records in a Table and cannot be null.
  • A foreign key helps connect the table and references a primary key.

--

--