Normalization in Database

Rohit Kumar Kushwaha
3 min readNov 28, 2021

--

Normalization is a process of eliminating or reducing the redundancy of schema or table or relation. This is called normalization.

Redundancy can occur in relation or schema or table if two or more independent relations are stored in single relation.

Problem due to Redundancy:

Redundancy causes database anomalies(inconsistency).

1. Insertion Anomaly:

To insert some data in the table force to insert other independent data. which create an insertion anomaly.

2. Deletion Anomaly:

Because of deletion of some data from the table forced to delete other independent data. This is called deletion anomaly.

3. Updation Anomaly:

If some redundant copy of data is updated and other redundant copies failed to update which cause inconsistency. That is called an updation anomaly.

This anomaly creates redundancy in the table. that is why normalization concepts come. Normalization tells us to decompose the relation into two or more sub relation to reduce or eliminate database anomalies/redundancy.

Type of Normal forms:

There are four types of normal forms as given below.

  1. First Normal Form(1NF)
  2. Second Normal Form(2NF)
  3. Third Normal Form(3NF)
  4. Boyce Codd Normal Form(BCNF)

Trivial Functional Dependency: If X->Y functional dependency is a trivial dependency if and only if Y is a subset of the X.

Example: Sid,Sname->Sid is trivial functional dependency.

Non-Trivial Functional Dependency: If X->Y functional dependency is a non-trivial dependency only if X and Y attribute sets should not consist of any common attributes.

Example: Sid->Sname is non-trivial functional dependency.

Semi Non-Trivial Functional Dependency: Combination of trivial and non-trivial dependency.

Example: Sid->Sid, Sname is semi non-trivial functional dependency.

1. First Normal Form(1NF):

A schema or table will be 1NF if it contains a single value in each cell of the table. It is also the default normal form of RDBMS tables. In this normal form degree of redundancy is high. The first normal form doesn’t contain the multivalued or composite attribute.

2. Second Normal Form(2NF):

In the 2NF form, the relation must be in 1NF and relational schema (R) in 2NF if and only if there is no partial dependency in relational schema(R).

Candidate Key: Minimal set of attributes used to differentiate all records uniquely in the relational schema.

Prime Attributes: Attribute which belongs to some(at least one) candidate keys of relational schema.

Non-Prime Attributes: A set of attributes which not belong to any candidate key part is called a non-prime attribute.

For example: if x determines y (x->y) and x is the candidate key then the proper subset of x should not determine the non-prime attribute.

(Proper subset of candidate key)->(Non-prime attribute) which is not allowed in 2NF.

3. Third Normal Form(3NF):

If the relation or table is in 3NF then it must be 2NF and should not contain transitive dependency. The non-prime attribute should not determine another non-prime attribute. This is called 3NF.

Super Key: Super key is a set of keys to identify rows in a table uniquely. It is the superset of candidate keys.

For Example: If there is dependency (x->y) then x must be super key or y must be a prime attribute.

X(super key)->Y(prime/non-prime).

4. Boyce Codd Normal Form(BCNF):

If the relation or schema or table is in BCNF then it must be in 3NF and every non-trivial functional dependency (x->y) x must be super key.

For Example: If X->Y is a functional dependency then X must be super key.

X(super key)->Y.

These are the normal form is used to eliminate or to reduce the redundancy in schema or table. These normal forms are used to remove the insertion, deletion and updation anomaly in the relational schema or table.

--

--