Database Normalization
Normalization is the key way to avoid database anomalies and redundancy. It is just a design technique.
It is a structured way to define the attributes. To reduce the anomalies and redundancy we have to follow the normalization process. It has the step normal forms that we have to follow. And here the end state will be minimal anomalies and minimal redundancy.
Types of Normalizations
- Here each
normal form
is the process, used to reduce anomalies, and redundancy. 0 throughBCNF (Boyce-Codd Normal Form)
are the most common normal forms. - Boyce-Codd Normal Form is also referred to
3.5 normal form
. - The fourth and fifth normal forms are used to further
anomalies
andredundancy
reduce. and the sixth normal form is not yet standardized. - This is enough to follow the normal forms from 0 to 3.5.
Zero Normal Forms
- it does not represent a level of normalization. Instead, it refers to the initial state of data organization where no normalization has been applied.
- 0NF, data may be unstructured, stored in a non-relational manner
The first Normal Form
1NF
requires that each column in a table containsatomic values
, meaning each value is indivisible. It eliminatesrepeating groups
and ensures that each row and column contains only one value.
- In this table, the columns “product_one,” “product_two,” and “product_three” represent a repeating group because they store multiple values for products. This violates the first normal form because the table should ideally have a separate table for products associated with an order, rather than storing them in repeating columns.
To illustrate 1NF with atomic values, let’s consider an example of a table representing students and their courses:
- In this example, the “Courses” column violates 1NF because it contains multiple values within a single cell. To bring this table to 1NF, we need to separate the multiple courses into distinct values
Second Normal Form
- 2NF builds upon 1NF by addressing
functional dependencies
. It ensures that non-key attributes depend on the entire primary key.
Third Norma Form
- 3NF further updates the normalization process by addressing
transitive dependencies
. It ensures thatnon-key attributes
depend solely on theprimary key
and not on other non-key attributes. To achieve 3NF, anytransitive dependencies
are eliminated by splitting the table into additional tables.
Transitive Dependency example: A is functionally dependent on B, and B is functionally dependent on C, then C is transitively dependent on A via B.
Boyce Codd Normal Form
Boyce Codd NF
is a Normal Form, that differs from the Third NF. because most of the third normal form tables are already Boyce Codd’s normal form.- Basically, It is an extension of the Third Normal Form (3NF) and focuses on eliminating
functional dependencies
that are not directly related to theprimary key
and are not dependent on anysuperkeys
.
- Here in the above example, the “department_manager” attribute is functionally dependent on the “department” attribute rather than the primary key (“Employee ID”). So we need to decompose it into multiple tables.
Why fourth, fifth, and sixth NF is not useful?
- Generally, these normal forms are not used in database design. The sixth normal form is not yet used in real-world projects, because that is not yet standardized.
- The fourth and fifth normal forms further achieve normalization by removing more anomalies and more redundancy. In a real-world project, you may not see the fourth and fifth NF in normalization.
- Because in most scenario these types of normalization is not required.
Thanks for reading this article!