6.2 Normalization: 1NF, 2NF, 3NF
Normalization is the process of organizing data within a database to minimize redundancy and dependency, making it more efficient and easier to maintain. It involves dividing larger tables into smaller ones and defining relationships between them. The three most common normal forms (NF) are First Normal Form (1NF), Second Normal Form (2NF), and Third Normal Form (3NF).
1. First Normal Form (1NF)
A table is in First Normal Form (1NF) if it satisfies the following conditions:
- Atomic Values: All columns contain atomic (indivisible) values. Each field must hold a single value, not a list or set of values.
- Unique Rows: Each row must be unique. There should be a primary key (or a combination of keys) that uniquely identifies each row.
- No Repeating Groups: There must be no repeating groups of columns. Each column should contain only one attribute, and if there are multiple attributes, they should be placed in separate rows.
Example of 1NF Violation:
StudentID | Name | Subjects |
---|---|---|
1 | Alice | Math, English |
2 | Bob | History, Science |
The Subjects column contains multiple values (Math, English) which violates 1NF.
Conversion to 1NF:
StudentID | Name | Subject |
---|---|---|
1 | Alice | Math |
1 | Alice | English |
2 | Bob | History |
2 | Bob | Science |
Now, each value is atomic, and there are no repeating groups. The table is in 1NF.
2. Second Normal Form (2NF)
A table is in Second Normal Form (2NF) if:
- It is in 1NF.
- There is no partial dependency. This means that non-key columns should depend on the entire primary key, not just a part of it. In other words, if the primary key is composite (consisting of more than one column), no non-key attribute should depend on just one part of the primary key.
Example of 2NF Violation:
Consider the following table with a composite primary key (StudentID, CourseID):
StudentID | CourseID | Instructor | Room |
---|---|---|---|
1 | 101 | Mr. Smith | Room 1 |
1 | 102 | Mrs. Jones | Room 2 |
2 | 101 | Mr. Smith | Room 1 |
Here, Instructor and Room depend only on CourseID, not on the whole primary key (StudentID, CourseID). This is a partial dependency.
Conversion to 2NF:
To resolve this, we separate the course-related information (like Instructor and Room) into a new table, as it depends only on CourseID:
Student_Course Table:
StudentID | CourseID |
---|---|
1 | 101 |
1 | 102 |
2 | 101 |
Course Details Table:
CourseID | Instructor | Room |
---|---|---|
101 | Mr. Smith | Room 1 |
102 | Mrs. Jones | Room 2 |
Now, the Student_Course table only stores the relationship between StudentID and CourseID, and the Course Details table stores the information that depends solely on CourseID. This resolves the partial dependency, and the tables are now in 2NF.
3. Third Normal Form (3NF)
A table is in Third Normal Form (3NF) if:
- It is in 2NF.
- There is no transitive dependency. This means that non-key columns should depend only on the primary key and not on other non-key columns.
Example of 3NF Violation:
Consider the following table:
StudentID | CourseID | Instructor | Instructor_Phone |
---|---|---|---|
1 | 101 | Mr. Smith | 1234567890 |
2 | 102 | Mrs. Jones | 0987654321 |
Here, the Instructor_Phone depends on Instructor, not directly on the StudentID or CourseID combination. This is a transitive dependency because Instructor_Phone is indirectly dependent on the primary key via Instructor.
Conversion to 3NF:
To resolve this, we remove the transitive dependency by separating the Instructor_Phone into a new table:
Student_Course Table:
StudentID | CourseID | Instructor |
---|---|---|
1 | 101 | Mr. Smith |
2 | 102 | Mrs. Jones |
Instructor Table:
Instructor | Instructor_Phone |
---|---|
Mr. Smith | 1234567890 |
Mrs. Jones | 0987654321 |
Now, Instructor_Phone is only dependent on Instructor, and all non-key attributes depend directly on the primary key of their respective tables. The tables are now in 3NF.
Summary of Normal Forms:
Normal Form | Conditions | Goal |
---|---|---|
1NF | - Atomic values- Unique rows- No repeating groups | Eliminate duplicate data and ensure each field has a single value. |
2NF | - In 1NF- No partial dependencies (non-key attributes depend on the entire primary key) | Eliminate partial dependencies. |
3NF | - In 2NF- No transitive dependencies (non-key attributes depend only on the primary key) | Eliminate transitive dependencies. |
Benefits of Normalization:
- Eliminates Redundancy: By breaking data into smaller tables, normalization reduces data duplication.
- Improves Data Integrity: Ensures consistency and accuracy by minimizing the chances of errors or inconsistencies.
- Optimizes Query Performance: Smaller, well-structured tables can speed up queries and make data management easier.
Normalization is a critical aspect of database design as it ensures efficient storage, retrieval, and integrity of data. By applying 1NF, 2NF, and 3NF, database designers can create databases that are more scalable and easier to maintain.
Commenting is not enabled on this course.