Normalization is the process of designing a data model to efficiently store data in a database. The end result is that redundant data is eliminated, and only data related to the attribute is stored within the table.
For example,
let's say we store City, State and ZipCode data for Customers in the same table as Other Customer data. With this approach, we keep repeating the City, State and ZipCode data for all Customers in the same area. Instead of storing the same data again and again, we could normalize the data and create a related table called City.
The "City" table could then store City, State and ZipCode along with IDs that relate back to the Customer table, and we can eliminate those three columns from the Customer table and add the new ID column.
Normalization rules have been broken down into several forms. People often refer to the third normal form (3NF) when talking about database design.
This is what most database designers try to achieve: In the conceptual stages, data is segmented and normalized as much as possible, but for practical purposes those segments are changed during the evolution of the data model.
Various normal forms may be introduced for different parts of the data model to handle the unique situations you may face.
Whether you have heard about normalization or not, your database most likely follows some of the rules, unless all of your data is stored in one giant table.
We will take a look at the first three normal forms and the rules for determining the different forms here.
1NF - A relation R is in first normal form (1NF) if and only if all underlying domains contain atomic values only.
- A relation is said to be in First Normal Form (1NF) if and only if each attribute of the relation is atomic. More simply, to be in 1NF, each column must contain only a single value and each row must contain the same columns
Example: 1NF but not 2NF
FIRST(Supplier_no,Status,City, part_No,quantity)
- 2NF - A relation R is in second normal form (2NF) if and only if it is in 1NF and every non-key attribute is fully dependent on the primary key.
An entity is in the second normal form if all of its attributes depend on the whole (primary) key. In relational terms, every column in a table must be functionally dependent on the whole primary key of that table. Functional dependency indicates that a link exists between the values in two different columns.
Example (2NF but not 3NF) :
SECOND (supplier_no, status, city).
Example (2NF but not 3NF) :
SECOND (supplier_no, status, city).
3NF - A relation R is in third normal form (3NF) if and only if it is in 2NF and every non-key attribute is non-transitively dependent on the primary key. An attribute C is transitively dependent on attribute A if there exists an attribute B such that: A ® B and B ® C. Note that 3NF is concerned with transitive dependencies which do not involve candidate keys. A 3NF relation with more than one candidate key will clearly have transitive dependencies of the form: primary_key ®other_candidate_key ® any_non-key_column
Example (3NF but not BCNF) :
SUPPLIER_PART (supplier_no, supplier_name, part_no, quantity)