Thursday 4 November 2010

Normalisation

2NF

When a 1NF table has no composite candidate keys (candidate keys consisting of more than one attribute), the table is automatically in 2NF.

A table for which there are no partial functional dependencies on the primary key is typically, but not always, in 2NF. In addition to the primary key, the table may contain other candidate keys; it is necessary to establish that no non-prime attributes have part-key dependencies on any of these candidate keys.
Multiple candidate keys occur in the following table:

Electric Toothbrush Models
Manufacturer
Model
Model Full Name
Manufacturer Country
Forte
X-Prime
Forte X-Prime
Italy
Forte
Ultraclean
Forte Ultraclean
Italy
Dent-o-Fresh
EZbrush
Dent-o-Fresh EZBrush
USA
Kobayashi
ST-60
Kobayashi ST-60
Japan
Hoch
Toothmaster
Hoch Toothmaster
Germany
Hoch
X-Prime
Hoch X-Prime
Germany

Even if the designer has specified the primary key as {Model Full Name}, the table is not in 2NF. {Manufacturer, Model} is also a candidate key, and Manufacturer Country is dependent on a proper subset of it: Manufacturer. To make the design conform to 2NF, it is necessary to have two tables:

Electric Toothbrush Manufacturers
Manufacturer
Manufacturer Country
Forte
Italy
Dent-o-Fresh
USA
Kobayashi
Japan
Hoch
Germany

Electric Toothbrush Models
Manufacturer
Model
Model Full Name
Forte
X-Prime
Forte X-Prime
Forte
Ultraclean
Forte Ultraclean
Dent-o-Fresh
EZbrush
Dent-o-Fresh EZBrush
Kobayashi
ST-60
Kobayashi ST-60
Hoch
Toothmaster
Hoch Toothmaster
Hoch
X-Prime
Hoch X-Prime

 

BCNF

Only in rare cases does a 3NF table not meet the requirements of BCNF. A 3NF table which does not have multiple overlapping candidate keys is guaranteed to be in BCNF.
Depending on what its functional dependencies are, a 3NF table with two or more overlapping candidate keys may or may not be in BCNF.

A relation is in BCNF, if and only if, every determinant is a candidate key.


4NF

Multivalued Dependency
If the column headings in a relational database table are divided into three disjoint groupings X, Y, and Z, then, in the context of a particular row, we can refer to the data beneath each group of headings as x, y, and z respectively. A multivalued dependency X-->>Y signifies that if we choose any x actually occurring in the table (call this choice xc), and compile a list of all the xcyz combinations that occur in the table, we will find that xc is associated with the same y entries regardless of z.

A trivial multivalued dependency X-->>Y is one where either Y is a subset of X, or X and Y together form the whole set of attributes of the relation.

A functional dependency is a special case of multivalued dependency. In a functional dependency X-->Y, every x determines exactly one y, never more than one.


Consider this example of a database of teaching courses, the books recommended for the course, and the lecturers who will be teaching the course:

Teaching database
Course
Book
Lecturer
AHA
Silberschatz
John D
AHA
Nederpelt
William M
AHA
Silberschatz
William M
AHA
Nederpelt
John D
AHA
Silberschatz
Christian G
AHA
Nederpelt
Christian G
OSO
Silberschatz
John D
OSO
Silberschatz
William M

Because the lecturers attached to the course and the books attached to the course are independent of each other, this database design has a multivalued dependency; if we were to add a new book to the AHA course, we would have to add one record for each of the lecturers on that course, and vice versa.
Put formally, there are two multivalued dependencies in this relation: {course}-->>{book} and equivalently {course}-->>{lecturer}.


Example
For an example see the ‘Example’ section on http://en.wikipedia.org/wiki/Fourth_normal_form


My Normalisation Guide

1NF
All attributes must be atomic.
All instances of an entity must contain the same number of values.
All instances of an entity must be different from one another.
2NF
Only relevant when a composite candidate key (primary key or other candidate key) exists in the entity.
The entity must be in 1NF.
All attributes must be a fact about the entire key.
3NF
The entity must be in 2NF.
Non-key attributes cannot describe other non-key attributes.
BCNF
Only in rare cases 3NF entity does not satisfy BCNF (3NF table with two or more overlapping candidate keys may or may not be in BCNF).
Every determinant is a key.
If a table has no non-key attributes then it satisfy all previous normal forms up to BCNF.
4NF
The entity must be in BCNF.
Either every multivalued dependency X -->> Y is trivial or for every nontrivial multivalued dependency X -->> Y, X is a superkey (or in other words, no attribute can have multiple values issue in the entity).
5NF
The entity must be in 4NF.
Try to breakdown any ternary relationship into binary relationships.