Normalization 1st To 5th NF With Example
File System
File: A collection of records or documents dealing with one organization, person, area or subject.
Files may be manual (paper) files or computer files.
Integrity Problems: Data values may need to satisfy integrity constraints; for example, the value of the balance field must be greater than 5000. In a file processing system this has to be enforced in program code, whereas in a database the integrity constraints can be declared along with the data definition itself.
Concurrent Access Anomalies: If multiple users update the same data simultaneously, the data can be left in an inconsistent state. In a file processing system this is very difficult to handle in program code, which results in concurrent access anomalies.
Security Problems: Enforcing security constraints in a file processing system is very difficult because application programs are added to the system in an ad hoc manner.
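The point about declaring integrity constraints with the definition can be sketched with Python's built-in sqlite3 module. This is only an illustration: the table name `account` and its columns are hypothetical, and the rule is the "balance greater than 5000" example from the text.

```python
import sqlite3

# Hypothetical table and column names, chosen only for illustration.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE account ("
    " acc_no INTEGER PRIMARY KEY,"
    " balance REAL NOT NULL CHECK (balance > 5000)"  # the rule from the text
    ")"
)

conn.execute("INSERT INTO account VALUES (1, 7500.0)")  # satisfies the rule

try:
    conn.execute("INSERT INTO account VALUES (2, 1200.0)")  # violates the CHECK
    rejected = False
except sqlite3.IntegrityError:
    rejected = True  # the DBMS refuses the row; no application check is needed

# Only the valid row was stored.
row_count = conn.execute("SELECT COUNT(*) FROM account").fetchone()[0]
```

The constraint lives in the table definition, so every program that writes to the table is subject to it without repeating the check in its own code.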
Database System
Database systems were introduced to overcome the drawbacks of the file system.
Definition: A database is a collection of data that represents part of the real world, or a collection of similar records with relationships between the records. It is usually managed by a program system called a Database Management System (DBMS). E.g. bibliographic, statistical and business data, images, etc.
Database Anomalies
Database anomalies are problems in relations that arise from redundancy in those relations. They affect the process of inserting, deleting and modifying data. Important data may be lost if a relation that contains anomalies is updated, so the anomalies must be removed before the relations can be processed without problems. When an attempt is made to modify (update, insert into, or delete from) a table, undesired side effects may follow. Not all tables suffer from these side effects; they can arise only in tables that have not been sufficiently normalized.
Types of Anomalies:
There are three types of anomalies: update, insertion and deletion. UPDATE ANOMALY: a change to one fact has to be made in every record that repeats it, so all records must be scanned and the change applied multiple times.
Each record in an "Employees' Skills" table might contain an Employee ID, Employee Address, and Skill; thus a change of address for a particular employee will potentially need to be applied to multiple records (one for each of his skills). If the update is not carried through successfully, that is, if the employee's address is updated on some records but not others, then the table is left in an inconsistent state. Specifically, the table provides conflicting answers to the question of what this particular employee's address is. This phenomenon is known as an update anomaly.
Each record in a "Faculty and Their Courses" table might contain a Faculty ID, Faculty Name, Faculty Hire Date, and Course Code; thus we can record the details of any faculty member who teaches at least one course, but we cannot record the details of a newly hired faculty member who has not yet been assigned to teach any courses except by setting the Course Code to null. This phenomenon is known as an insertion anomaly.
Under certain circumstances, deletion of data representing certain facts necessitates deletion of data representing completely different facts. The "Faculty and Their Courses" table described in the previous example suffers from this type of anomaly, for if a faculty member temporarily ceases to be assigned to any courses, we must delete the last of the records on which that faculty member appears, effectively also deleting the faculty member. This phenomenon is known as a deletion anomaly.
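The three anomalies described above can be reproduced on small in-memory tables. The following is a minimal sketch; every row (IDs, names, addresses, course codes) is hypothetical sample data invented for illustration.

```python
# Hypothetical sample rows: (employee_id, address, skill).
employee_skills = [
    (426, "87 Sycamore Grove", "Typing"),
    (426, "87 Sycamore Grove", "Shorthand"),
    (519, "94 Chestnut Street", "Public Speaking"),
]

def addresses_of(rows, emp_id):
    """Return the set of distinct addresses recorded for one employee."""
    return {address for (eid, address, _skill) in rows if eid == emp_id}

# Update anomaly: the change reaches only the first matching record,
# so the table now gives two answers for employee 426's address.
partially_updated = list(employee_skills)
partially_updated[0] = (426, "12 Elm Road", "Typing")
conflicting = addresses_of(partially_updated, 426)

# Hypothetical sample rows: (faculty_id, name, hire_date, course_code).
faculty_courses = [
    (389, "Dr. Giddens", "2001-02-10", "ENG-206"),
    (407, "Dr. Saperstein", "1999-04-19", "CMP-101"),
    (407, "Dr. Saperstein", "1999-04-19", "CMP-201"),
]

# Insertion anomaly: a new hire with no course can be stored only by
# putting a null (None) in the course code.
with_new_hire = faculty_courses + [(424, "Dr. Newsome", "2007-03-29", None)]

# Deletion anomaly: removing faculty member 389's only course assignment
# also removes every trace of that faculty member.
after_unassign = [
    row for row in faculty_courses
    if not (row[0] == 389 and row[3] == "ENG-206")
]
known_faculty = {row[0] for row in after_unassign}
```

After the partial update, `conflicting` holds two addresses for the same employee, and after the deletion, 389 no longer appears in `known_faculty`.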
What Is Normalization?
The normalization process was proposed by E. F. Codd (1972). Normalization is a technique for designing relational database tables so as to minimize duplication of information. Data is normalized in order to reduce redundancy and inconsistency, and to make it easier to maintain.
Normalization generally involves splitting existing tables into multiple ones, which must be re-joined or linked whenever a query needs data from more than one of them.
Relations are normalized on the basis of their schemas and their functional dependencies.
A normal form is a state of a relation that results from decomposing it so that the design avoids redundancy. The database community has developed a series of guidelines for ensuring that databases are normalized. These are referred to as normal forms and are numbered from one to five. In practical applications we often see 1NF, 2NF, 3NF and BCNF, occasionally 4NF, and very rarely 5NF.
1st Normal Form
A relation scheme is in 1NF if each attribute holds exactly one value per row, never a set or a list of values. In practice this means: data values are atomic; there are no duplicate columns in the same table; each row is uniquely identified by a primary key; and each field name is unique.
Example: a table with a COURSE column (values DATABASE and MATH) and the names SUE, TIM and MARY.
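The 1NF rule can be sketched in a few lines of Python. The names and course values come from the example, but which person takes which course is an assumption made here for illustration, as is the function name `to_first_normal_form`.

```python
# Assumed pairing of names and courses; each course list is a non-atomic value.
unnormalized = [
    ("SUE", ["DATABASE", "MATH"]),
    ("TIM", ["MATH"]),
    ("MARY", ["DATABASE", "MATH"]),
]

def to_first_normal_form(rows):
    """Emit one (name, course) row per value so that every field is atomic."""
    return [(name, course) for name, courses in rows for course in courses]

atomic_rows = to_first_normal_form(unnormalized)

# The (name, course) pair identifies each row, so it can act as the primary key.
is_unique = len(set(atomic_rows)) == len(atomic_rows)
```

Each field in `atomic_rows` now holds a single value, and no two rows are identical.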
2nd Normal Form
In order to be in Second Normal Form, a relation must first fulfill the requirements of First Normal Form. Additionally, each non-key attribute must be fully functionally dependent on the primary key, that is, on the whole key and not just part of it. The rules for second normal form are: the table must already be in first normal form; non-key attributes must depend on every part of the primary key.
Example:
Order #   Customer          Contact Person    Total
1         Acme Widgets      John Doe          $134.23
2         ABC Corporation   Fred Flintstone   $521.24
3         Acme Widgets      John Doe          $1042.42
4         Acme Widgets      John Doe          $928.53
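The relation is in First Normal Form, but not Second Normal Form: Contact Person depends only on Customer, so it is repeated on every order that customer places. (Strictly, this is a partial dependency only if Customer is treated as part of the key; with Order # alone as the key it is a transitive dependency, which Third Normal Form addresses.) The remedy is to split the relation into two tables, one keyed by Customer and one holding the data for each order.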
Comment
The creation of two separate tables eliminates the dependency problem of the previous case. In the first table, contact person depends on the primary key, customer name. The second table includes only the information unique to each order.
Someone interested in the contact person for each order could obtain this information by performing a JOIN Operation.
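The two-table design and the JOIN can be sketched with Python's built-in sqlite3 module. The data is the order table from the example. The table and column names (`customers`, `orders`, `order_no`, and so on) are hypothetical, and the sketch assumes that each order row keeps the customer name as a foreign key so that the two tables can be linked.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customers (
    customer       TEXT PRIMARY KEY,
    contact_person TEXT NOT NULL
);
CREATE TABLE orders (
    order_no INTEGER PRIMARY KEY,
    customer TEXT NOT NULL REFERENCES customers(customer),
    total    REAL NOT NULL
);
""")

# Each contact person is now stored once per customer, not once per order.
conn.executemany("INSERT INTO customers VALUES (?, ?)", [
    ("Acme Widgets", "John Doe"),
    ("ABC Corporation", "Fred Flintstone"),
])
conn.executemany("INSERT INTO orders VALUES (?, ?, ?)", [
    (1, "Acme Widgets", 134.23),
    (2, "ABC Corporation", 521.24),
    (3, "Acme Widgets", 1042.42),
    (4, "Acme Widgets", 928.53),
])

# JOIN the tables to recover the contact person for each order.
contacts_by_order = conn.execute("""
    SELECT o.order_no, c.contact_person
    FROM orders AS o
    JOIN customers AS c ON c.customer = o.customer
    ORDER BY o.order_no
""").fetchall()
```

A change of contact person for Acme Widgets now touches one row in `customers` instead of three order rows.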
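2NF NORMALIZATION
Original relation: (SSN, PNUMBER, HOURS, ENAME, PNAME, PLOCATION)
FD1: SSN, PNUMBER -> HOURS
FD2: SSN -> ENAME
FD3: PNUMBER -> PNAME, PLOCATION
Decomposed relations:
EP1 (SSN, PNUMBER, HOURS), from FD1
EP2 (SSN, ENAME), from FD2
EP3 (PNUMBER, PNAME, PLOCATION), from FD3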
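The dependencies FD1 to FD3 and the split into EP1, EP2 and EP3 can be checked on sample data. This is a sketch only: the rows below are hypothetical, and the helper names `holds` and `project` and the variable name `emp_proj` are introduced here for illustration.

```python
def holds(rows, lhs, rhs):
    """Check whether the functional dependency lhs -> rhs holds in rows."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        value = tuple(row[a] for a in rhs)
        # The same left-hand side must always map to the same right-hand side.
        if seen.setdefault(key, value) != value:
            return False
    return True

def project(rows, attrs):
    """Project rows onto attrs, dropping duplicate tuples."""
    return sorted({tuple(row[a] for a in attrs) for row in rows})

# Hypothetical sample data for the original (unsplit) relation.
emp_proj = [
    {"SSN": "111", "PNUMBER": 1, "HOURS": 32.5, "ENAME": "Smith",
     "PNAME": "ProductX", "PLOCATION": "Bellaire"},
    {"SSN": "111", "PNUMBER": 2, "HOURS": 7.5, "ENAME": "Smith",
     "PNAME": "ProductY", "PLOCATION": "Sugarland"},
    {"SSN": "222", "PNUMBER": 2, "HOURS": 20.0, "ENAME": "Wong",
     "PNAME": "ProductY", "PLOCATION": "Sugarland"},
]

fd1 = holds(emp_proj, ["SSN", "PNUMBER"], ["HOURS"])
fd2 = holds(emp_proj, ["SSN"], ["ENAME"])              # depends on part of the key
fd3 = holds(emp_proj, ["PNUMBER"], ["PNAME", "PLOCATION"])  # likewise
ssn_alone_gives_hours = holds(emp_proj, ["SSN"], ["HOURS"])  # needs the whole key

# The 2NF decomposition is a set of projections, one per dependency.
ep1 = project(emp_proj, ["SSN", "PNUMBER", "HOURS"])
ep2 = project(emp_proj, ["SSN", "ENAME"])
ep3 = project(emp_proj, ["PNUMBER", "PNAME", "PLOCATION"])
```

In the projections, each employee name and each project description is stored once instead of once per assignment.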
3rd Normal Form
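A relation is in 3NF if it is in 2NF and no non-key attribute depends transitively on the primary key.
3NF NORMALIZATION
ED1 (ENAME, SSN, BDATE, ADDRESS, DNUMBER)
ED2 (DNUMBER, DNAME, DMGRSSN)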
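A transitive dependency can be detected with an attribute-closure computation. The sketch below assumes two dependencies that the slide does not state explicitly: SSN determines the employee attributes including DNUMBER, and DNUMBER determines DNAME and DMGRSSN. The function name `closure` is introduced here for illustration.

```python
def closure(attrs, fds):
    """Return every attribute determined by attrs under the given FDs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # Apply an FD when its whole left side is already reachable.
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result

# Assumed dependencies for the unsplit relation (not stated in the text).
fds = [
    (frozenset({"SSN"}), frozenset({"ENAME", "BDATE", "ADDRESS", "DNUMBER"})),
    (frozenset({"DNUMBER"}), frozenset({"DNAME", "DMGRSSN"})),
]

reached_from_ssn = closure({"SSN"}, fds)
direct_from_ssn = fds[0][1]

# Attributes that SSN reaches only by going through DNUMBER.
transitive = reached_from_ssn - direct_from_ssn - {"SSN"}
```

Under these assumptions `transitive` contains DNAME and DMGRSSN, which are exactly the attributes moved into ED2.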
Boyce-Codd Normal Form
A relation is in BCNF if it is in 3NF and all of its determinants (i.e. the attributes on which other attributes depend) are candidate keys. To convert a 3NF relation into BCNF, decompose it so that every determinant becomes a candidate key.
In the INTERVIEW table, CANDIDATE_ID, ROOM_NO and INTVR_ID each form part of a candidate key. ROOM_NO, however, is determined by INTVR_ID and INT_DATE, which together are not a candidate key. So, by the definition of BCNF, we break the table into two different tables, INTERVIEW and ROOM.
INTERVIEW TABLE (CANDIDATE_ID, INT_DATE, INT_TIME, INTVR_ID)
ROOM TABLE (INTVR_ID, INT_DATE, ROOM_NO)
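The BCNF test (every determinant must be a candidate key, or more generally a superkey) can be sketched with attribute closures. The functional dependencies below are assumptions about the interview example, not something the slides state, and the check is simplified: it looks only at the listed dependencies rather than at every dependency implied by them.

```python
def closure(attrs, fds):
    """Return every attribute determined by attrs under the given FDs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result

def bcnf_violations(relation, fds):
    """List determinants that apply inside relation but are not superkeys of it."""
    bad = []
    for lhs, rhs in fds:
        # The FD matters here if its left side and part of its right side
        # both lie inside this relation.
        applies = lhs <= relation and bool((rhs & relation) - lhs)
        if applies and not relation <= closure(lhs, fds):
            bad.append(set(lhs))
    return bad

def fd(lhs, rhs):
    return (frozenset(lhs), frozenset(rhs))

# Assumed dependencies for the unsplit interview table.
FDS = [
    fd({"CANDIDATE_ID", "INT_DATE"}, {"INT_TIME", "INTVR_ID", "ROOM_NO"}),
    fd({"INTVR_ID", "INT_DATE", "INT_TIME"}, {"CANDIDATE_ID"}),
    fd({"ROOM_NO", "INT_DATE", "INT_TIME"}, {"INTVR_ID", "CANDIDATE_ID"}),
    fd({"INTVR_ID", "INT_DATE"}, {"ROOM_NO"}),
]

unsplit = {"CANDIDATE_ID", "INT_DATE", "INT_TIME", "INTVR_ID", "ROOM_NO"}
interview = {"CANDIDATE_ID", "INT_DATE", "INT_TIME", "INTVR_ID"}
room = {"INTVR_ID", "INT_DATE", "ROOM_NO"}

before = bcnf_violations(unsplit, FDS)
after = bcnf_violations(interview, FDS) + bcnf_violations(room, FDS)
```

Under these assumptions the unsplit table has one offending determinant, (INTVR_ID, INT_DATE), and the two decomposed tables have none.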
4th Normal Form
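A relation is in 4NF if it is in BCNF and contains no non-trivial multivalued dependencies other than those on a candidate key. In the EMPLOYEE table, an employee's projects and hobbies are independent of each other, so every project has to be paired with every hobby.
EMPLOYEE
NAME      PROJECT     HOBBY
Alexis    Microsoft   Reading
Alexis    Oracle      Music
Alexis    Microsoft   Music
Alexis    Oracle      Reading
Mathews   Intel       Movies
Mathews   Sybase      Riding
Mathews   Intel       Riding
Mathews   Sybase      Movies
The table is split into two tables, PROJECT and HOBBY.
PROJECT
NAME      PROJECT
Alexis    Microsoft
Alexis    Oracle
Mathews   Intel
Mathews   Sybase
HOBBY
NAME      HOBBY
Alexis    Reading
Alexis    Music
Mathews   Movies
Mathews   Riding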
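The split of EMPLOYEE into PROJECT and HOBBY can be checked by projecting the eight rows onto the two smaller tables and joining them back on NAME. This is a minimal sketch using the data from the example; the variable names are introduced here for illustration.

```python
# The eight EMPLOYEE rows: (name, project, hobby).
employee = [
    ("Alexis", "Microsoft", "Reading"),
    ("Alexis", "Oracle", "Music"),
    ("Alexis", "Microsoft", "Music"),
    ("Alexis", "Oracle", "Reading"),
    ("Mathews", "Intel", "Movies"),
    ("Mathews", "Sybase", "Riding"),
    ("Mathews", "Intel", "Riding"),
    ("Mathews", "Sybase", "Movies"),
]

# Project onto (name, project) and (name, hobby), dropping duplicates.
project_table = sorted({(name, proj) for name, proj, _hobby in employee})
hobby_table = sorted({(name, hobby) for name, _proj, hobby in employee})

# Natural join of the two projections on name.
rejoined = sorted(
    (name, proj, hobby)
    for name, proj in project_table
    for other_name, hobby in hobby_table
    if name == other_name
)

# The join returns exactly the original rows, so nothing was lost.
lossless = rejoined == sorted(employee)
```

The two projections hold four rows each, and adding a hobby for an employee now takes one new row instead of one row per project.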
5th Normal Form
Any remaining anomalies are removed. In this normal form we isolate semantically related multiple relationships.
5NF is also known as PJNF (Project-Join Normal Form) or JPNF (Join-Projection Normal Form).
The table must be in 4NF. There must be no non-trivial join dependency that does not follow from the key constraints. A 4NF table is in 5NF if and only if every join dependency in it is implied by the candidate keys.
Example: Dealers sell products, which can be manufactured by various companies. To sell a product, a dealer must be registered with the company that makes it. So these three entities have a mutual relationship among them.
DEALERS           PRODUCT    COMPANIES
JM Associate      Sweets     Cadbury
Shiv Networks     Shoes      Nike
Star Sellers      Magazine   Times
Hari Publishers   Books      KM Publications
The above table shows some sample data. If you look closely, each record is built from several smaller pieces of information. For instance, JM Associate can sell sweets only if two conditions hold: JM Associate is an authorized dealer of Cadbury, and sweets are manufactured by Cadbury.
These two smaller bits of information form one record of the table given above. So, for the above information to be in Fifth Normal Form, the smaller pieces of information should be stored in three different places. Below is the complete fifth normal form of the database.
DEALERS           PRODUCT
JM Associate      Sweets
Shiv Networks     Shoes
Star Sellers      Magazine
Hari Publishers   Books

DEALERS           COMPANIES
JM Associate      Cadbury
Shiv Networks     Nike
Star Sellers      Times
Hari Publishers   KM Publications

PRODUCT    COMPANIES
Sweets     Cadbury
Shoes      Nike
Magazine   Times
Books      KM Publications
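The three-way split can be verified by joining the three pair tables back together: a (dealer, product, company) row is rebuilt only when all three pairs agree. This is a minimal sketch using the sample data from the example; the variable names are introduced here for illustration.

```python
# The original three-column table: (dealer, product, company).
supply = [
    ("JM Associate", "Sweets", "Cadbury"),
    ("Shiv Networks", "Shoes", "Nike"),
    ("Star Sellers", "Magazine", "Times"),
    ("Hari Publishers", "Books", "KM Publications"),
]

# The three binary projections that make up the 5NF design.
dealer_product = {(d, p) for d, p, _c in supply}
dealer_company = {(d, c) for d, _p, c in supply}
product_company = {(p, c) for _d, p, c in supply}

# Join all three: keep a combination only if every pair is recorded.
rejoined = {
    (d, p, c)
    for d, p in dealer_product
    for other_d, c in dealer_company
    if d == other_d and (p, c) in product_company
}

# The join of the three projections reproduces the original table.
lossless = rejoined == set(supply)
```

Each fact (who sells what, who is registered with whom, who makes what) is stored in exactly one of the three tables.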