0% found this document useful (0 votes)
286 views33 pages

Normalization 1st To 5th NF With Example

This document provides an overview of database management systems and the normalization process. It defines key concepts such as files, databases, database anomalies including update, insertion and deletion anomalies. It then explains the goals and various normal forms of normalization from 1st normal form to 5th normal form including 2nd, 3rd (Boyce-Codd normal form) and 4th normal forms. Examples are provided to illustrate each normal form.

Uploaded by

sjkushwaha21
Copyright
© Attribution Non-Commercial (BY-NC)
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PPTX, PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
286 views33 pages

Normalization 1st To 5th NF With Example

This document provides an overview of database management systems and the normalization process. It defines key concepts such as files, databases, database anomalies including update, insertion and deletion anomalies. It then explains the goals and various normal forms of normalization from 1st normal form to 5th normal form including 2nd, 3rd (Boyce-Codd normal form) and 4th normal forms. Examples are provided to illustrate each normal form.

Uploaded by

sjkushwaha21
Copyright
© Attribution Non-Commercial (BY-NC)
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PPTX, PDF, TXT or read online on Scribd
You are on page 1/ 33

Database Management Systems

Presented By:Deepshikha Jaiswal (2011MCA022) Sourabh Kumar Kushawha (2011MCA020)

File System
File: A collection of records or documents dealing with one organization, person, area or subject.
Manual (paper) files Computer files

Drawbacks of File Systems


Data Redundancy Data Redundancy means same information is duplicated in several files. Data Inconsistency That means different copies of same basic data are existing. This occurs as the result of update operations that are not updating the same data stored at different places. Difficulty in Accessing Data It is not easy to retrieve information using a conventional file processing system. Data Isolation Data are scattered in various files, and the files may be in different format, writing new application program to retrieve data is difficult. Continue

Continued
Integrity Problems The data values may need to satisfy some integrity constraints. For example the balance field Value must be grater than 5000. We have to handle this through program code in file processing systems. But in database we can declare the integrity constraints along with definition itself.

Concurrent Access anomalies If multiple users are updating the same data simultaneously it will result in inconsistent data state. In file processing system it is very difficult to handle this using program code. This results in concurrent access anomalies.
Security Problems Enforcing Security Constraints in file processing system is very difficult as the application programs are added to the system in an ad-hoc manner.

Database System
To overcome the drawbacks of File System we introduced the concept of Database Systems.

Definition:A Database is a collection of data that represents the part of real world or A collection of similar records with relationships between the records. It is usually managed by a program system called a Database Management Systems or DBMS. E.g.:- Bibliographic, statistical, business data, images, etc.

Database Anomalies
Database anomalies are the problems in relations that occur due to redundancy in the relations. These anomalies affect the process of inserting, deleting and modifying data in the relations. Some important data may be lost if a relations is updated that contains database anomalies. It is important to remove these anomalies in order to perform different processing on the relations without any problem. When an attempt is made to modify (update, insert into, or delete from) a table, undesired side-effects may follow. Not all tables can suffer from these side-effects; rather, the sideeffects can only arise in tables that have not been sufficiently normalized.

Types of Anomalies:
There are three types of Anomalies, which are: UPDATION ANOMALY: - Any change made to your
data will require you to scan all records to make the changes multiple time.

INSERTION ANOMALY: - When you insert a record


without having it stored on the related record.

DELETION ANOMALY: - When you delete some


information and lose valuable related information at the same time.

Example: - Update Anomaly

Each record in an "Employees' Skills" table might contain an Employee ID, Employee Address, and Skill; thus a change of address for a particular employee will potentially need to be applied to multiple records (one for each of his skills). If the update is not carried through successfullyif, that is, the employee's address is updated on some records but not others then the table is left in an inconsistent state. Specifically, the table provides conflicting answers to the question of what this particular employee's address is. This phenomenon is known as an update anomaly.

Example: - Insertion Anomaly

Each record in a "Faculty and Their Courses" table might contain a Faculty ID, Faculty Name, Faculty Hire Date, and Course Codethus we can record the details of any faculty member who teaches at least one course, but we cannot record the details of a newly hired faculty member who has not yet been assigned to teach any courses except by setting the Course Code to null. This phenomenon is known as an insertion anomaly.

Example: - Deletion Anomaly

Under certain circumstances, deletion of data representing certain facts necessitates deletion of data representing completely different facts. The "Faculty and Their Courses" table described in the previous example suffers from this type of anomaly, for if a faculty member temporarily ceases to be assigned to any courses, we must delete the last of the records on which that faculty member appears, effectively also deleting the faculty member. This phenomenon is known as a deletion anomaly.

WHAT IS NORMALISATION??
Normalization process has proposed by E. F. Codd in (1972). Normalisation is a technique used for designing relational database tables to minimise duplication of information. Data is normalised in order to redundancy and inconsistency, and to make it easier to maintain.

Normalization generally involves splitting existing tables into multiple ones, which must be re-joined or linked each time a query is issued.

Normalisation provides rules that help:


Organise data efficiently Ensure that only related data are stored in the table.

There are two goals of the normalization process:


Eliminate redundant data (for example, storing the same data in more than one table) and Ensure data dependencies make sense. Both of these are worthy goals as they reduce the amount of space a database consumes and ensure that data is logically stored.

The entire normalization process is based upon:-

The analysis of tables, Their keys (Primary, foreign, candidate, etc.),

Their schemas,
Their functional dependency.

Normal form is a state of a relation that result by decomposing that relation for a design to avoid redundancy. The data base community has developed a series of guidelines for ensuring that database are normalised. These are referred to as normal forms and numbered from one to five. In practical application we will often see 1NF, 2NF, 3NF and BCNF. Along with the occasionally 4NF and 5NF is very rarely seen.

1st

Normal Form

A relation scheme is said to be in 1NF if only one value is associated with each attribute and the value of that attribute is not a set of values or a list of values. Atomic data values; eliminates duplicate columns from the same table. Each row is uniquely identified; Needs a primary key, so each row can be unique. Each field name is also unique.

Example: -

COURSE DATABASE

STUDENT BOB JOE SUE TIM MARY

COURSE DATABASE DATABASE

STUDENT BOB JOE

MATH

DATABASE
MATH MATH

SUE
TIM MARY

2nd

Normal Form

In order to be in Second Normal Form, a relation must first fulfill the requirements to be in First Normal Form. Additionally, each non-key attribute in the relation must be functionally dependent upon the primary key. The rules for second normal form are: The table must already be in first normal form. Non key attributes must depend on every part of the primary key.

Example: Order # 1 2 3 4 Customer Acme Widgets ABC Corporation Acme Widgets Acme Widgets Contact Person John Doe Fred Flintstone John Doe John Doe Total $134.23 $521.24 $1042.42 $928.53

The relation is in First Normal Form, but not Second Normal Form:

Customer Acme Widgets ABC Corporation


Order # 1 2 3 4

Contact Person John Doe Fred Flintstone


Customer Total $134.23 $521.24 $1042.42 $928.53

Acme Widgets ABC Corporation Acme Widgets Acme Widgets

Comment
The creation of two separate tables eliminates the dependency problem experienced in the previous case. In the first table, contact person is dependent upon the primary key -- customer name. The second table only includes the information unique to each order.

Someone interested in the contact person for each order could obtain this information by performing a JOIN Operation.

Another Example of 2NF Normalization: Normalizing EMP_PROJ into 2NF relations.


EMP_PROJ SSN
FD1 FD2

PNUMBER

HOURS

ENAME

PNAME

PLOCATION

FD3

2NF NORMALIZATION
EP1
SSN
FD1

EP2
PNUMBER HOURS SSN
FD2

EP3
ENAME
PNUMBER FD3 PNAME PLOCATION

3rd

Normal Form

Any transitive dependencies have been removed in 3NF.

Meet all the requirements of the second normal form.


Remove columns that are not dependent upon the primary key.

Example: Normalizing EMP_DEPT into 3NF relations.


EMP_DEPT ENAME SSN BDATE ADDRESS DNUMBER DNAME DMGRSSN

3NF NORMALIZATION

ED1
ENAME SSN BDATE ADDRESS DNUMBER

ED2
DNUMBER DNAME DMGRSSN

BoyceCodd

Normal Form

A relation is in BCNF, if it is in 3NF and All of its determinants (i.e. The attributes upon which other attributes depends) are Candidate keys. OR To convert a 3NF into BCNF, decompose such, that every determinant becomes a candidate key.

Example: INTERVIEW TABLE


CANDIDATE_ID INT_DATE INT_TIME INTVR_ID ROOM_NO

Here, above CANDIDATE_ID, ROOM_NO and INTVR_ID all have the property of being Primary Key or can say all are Candidate Key into the INTERVIEW TABLE. So, as definition of BCNF, we will break it into two different tables, INTERVIEW TABLE and ROOM TABLE. INTERVIEW TABLE CANDIDATE_ID INT_DATE INT_TIME INTVR_ID

ROOM TABLE

INTVR_ID

INT_DATE ROOM_NO

4th

Normal Form

Multi-valued dependencies are removed, i.e. isolation of independent multiple relationships.


Meet all the requirements of the third normal form. A relation is in 4NF if it has no multi-valued dependencies.

Example: EMPLOYEE TABLE

EMPLOYEE
NAME
Alexis Alexis Alexis Alexis Mathews Mathews Mathews Mathews

PROJECT
Microsoft Oracle Microsoft Oracle Intel Sybase Intel Sybase

HOBBY
Reading Music Music Reading Movies Riding Riding Movies

Project and Hobby Table

PROJECT
NAME Alexis Alexis Mathews PROJECT Microsoft Oracle Intel

HOBBY
NAME Alexis Alexis Mathews Mathews HOBBY Reading Music Movies Riding

Mathews

Sybase

5th

Normal Form

Any remaining anomalies are removed. In this normal form we isolate semantically related multiple relationships.
The criteria of 5thNF is also known as PJNF( Project join normal form) and JPNF(Join Projection Normal Form).
The table must be in 4NF Their must be no non trivial joint dependency that do not follow from key constraints . The 4NF table is said to be 5NF if and only if every joint dependency in it is implied by the candidate keys.

Example:Dealers sell Product which can be manufactured by various companies. Dealers in order to sell the Product should be registered with the Company. So these three entities have a mutual relationship within them.
DEALERS JM Associate Shiv networks Star Sellers PRODUCT Sweets Shoes Magazine COMPANIES Cadbury Nike Times

Hari Publishers

Books

KM Publication

The above table shows some sample data. If you observe closely, a single record is created using lot of small information. For instance: JM Associate can sell sweets under the following two conditions: JM Associate should be an authorized dealer of Cadbury. Sweets should be manufactured by Cadbury company.
Continue

Continued

These two smaller bits of information form one record of the above given table. So in order for the above information to be Fifth Normal Form all the smaller information should be in three different places. Below is the complete fifth normal form of the database.
DEALERS JM Associate PRODUCT Sweets DEALERS JM Associates Shiv Networks COMPANIES Cadbury Nike

Shiv Networks
Star Sellers

Shoes
Magazine

Star Sellers
Hari Publishers

Times
KM Publications

Hari Publishers Books PRODUCT Sweets Shoes

COMPANIES Cadbury Nike

Magazine
Books

Times
KM Publications

You might also like