Lecture Notes DBMS

The document discusses databases, what data is, why databases are needed, how data is organized into tables, and who uses databases. Some key points: - A database is a place where all types of data are stored, including information about students, courses, colors, heights, weights, etc. - Databases are needed to store and manipulate related data in an organized way, allowing users to easily record, access, update, and analyze the data. - Data is organized into logical groupings called tables. Related data like students, teachers, classes are stored separately but can be linked together. - End users of databases include students, teachers, and others who rely on the stored data for their

Uploaded by

Romeo Balingao
Copyright
© All Rights Reserved

Database

The name itself indicates what it is. A database is a place or container where data is stored. But what
is data? In a database, even the smallest piece of information is data. For example, a student, a
course, a color, a height, a weight, and a food item are all data. In short, facts about any living or
non-living object in this world can be treated as data.

Why do we need data?


We need data so that we can perform various actions on it. Say we do not have a database and we want
to record the height and weight of a baby over a year. We would note it on a piece of paper every
month. At the end of the period or year, we would check whether the baby is growing correctly. If
some entry is wrong or irrelevant, we correct it or strike it off. The same is done using a database.
We store all this information in the database. If we want to check the growth, we pull the
information from the database; if we need to change any information, we update or delete it. But all
the data is in one place: the database.

In a database, we group only related data together and store it under one group name called a table.
This helps in identifying which data is stored where and under what name. It reduces the time needed
to search for particular data in the whole database. For example, Student, Teacher, Class, Subject,
Employee, and Department each form an individual table.

And for whom is this data stored?


We store only related data, that is, data related to one particular requirement or application. For
example, a Student database will have all the information about the students studying in a particular
college, ranging from ID, name, date of birth, and class to grades and prizes.

How do we determine which data is relevant to be put in a particular database?

It all depends on what database we are developing and what its exact requirement or purpose is. Say we
need to create a College database. What could a college database contain? First, we need to store
college information like its name and address. Next come the courses offered in that college, the
staff and their details, and the students and their details. But do we store all this information in
one table, College? Will the database be quick in getting or updating the data? Certainly not! It
would become chaos if everything were stored in a single table. Hence certain rules were introduced
to manage the database, and they are implemented by a relational database management system (RDBMS).
An RDBMS is a program that lets us create and maintain a database. It guides us in dividing related
information into different tables and inter-relating them so that we can select, insert, update, and
delete all the related data easily and efficiently.
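
The idea of dividing information into related tables can be sketched in a few lines. The example below
is only an illustration, assuming Python's built-in sqlite3 module as the RDBMS; the table names
(course, student), their columns, and the sample rows are hypothetical and do not come from these
notes.

```python
import sqlite3

# In-memory database, so the sketch leaves nothing behind on disk.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")

# Hypothetical tables: each kind of information gets its own table,
# and student.course_id links every student to one course.
conn.executescript("""
CREATE TABLE course (
    course_id INTEGER PRIMARY KEY,
    name      TEXT NOT NULL
);
CREATE TABLE student (
    student_id INTEGER PRIMARY KEY,
    name       TEXT NOT NULL,
    course_id  INTEGER NOT NULL REFERENCES course(course_id)
);
""")

conn.execute("INSERT INTO course VALUES (1, 'Physics')")
conn.execute("INSERT INTO course VALUES (2, 'Chemistry')")
conn.executemany(
    "INSERT INTO student VALUES (?, ?, ?)",
    [(100, "Asha", 1), (101, "Ben", 2), (102, "Chen", 1)],
)
conn.commit()


def students_in_course(course_name):
    """Return the names of students enrolled in the named course, via a join."""
    rows = conn.execute(
        """
        SELECT s.name
        FROM student AS s
        JOIN course AS c ON c.course_id = s.course_id
        WHERE c.name = ?
        ORDER BY s.name
        """,
        (course_name,),
    ).fetchall()
    return [name for (name,) in rows]


print(students_in_course("Physics"))
```

Because the two tables are linked by course_id, one query can combine them; neither table has to
repeat the other's details.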

Characteristics of a good database are:

1. It should be able to store all kinds of data that exist in the real world. Since we need to
work with all kinds of data and requirements, the database should be strong enough to store
all kinds of data present around us.
2. It should be able to relate the entities / tables in the database by means of a relation, i.e.,
tables should be related to each other. Let us say an employee works for a department. This
implies that Employee is related to a particular department. We should be able to define
such a relationship between any two entities in the database. Ideally, no table should be
left without any mapping.
3. Data and the applications that use it should be isolated. The database system provides the
platform for storing the data, while the data is what that system operates on. Hence
there should be a clear separation between them.
4. There should not be any duplication of data in the database. Data should be stored in
such a way that it is not repeated in multiple tables. If repeated, it is an
unnecessary waste of DB space, and maintaining such data becomes chaotic.
5. The DBMS has a strong query language. Once the database is designed, this helps the user to
retrieve and manipulate the data. If a user wants to see specific data, he
can apply as many filtering conditions as he wants and pull only the data that he needs.
6. Multiple users should be able to access the same database without affecting each other,
i.e., if several teachers want to update a student's marks in the Results table at the same time,
each should be allowed to update the marks for their own subject without modifying the marks
of other subjects. A good database should support this feature.
7. It supports multiple views for users, depending on their role. In a school database,
students will be able to see only their own reports, and their access will be read only. At the
same time, teachers will have access to all the students, with modification rights. But
the database is the same. Hence a single database provides different views to different
users.
8. The database should also provide security, i.e., when multiple users are accessing
the database, each user has their own level of rights to see the data. Some of
them will be allowed to see the whole database, and some will have only partial rights. For
example, an instructor who teaches Physics will have access to see and update the marks of
his subject. He will not have access to other subjects. But the HOD will have full access
to all the subjects.
9. The database should also support the ACID properties (atomicity, consistency, isolation, and
durability), i.e., while performing transactions such as insert, update, and delete, the
database makes sure that the data stays correct. For example, if a student's address is
updated, it should make sure that no duplicate data is created and that there is no data
mismatch for that student.
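
The "all or nothing" part of the ACID properties can be sketched as follows. This is a minimal
illustration, assuming Python's built-in sqlite3 module; the tables marks and total and the function
names are hypothetical. Adding a mark and updating the running total are treated as one transaction,
so a failure in either step undoes both.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE marks (
    student_id INTEGER NOT NULL,
    subject    TEXT    NOT NULL,
    score      INTEGER NOT NULL CHECK (score BETWEEN 0 AND 100),
    PRIMARY KEY (student_id, subject)
);
CREATE TABLE total (
    student_id INTEGER PRIMARY KEY,
    total      INTEGER NOT NULL
);
INSERT INTO total VALUES (100, 0);
""")


def record_mark(student_id, subject, score):
    """Update the total and add the mark as a single transaction."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute(
                "UPDATE total SET total = total + ? WHERE student_id = ?",
                (score, student_id),
            )
            # Fails for a duplicate (student, subject) pair or an invalid score;
            # the UPDATE above is then rolled back as well.
            conn.execute(
                "INSERT INTO marks VALUES (?, ?, ?)",
                (student_id, subject, score),
            )
        return True
    except sqlite3.IntegrityError:
        return False


def current_total(student_id):
    """Return the stored total for one student."""
    row = conn.execute(
        "SELECT total FROM total WHERE student_id = ?", (student_id,)
    ).fetchone()
    return row[0]


print(record_mark(100, "Physics", 80))  # first entry for Physics
print(record_mark(100, "Physics", 70))  # duplicate entry, rejected as a whole
print(current_total(100))
```

The second call changes the total first and only then hits the duplicate, yet the stored total still
reflects the first mark alone, because the whole transaction is rolled back.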

Now that we know what a database is, who are its users? Of course, developers use the database
to design and develop applications. Who else? There is an administrator, who monitors how the
database is used and who is accessing it, grants access to other users, restricts what each
user may do, and performs any other maintenance work on the database. And there is one more
group: end users. End users are the people who actually use the database and benefit from it.
In a School database, teachers and students are the end users, who use the database for their
daily needs.

Where are databases used? Everywhere! Nowadays, databases are used in almost every place. We can
see databases in use in supermarkets, stock exchanges, colleges, libraries, ATMs, offices, banks,
hospitals, etc.

Data Independence

A database consists of different objects like schemas, tables, views, constants, cursors, procedures,
functions, packages, synonyms, etc. They have their own specifications, tasks, and value in the
database. But they all differ from what we see on the monitor, i.e., what we see on the monitor is a
user-friendly display of the data. The actual structure and data are stored in a different way.

There is storage information about the data and the object structures. There is basic information about
the objects, like their names, the columns in them, the total number of records, their indexes and
constraints, the mapping between the tables, the functions/procedures used in packages, etc. And there
are the exact values of each record, which are shown to the user. Each of these kinds of information is
different from the others. Let us look at all of them.

Each data value and the structure details of the database objects are stored on magnetic tapes,
magnetic disks, optical disks, etc. This is the basic storage information of any computer. It is
called physical storage information and is the lowest level of information. It is the level least
known to programmers. This is called the physical level of data.

Information like table/view names, their columns, the indexes and constraints on them, and the mapping
between the tables forms the next level of information related to the database. This information
defines the structure of the objects in the database. It is called the logical level of data. The
developer and the DBA have knowledge of this data.

The user gets to see only the data stored in the database. They see either the whole set of data values
or some specific records. They do not have any information about how the data is stored, what
datatypes it has, how many records there are, etc. This level of abstraction is called the view level.

In a STUDENT table example, the records of each student that the user sees are view-level
information. The columns, their datatypes, their mapping, and constraints like primary keys and foreign
keys are logical-level information. The actual structure of the table and its data as stored on
the server's storage devices are physical-level information.

Physical level of abstraction is the lowest level of abstraction and view level of abstraction is the
highest level of abstraction. Based on these levels of abstraction, we have two types of data
independence.

Suppose there is a change in the storage size of the database servers. This will not affect the logical
structure of any of the objects in the database. They are completely independent of the physical
structure. This is called physical data independence.

Changes to the database objects, like changes to table structure or size, or the addition or removal of
columns that the user views do not rely on, will not affect those views. Users will see the data as
before. This is called logical data independence.
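
Logical data independence can be tried out on a small scale. The sketch below assumes Python's
built-in sqlite3 module; the table student, the view student_list, and the added column are
hypothetical names chosen for illustration. The user's view is queried before and after a change to
the underlying table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE student (
    student_id INTEGER PRIMARY KEY,
    name       TEXT NOT NULL,
    class_name TEXT NOT NULL
);
INSERT INTO student VALUES (100, 'Asha', '10A');
INSERT INTO student VALUES (101, 'Ben', '10B');

-- What the end user is given access to (view level).
CREATE VIEW student_list AS
    SELECT student_id, name FROM student;
""")


def user_view():
    """Return what an end user sees through the view."""
    return conn.execute(
        "SELECT * FROM student_list ORDER BY student_id"
    ).fetchall()


before = user_view()

# A change at the logical level: the table gains a new column.
conn.execute("ALTER TABLE student ADD COLUMN date_of_birth TEXT")

after = user_view()
print(before == after)
```

The view names only the columns it needs, so the extra column in the table does not change what the
user sees.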

With these two types of data independence, the physical, logical, and view levels are isolated from
one another. This helps in reducing the time and cost incurred by changes at any one level of the
database. Hence the main purpose of a database, to provide an abstract view of data, is achieved.

Components of a database include


User: - Users are the ones who actually use the database. Users can be administrators, developers, or
end users.

Data or Database: - As we discussed already, data is one of the most important parts of a database. A
very large amount of data is stored in the database, and it is the main source through which all the
other components interact with each other. There are two types of data. One is user data. This is
the data the database exists to hold, i.e., based on the requirement, the data is stored
in the various tables of the database in the form of rows and columns. The other is metadata. It is
known as data about data, i.e., it stores information like how many tables there are, their names, how
many columns they have and their names, primary keys, foreign keys, etc. Basically, metadata holds
information about each table and its constraints in the database.
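
To see the difference between user data and metadata, the sketch below asks a database about its own
structure. It assumes Python's built-in sqlite3 module, where the catalog table sqlite_master and the
table_info and foreign_key_list pragmas expose metadata; other DBMS products have their own
catalogs. The tables department and employee are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE department (
    dept_id INTEGER PRIMARY KEY,
    name    TEXT NOT NULL
);
CREATE TABLE employee (
    emp_id  INTEGER PRIMARY KEY,
    name    TEXT NOT NULL,
    dept_id INTEGER REFERENCES department(dept_id)
);
""")


def table_names():
    """List the tables recorded in SQLite's catalog."""
    rows = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    ).fetchall()
    return [name for (name,) in rows]


def column_names(table):
    """List a table's columns; pass only trusted table names."""
    # Each row is (cid, name, type, notnull, dflt_value, pk).
    rows = conn.execute(f"PRAGMA table_info({table})").fetchall()
    return [row[1] for row in rows]


def foreign_keys(table):
    """Return (column, referenced table, referenced column) for each foreign key."""
    # Each row is (id, seq, table, from, to, on_update, on_delete, match).
    rows = conn.execute(f"PRAGMA foreign_key_list({table})").fetchall()
    return [(row[3], row[2], row[4]) for row in rows]


print(table_names())
print(column_names("employee"))
print(foreign_keys("employee"))
```

No employee or department rows were inserted here; everything printed is metadata, that is,
information about the tables rather than the user data in them.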

DBMS: - This is the software that helps the user to interact with the database. It allows users to
insert, delete, update, or retrieve data. All these operations are expressed in a query language
such as SQL. MySQL, Oracle, etc. are examples of DBMS software.

Database Application: - It is the application program that helps users to interact with the
database by means of a query language. The database application does not need to know the details of
the underlying DBMS.

Database architecture can be 2-tier or 3-tier, based on how users are connected to the
database to get their requests done. They can either connect directly to the database, or their
request is received by an intermediary layer, which processes the request and then sends it to the
database.

2-tier Architecture
In 2-tier architecture, the application program interacts directly with the database. There is no
intermediate layer between the application and the database. Imagine a front-end application for a
school, where we need to display the reports of all the students who have opted for different subjects.
In this case, the application interacts directly with the database and retrieves all the required data.
Here no inputs from the user are required. This is a 2-tier architecture.

Let us consider another example of two-tier architecture. Consider a railway ticket reservation
system. How does it work? Imagine a person is reserving a ticket from Delhi to Goa on a particular
day. At the same time, another person somewhere else in Delhi is also reserving a ticket to Goa
on the same day for the same train. Now there is a requirement for two tickets, for two different
persons. What will the reservation system do? It takes the requests from both of them and queues
them. Each request is entered at the application layer and sent to the database layer. Once the
request is processed in the database, the result is sent back to the application layer for the user.

Advantages of 2-tier Architecture


Easy to understand, as the application communicates directly with the database.
Requested data can be retrieved very quickly when there are few users.
Easy to modify: any required change can be sent directly to the database.
Easy to maintain. When there are multiple requests, they are handled in a queue, so there
is no chaos.

Disadvantages of 2-tier architecture:


It is time consuming when there is a huge number of users. All the requests are
queued and handled one after another. Hence it cannot respond to multiple users at the
same time.
This architecture is not very cost-effective.

3-tier architecture is the most widely used database architecture. It can be viewed as below.

Presentation layer / User layer is the layer where the user uses the database. He does not
have any knowledge of the underlying database. He simply interacts with the database as
though he had all the data in front of him. You can imagine this layer as a registration form
where you enter your details. Did you ever wonder where the data goes after you press the
submit button? No, right? You just know that your details are saved. This is
the presentation layer, where all the details from the user are taken and sent to the next layer
for processing.
Application layer is the underlying program which is responsible for saving the details that
you have entered and for retrieving your details to show on the page. This layer holds all the
business logic, like validation, calculations, and manipulation of data, and then sends
requests to the database to get the actual data. If this layer sees that a request is invalid, it
sends a message back to the presentation layer. It does not hit the database layer at all.
Data layer or Database layer is the layer where the actual database resides. In this layer, all
the tables, their mappings, and the actual data are present. When you save your details from
the front end, they are inserted into the respective tables in the database layer by
the programs in the application layer. When you want to view your details in the web
browser, a request is sent to the database layer by the application layer. The database layer runs
the queries and gets the data. The data is then transferred to the browser (presentation
layer) by the programs in the application layer.
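
The division of work between the three layers can be sketched as three groups of functions. This is
only an illustration, assuming Python's built-in sqlite3 module for the data layer; the
registration table and every function name are hypothetical. In a real system the layers would
normally run as separate programs, often on separate machines.

```python
import sqlite3

# --- Data layer: the only code that talks to the database ---
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE registration (name TEXT NOT NULL, age INTEGER NOT NULL)"
)


def save_registration(name, age):
    """Insert one validated registration."""
    with conn:
        conn.execute("INSERT INTO registration VALUES (?, ?)", (name, age))


def count_registrations():
    """Return how many registrations are stored."""
    return conn.execute("SELECT COUNT(*) FROM registration").fetchone()[0]


# --- Application layer: validation and business logic ---
def register(form):
    """Validate the form; only valid requests reach the data layer."""
    name = str(form.get("name", "")).strip()
    age_text = str(form.get("age", "")).strip()
    if not name:
        return "Error: name is required"
    if not age_text.isdecimal():
        return "Error: age must be a whole number"
    save_registration(name, int(age_text))
    return f"Saved details for {name}"


# --- Presentation layer: collects input and shows the reply ---
def submit_form(name, age):
    """Stand-in for a web form's submit button."""
    return register({"name": name, "age": age})


print(submit_form("Asha", "19"))
print(submit_form("", "19"))
print(count_registrations())
```

The invalid form is answered by the application layer itself, so only the valid registration
reaches the database.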

Advantages of 3-tier architecture:


Easy to maintain and modify. Any requested change will not affect other data in the
database. The application layer does all the validation.
Improved security. Since there is no direct access to the database, data security is
increased. There is less fear of the data being mishandled. The application layer filters out
malicious actions.
Good performance. Since the application layer can cache data once it has been retrieved, there is no
need to hit the database for each request. This reduces the time consumed by multiple requests
and hence enables the system to respond to many users at the same time.

Disadvantages of 3-tier architecture


The disadvantages of 3-tier architecture are that it is a little more complex and that a little more
effort is required to reach the database, since every request passes through an extra layer.

We saw how we can connect to a database. But how is the database laid out to process all user requests?
Since it is responsible for storing a huge amount of data and must handle multiple requests
from users simultaneously, it should be arranged properly. One can imagine a database as a brain!
How is the brain structured? It is rather sophisticated, and each part of the brain is responsible for
some specific task. A database is designed similarly.

At a very high level, a database is structured as shown in the diagram below. Let us look at its
components in detail.

Applications: - An application can be considered as a user-friendly web page where the user enters
requests. Here he simply enters the details that he needs and presses buttons to get the
data.
End User: - They are the real users of the database. They can be developers, designers,
administrators, or the actual users of the database.
DDL: - Data Definition Language (DDL) is the set of commands used to create the database, schemas,
tables, mappings, etc. These are the commands used to create objects like
tables and indexes in the database for the first time. In other words, they create the structure of
the database.
DDL Compiler: - This part of the database is responsible for processing the DDL commands.
That is, the compiler breaks the command down into machine-understandable
code. It is also responsible for storing metadata like the table
name, the space used by it, the number of columns in it, mapping information, etc.
DML Compiler: - When the user inserts, deletes, updates, or retrieves a record from the
database, he sends a request in a form that he understands, for example by pressing some buttons. But
for the database to understand and act on the request, it must be broken down into object code.
This is done by the DML compiler. One can compare this to how, when a person is asked a
question, it is broken down into signals that reach the brain!
Query Optimizer: - When the user fires a request, he is not concerned with how it will be executed
on the database. He is not aware of the database or of how it performs. But whatever
the request is, it should be executed efficiently, whether it fetches, inserts, updates, or deletes
data. The query optimizer decides the best way to execute the user request
received from the DML compiler. It is similar to selecting the best nerve to carry
the signals to the brain!
Stored Data Manager: - This is also known as the Database Control System. It is one of the main
central systems of the database. It is responsible for various tasks:
o It converts the requests received from the query optimizer into machine-understandable
form. It makes the actual request inside the database. It is like reaching the exact part
of the brain needed to answer.
o It helps to maintain consistency and integrity by applying the constraints. That
means it does not allow a record to be updated or deleted if doing so would leave a dependent
child entry without its parent, or a child entry to be inserted without a parent.
Similarly, it does not allow duplicate values to be entered where they are not permitted.
o It controls concurrent access. If multiple users are accessing the database at
the same time, it makes sure all of them see correct data. It guarantees that no
data loss or data mismatch happens between the transactions of multiple
users.
o It helps to back up the database and recover data whenever required. In a
huge database, when a transaction fails unexpectedly,
reverting the changes is not easy. It maintains a backup of all the data so that it
can be recovered.
Data Files: - These hold the real data. They can be stored on magnetic tapes, magnetic
disks, or optical disks.
Compiled DML: - Some of the processed DML statements (insert, update, delete) are stored
here so that, if there are similar requests, they can be re-used.
Data Dictionary: - It contains all the information about the database. As the name suggests,
it is the dictionary of all the data items. It contains descriptions of all the tables, views,
materialized views, constraints, indexes, triggers, etc.
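
Some of these components can be seen at work in a small sketch. It assumes Python's built-in sqlite3
module, so the internals differ from the architecture described above; the tables department and
employee and the helper run_dml are hypothetical. DDL statements create the structure, DML statements
change the data, the DBMS rejects statements that break a constraint, and EXPLAIN QUERY PLAN shows
the plan SQLite's optimizer chose.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks foreign keys only when enabled

# DDL: define the structure of the database.
conn.executescript("""
CREATE TABLE department (
    dept_id INTEGER PRIMARY KEY,
    name    TEXT NOT NULL UNIQUE
);
CREATE TABLE employee (
    emp_id  INTEGER PRIMARY KEY,
    name    TEXT NOT NULL,
    dept_id INTEGER NOT NULL REFERENCES department(dept_id)
);
""")


def run_dml(sql, params=()):
    """Run one DML statement; report whether the DBMS accepted it."""
    try:
        with conn:
            conn.execute(sql, params)
        return "ok"
    except sqlite3.IntegrityError as exc:
        return f"rejected: {exc}"


# DML: insert a parent row and a child row.
print(run_dml("INSERT INTO department VALUES (?, ?)", (1, "Physics")))
print(run_dml("INSERT INTO employee VALUES (?, ?, ?)", (10, "Ravi", 1)))

# Statements that would break integrity are refused by the DBMS itself.
duplicate = run_dml("INSERT INTO department VALUES (?, ?)", (2, "Physics"))
orphaning = run_dml("DELETE FROM department WHERE dept_id = ?", (1,))
unknown_dept = run_dml("INSERT INTO employee VALUES (?, ?, ?)", (11, "Mina", 99))
print(duplicate, orphaning, unknown_dept, sep="\n")

# Ask the optimizer how it would run a query (the output format is SQLite-specific).
plan = conn.execute(
    "EXPLAIN QUERY PLAN SELECT name FROM employee WHERE emp_id = 10"
).fetchall()
print(plan)
```

None of the three rejected statements required any checking code in the program; the rules were
declared once in the DDL and are applied by the DBMS on every request.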
Introduction
In daily life, we come across various needs to store data. It can be maintaining daily household
bills, bank account details, salary details, payment details, student information, student reports,
books in the library, etc. How can it be recorded in one place, so that we can get it back when
required? It should be recorded in such a way that we

1. Should be able to get the data at any later point in time
2. Should be able to add details to it whenever required
3. Should be able to modify stored information, as needed
4. Should also be able to delete it

In the traditional approach, before computers, all information was stored on paper. When we needed
information, we used to search through the papers. If we knew the particular date or category of
the information we were searching for, we went to that particular section of the papers. When we
wanted to update or delete some data, we searched for it and modified it or struck it off. If the data
is limited, all these tasks are easy. But imagine library information, information about every student
in a school, or a banking system! How do we search for a single required item in the papers? It is a
never-ending task! Yes, computers solved our problems.

File Processing System


When computers came, all these jobs became easy. But in the initial days, these records were stored in
the form of files. The way we stored data in files was similar to paper: flat files or, to put it
simply, plain text files like those edited in Notepad. Yes, the information was all in text files, with
each field of information separated by a space, tab, comma, semicolon, or some other symbol.

All the files were grouped based on their categories; each file held only related information, and
each file was named properly. The sample file above, for example, has Student information.
Student files for each class were bundled inside different folders so they could be identified quickly.

Now, if we want to see a specific student's details from a file, what do we do? We know which file
has the data, so we open that file and search for his details. Fine, here we can see the files; we can
open one and search in it. But imagine we want to display student details in a UI. Now how will we open
a file, read it, or update it? There are different programming languages, like C, C++, COBOL, etc.,
which help to do this task. Using these programming languages, we can search for files, open them,
search for the data inside them, go to a specific line in the file, and add, update, or delete specific
information.
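
A program of this kind can be sketched as follows. Python is used here instead of C or COBOL purely
for brevity; the file contents, the field order (ID, name, class), and the function names are
hypothetical, and the file is held in a string so that the sketch is self-contained.

```python
import csv
import io

# Hypothetical contents of a comma-separated student file: ID, name, class.
STUDENT_FILE = """\
100,Asha,10A
101,Ben,10B
102,Chen,10A
"""


def find_student(file_obj, student_id):
    """Scan the file line by line until the matching ID is found."""
    for row in csv.reader(file_obj):
        if row and row[0] == str(student_id):
            return {"id": row[0], "name": row[1], "class": row[2]}
    return None


def update_class(text, student_id, new_class):
    """Return new file contents with one student's class changed."""
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    for row in csv.reader(io.StringIO(text)):
        if row and row[0] == str(student_id):
            row[2] = new_class
        writer.writerow(row)  # every line is rewritten, changed or not
    return out.getvalue()


print(find_student(io.StringIO(STUDENT_FILE), 101))
print(update_class(STUDENT_FILE, 101, "10C"))
```

Note how much the program has to know about the file: the separator, the position of each field, and
the fact that changing one record means rewriting the whole file.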

Disadvantages of file processing


A file processing system is good when there is only a limited number of files and they hold very
little data. As the data and files in the system grow, handling them becomes difficult.

1. Data Mapping and Access: - Although all the related information is grouped and
stored in different files, there is no mapping between any two files, i.e., two dependent
files are not linked. Even though the Student and Student_Report files are related, they
are two different files, and they are not linked by any means. Hence, if we need to display
a student's details along with his report, we cannot pick them directly from those two files. We have
to write a lengthy program to search the Student file first, get all the details, then go to the
Student_Report file and search for his report.

When there is a very large amount of data, searching for particular information in the file
system is always a time-consuming task. It is an inefficient method of searching for
data.

2. Data Redundancy: - There is no way to prevent the insertion of duplicate data in a
file system. Any user can enter any data. The file system does not validate the kind of data
being entered, nor does it check whether the same data already exists in the same file.
Duplicate data in the system is undesirable, as it wastes space and always leads
to confusion and mishandling of data. When there is duplicate data in a file and we
need to update or delete a record, we might end up updating or deleting one copy of the
record and leaving the other copy in the file. Again, the file system does not check this
process. Hence the purpose of storing the data is lost.

Though the file name says Student file, there is a chance of staff information or
report information being entered in the file. The file system allows any information to be entered
into any file. It does not restrict the data being entered to the group it belongs to.
3. Data Dependence: - In the files, data is stored in a specific format, say tab, comma, or
semicolon separated. If the format of any file is changed, then the program that processes this
file needs to be changed. But there may be many programs dependent on this file. We
need to know in advance all the programs that use this file and change every one of
them. Missing the change in any one place will make the whole application fail. Similarly,
changes in the storage structure, or in the way the data is accessed, affect all the places where
this file is used. We have to change all those programs. That is, the smallest change in the file
affects all the programs and needs changes in all of them.
4. Data inconsistency: - Imagine the Student and Student_Report files both have the student's address
in them, and there is a change request for one particular student's address. The program
searched only the Student file for the address and updated it correctly. There is another
program which prints the student's report and mails it to the address mentioned in the
Student_Report file. What happens to the report of the student whose address was
changed? There is a mismatch with the actual address, and his report is sent to his old
address. This mismatch between different copies of the same data is called data inconsistency. It
has occurred here because there is no proper listing of which files hold copies of the same
data.
5. Data Isolation: - Imagine we have to generate a single report for a student who is studying
in a particular class, covering his study report, his library book details, and his hostel
information. All this information is stored in different files. How do we get all these details
into one report? We have to write a program. But before writing the program, the programmer
has to find out which files hold the information needed, what the format of each file is,
how to search for data in each file, etc. Once all this analysis is done, he writes the program. If
there are 2-3 files involved, the programming is fairly simple. But imagine if there are many
files involved! It would require a lot of effort from the programmer. Since all the
data is isolated in different files, programming becomes difficult.
6. Security: - Each file can be password protected. But what if we have to give access to only
a few records in the file? For example, a user has to be given access to view only their own bank
account information in the file. This is very difficult in a file system.
7. Integrity: - If we need to check certain criteria while entering data into a file,
it is not possible directly. We can do it only by writing programs. Say we have to restrict the
file to students above age 18; then it can be done by means of a program alone. There is no direct
checking facility in the file system. Hence these kinds of integrity checks are not easy in a file
system.
8. Atomicity: - If an insert, update, or delete fails in the file system, there is no
mechanism to switch back to the previous state. Imagine the marks for one particular subject
need to be entered into the Report file, and then the total needs to be calculated. But after
entering the new marks, the file is closed without saving. That means the whole of the required
transaction is not performed. Only the totaling of marks has been done; the addition of the new
marks has not. The total calculated is wrong in this case. Atomicity refers to
completing the whole transaction or not performing it at all. Partial completion of a
transaction leads to incorrect data in the system. The file system does not guarantee
atomicity. It may be possible with complex programs, but introducing them for every transaction
costs money.
9. Concurrent Access: - Accessing the same data in the same file by more than one user at the same
time is called concurrent access. In the file system, concurrent access leads to incorrect data. For
example, a student wants to borrow a book from the library. He searches for the book in the library
file and sees that only one copy is available. At the same time, another student also wants to
borrow the same book and sees that one copy is available. The first student opts to borrow and
gets the book. But the file is not yet updated to zero copies, so the second student also
opts to borrow! But there are no books available. This is the problem of concurrent access
in the file system.
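
The library example in item 9 can be sketched as follows. The first half imitates the file-style
"read first, write later" logic with plain variables; the second half shows one common database
remedy, a conditional update, assuming Python's built-in sqlite3 module. The table book and the
function borrow are hypothetical, and the two borrowers are simulated one after the other rather than
with real parallel users.

```python
import sqlite3

# --- File-style approach: each student reads the count, then acts on it later ---
file_copies = 1
seen_by_first = file_copies   # first student reads: 1 copy available
seen_by_second = file_copies  # second student reads before any update: also 1
books_promised = sum(1 for seen in (seen_by_first, seen_by_second) if seen > 0)
# Both students were told a copy is available, although only one exists.

# --- Database approach: the check and the update are a single statement ---
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE book (book_id INTEGER PRIMARY KEY, copies INTEGER NOT NULL)"
)
conn.execute("INSERT INTO book VALUES (1, 1)")
conn.commit()


def borrow(book_id):
    """Take one copy if any is left; return True on success."""
    with conn:
        cur = conn.execute(
            "UPDATE book SET copies = copies - 1 "
            "WHERE book_id = ? AND copies > 0",
            (book_id,),
        )
    # rowcount is 1 if a copy was taken, 0 if none was left.
    return cur.rowcount == 1


results = [borrow(1), borrow(1)]
copies_left = conn.execute(
    "SELECT copies FROM book WHERE book_id = 1"
).fetchone()[0]
print(books_promised, results, copies_left)
```

With the check and the change in one statement, the second borrower is refused and the count never
drops below zero.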
