ITDA310 - Study Guide (V1.0)
Systems
ITDA310
Compiled by Nyasha Magutsa
Version 1.0
NQF Level 7
Credit value: 12
INTRODUCTION
Module aim
Module abstract
Learning outcomes and assessment criteria
Summary of learning outcomes and assessment criteria
Module content
Lectures
Class exercises and activities
Information resources
Prescribed textbook
Recommended information sources
Books
Websites
Software
Using this Study Guide
Purpose
Structure
Individual units
Glossary
The use of icons
Alignment to prescribed textbook
Concluding remarks
UNIT 1: PRINCIPLES, FUNCTIONS AND APPLICATIONS OF A DDMS
Learning objectives
Introduction
1.1 Entity-Relationship (ER) modelling
1.2 Database relations
1.2.1 Properties of relations
1.2.2 Relational keys
1.2.3 Representing relational database schemas
1.3 Integrity constraints
1.3.1 Null
1.3.2 Entity integrity
1.3.3 Referential integrity
1.3.4 General constraints
1.4 Integrity enhancement feature
1.4.1 Required data
1.4.2 Domain constraints
1.4.3 Entity integrity
1.4.4 Referential integrity
1.4.5 General constraints
1.5 Views, triggers and stored procedures
1.5.1 Views
1.5.2 Triggers
1.5.3 Stored procedures
Concluding remarks
1.6 Self-assessment
UNIT 2: DATABASE SECURITY, CONCURRENCY CONTROL AND RECOVERY
Learning objectives
Introduction
2.1 Database security
2.1.1 Threat
2.2 Countermeasures: computer-based controls
2.2.1 Authorisation
2.2.2 Access control
2.2.3 View
2.2.4 Backup and recovery
2.2.5 Journaling
2.2.6 Integrity
2.2.7 Encryption
2.2.8 Redundant Array of Independent Disks (RAID)
2.3 Security in MySQL DBMS
2.3.1 Securing a MySQL database using a password
2.3.2 Administrative roles
2.3.3 Global privileges
2.3.4 Setting the Insert, Select and Update privileges
2.3.5 Log On dialog box
2.4 DBMS and Web security
2.4.1 How Secure Electronic Transactions (SET) works
2.5 Concurrency control
2.5.1 Potential problems caused by concurrency
2.5.2 Concurrency control techniques
2.5.3 Timestamping
2.5.4 Multiversion timestamp ordering
2.5.5 Optimistic techniques
2.6 Database recovery
2.6.1 The need for recovery
2.6.2 Transaction and recovery
2.6.3 Recovery facilities
2.6.4 Recovery techniques
Concluding remarks
2.7 Self-assessment
UNIT 3: INTEGRATE A DATABASE WITH AN APPLICATION
Learning objectives
Introduction
3.1 Step 1: Build conceptual data model
Step 1.1: Identify entity types
Step 1.2: Identify relationship types
Step 1.3: Identify and associate attributes with entity or relationship types
Step 1.4: Determine attribute domains
Step 1.5: Determine candidate, primary, and alternate key attributes
Step 1.6: Consider use of enhanced modelling concepts (optional step)
Step 1.7: Check model for redundancy
Step 1.8: Validate conceptual model against user transactions
Step 1.9: Review conceptual data model with user
3.2 Step 2: Build and validate logical data model
Step 2.1: Derive relations for logical data model
Step 2.2: Validate relations using normalisation
Step 2.3: Validate relations against user transactions
Step 2.4: Check integrity constraints
Step 2.5: Review logical data model with user
Step 2.6: Merge logical data models into global model
Step 2.7: Check for future growth
3.3 Step 3: Translate logical data model for target DBMS
Step 3.1: Design base relations
Step 3.2: Design representations of derived data
Step 3.3: Design general constraints
3.4 Step 4: Design file organisations and indexes
Step 4.1: Analyse transactions
Step 4.2: Choose file organisations
Step 4.3: Choose indexes
Step 4.4: Estimate disk space requirements
3.5 Step 5: Design user views
3.6 Step 6: Design security mechanisms
3.7 Step 7: Consider the introduction of controlled redundancy
3.8 Step 8: Monitor and tune the operational system
3.9 Step 9: Build and integrate an application/website
Concluding remarks
3.10 Self-assessment
GLOSSARY
BIBLIOGRAPHY
Introduction
The primary aim of this module is to provide you with a deeper understanding
of relational databases and a basic understanding of distributed databases.
This module addresses only the design and development of a MySQL database.
As a final deliverable, you are expected to design and develop an application
or website that integrates with the MySQL database, using the development
language you know best. That language is not taught in this module, so you
must research it yourself. This module demonstrates the integration of MySQL
with MS Visual Studio; you must research how the application or website
development language of your choice integrates with MySQL. Start this research
early in the module, as it is key to the final deliverable.
The following are some of the domain areas for which a database and an
application/website can be developed:
Health record management
Student record management
Library record management
Online shopping website
Online reservation website
Module aim
The primary aim of this module is to provide students with an understanding of
relational and distributed databases in relation to applications.
Module abstract
The scope of this module covers practical application of a database system
with third party applications. The outcome is to demonstrate expertise in
The following table outlines the assessment criteria that are aligned to the
learning outcomes.
These outcomes are covered in the module content and they are assessed in
the form of a written assignment and practical. If you comply with and achieve
all the pass criteria related to the outcomes, you will pass this module.
Module content
1. Demonstrate a thorough understanding of principles,
functioning and applications of distributed database
management systems.
Design, develop and build a third party application that interfaces with the
database
Practical - Integration of applications
Practical – develop and build a third party application that integrates
with the database
Lectures
Each week has four compulsory lecture hours for all students. It is
recommended that the lecture hours be divided into two sessions of two
hours each, but this may vary depending on the campus.
Each week has a lecture schedule which indicates the approximate time that
should be allocated to each activity. The week's work schedule has also been
divided into two lessons.
Activity sheets that are handed in should be kept by the lecturer so that they
can be used as proof of criteria that were met, if necessary.
Information resources
You should have access to a resource centre or library with a wide range of
relevant resources. Resources can include textbooks, e-books, newspaper
articles, journal articles, organisational publications, databases, etc. You can
access a range of academic journals in electronic format via EBSCOhost. You
will have to ask a campus librarian to assist you with accessing EBSCOhost.
Prescribed textbook
There is no prescribed book for this module.
Castano, S.; Fugini, M.; Martella, G. & Samarati, P. 1995. Database security.
Reading, Mass: Addison-Wesley.
Codd, E.F. 1982. The 1981 ACM Turing award lecture: relational database: a
practical foundation for productivity. Comm. ACM, 25(2):109-117.
Connolly, T.; Begg, C. & Holowczak, R. 2008. Business database systems. New
York: Addison-Wesley.
Davies Jr., J.C. 1973. Recovery semantics for a DB/DC system. Proc. ACM
Annual Conf., 136-141.
Freytag, J. C.; Maier, D. & Vossen, G. 1994. Query processing for advanced
database systems. San Mateo, CA: Morgan Kaufmann.
Gray, J.N. 1981. The transaction concept: virtues and limitations. Proc. Int.
Conf. Very Large Data Bases, 144-154.
Patrick, J. J. 2002. SQL fundamentals. 2nd edition. Upper Saddle River, NJ:
Prentice Hall.
Wertz, C.J. 1993. Relational database design: a practitioner’s guide. New York:
CRC Press.
Websites
http://blog.winhost.com
http://www.abanet.org/scitech/ec/isc/dsg-tutorial.html
https://www.beastnode.com
http://www.computerprivacy.org/who/
http://www.cve.mitre.org
http://dev.mysql.com/doc/refman/5.5/en/index.html
http://www.mysqltutorial.org/
http://tpc.org
Note
Web pages provide access to a further range of Internet information sources.
Lecturers may download the web-related material for students to access offline.
Students must use this resource with care, justifying the use of information gathered.
Software
MySQL 5.6 database software or higher is to be used.
Follow these steps:
Download MySQL for Windows from http://dev.mysql.com/downloads
Install MySQL with an administrator account:
o Install MySQL on Windows using the Microsoft Software Installer Package:
Download and start the MySQL Installation Wizard
Click Custom Installation > OK
o Install MySQL additional optional components:
ODBC driver
Connector/NET driver
Disable antivirus scanning on the main MySQL data directory (datadir)
MySQL has both a graphical user interface (GUI) and a command-line interface
(CLI).
The module outline must be read in conjunction with the study guide and
prescribed textbook (if applicable). This document will be the first port of call
in understanding what will be assessed and which assessments form part of
the module.
The purpose of the Study Guide is to facilitate your learning and help you to
master the content of the module. It helps you to structure your learning and
manage your time, and it provides outcomes and activities to help you master
those outcomes.
The Study Guide has been carefully designed to optimise your study time and
maximise your learning, so that your learning experience is as meaningful and
successful as possible. To deepen your learning and enhance your chances of
success, it is important that you read the Study Guide attentively and follow all
the instructions carefully. Pay special attention to the course outcomes at the
beginning of the Study Guide and at the beginning of each unit.
It is essential that you complete the exercises and other learning activities in
the Study Guide as your course assessments (practical and assignment) will be
based on the assumption that you have completed these activities.
The Study Guide should be read in conjunction with different research sources.
Purpose
The purpose of the Study Guide is to facilitate the learning process and to help
you to structure your learning and to master the content of the module. The
alternative research material covers most areas in detail.
Structure
The Study Guide is structured as follows:
Introduction
Unit 1: Principles, functions and applications of a DDMS
Unit 2: Database security, concurrency control and recovery
Unit 3: Integrate a database with an application
Glossary
Bibliography
Individual units
The individual units in the Study Guide are structured in the same way and
each unit contains the following features, which should enhance your learning
process:
Glossary
As you can see, we include a glossary at the end of the Study Guide. Please
refer to it as often as necessary in order to familiarise yourself with the exact
meaning of terms and concepts involved in Advanced Database Systems.
Definition
This icon appears when definitions of a particular term or concept
are given in the text.
Example
This icon points to a section in the text where relevant examples
for a particular topic (theme) or concept are provided.
Concluding remarks
At this point, you should be familiar with the module design and structure as
well as with the use of the Study Guide.
Learning outcomes:
LO1: Demonstrate a thorough understanding of principles,
functioning and applications of distributed database
management systems
Assessment criteria:
AC1.1: Demonstrate how to use entity-relationship (ER) modelling in
database design as well as the basic concepts associated
with the ER model
AC1.2: Demonstrate how to use relational integrity rules, including
entity and referential integrity in a distributed database
management system
Learning objectives
After studying this unit, you should be able to:
Understand basic entity-relationship model concepts
Understand and design an entity-relationship model
Discuss the relational integrity rules, including entity integrity and
referential integrity
Introduction
This unit introduces the concepts behind the relational model, the most popular
data model at present, and the one most often chosen for standard business
applications. The relational integrity rules (entity integrity and referential
integrity) and views are discussed.
Figure 1 cont.
Source: Connolly & Begg (2015:C-2–C-3)
The following diagram, Figure 2, has been derived from the DreamHome Case
Study from Connolly and Begg (2015: Section 11.4). It uses the Chen
notation.
The scenario depicted in the diagram below is summarised as follows:
A Supervisor, who is a Staff member, manages a Branch.
o In this relationship, the Branch must exist in order for the
Supervisor to be able to manage it.
A Supervisor, who is a Staff member, supervises many Supervisees, who
are also Staff members.
A Branch has many Staff members.
o In this relationship, both the Branch and the Staff must exist in
order for the relationship to exist.
A Staff member registers Clients.
A Branch registers Clients.
o In this relationship, both the Branch and the Client must exist in
order for the registration to take place.
A Client states their Preference.
o In this relationship, both the Client and the Preference must exist
in order for the relationship to exist.
Figure 3 cont.
Source: Connolly & Begg (2015:C-5)
The following diagram, Figure 4, has been derived from the DreamHome Case
Study from Connolly & Begg (2015: Section 11.4). It shows the equivalent
Crow's foot notation of Figure 2.
Note the following when constructing an ERD using Crow's foot
notation:
The name of an entity is normally a singular noun.
The first letter of an entity name should be uppercase.
The relationship name should describe its function, preferably as a verb or a
short phrase including a verb.
The first letter of each word in a relationship name should be capitalised.
The relationship should be named in the one direction that makes sense.
The following diagram, Figure 5, has been derived from the DreamHome Case
Study from Connolly & Begg (2015: Section 11.4). It shows the equivalent
Crow's foot notation of Figure 2.
The definitions below are according to Connolly and Begg (2015: Section 12)
and Chen (1976). Refer to the DreamHome Case Study from Connolly and Begg
(2015: Section 11.4) for all the concepts below.
An entity type is a group of objects with the same properties, which are
identified by the enterprise as having an independent existence. An entity
occurrence is a uniquely identifiable object of an entity type.
o For example, in Figure 5, entity types with a physical existence include
Supervisor, Client and Owner.
Definition
We can define a relation schema as a named relation defined by
a set of attribute and domain name pairs. A relational database
schema is a set of relation schemas, each with a distinct name.
(Connolly & Begg, 2015:156).
Super key
An attribute or set of attributes that uniquely identifies a tuple within a
relation
Candidate key
Super key (K), such that no proper subset is a super key within the relation
In each tuple of R, values of K uniquely identify that tuple (uniqueness)
No proper subset of K has the uniqueness property (irreducibility)
Primary key
The candidate key selected to identify tuples uniquely within the relation
Alternate keys
Candidate keys that are not selected to be the primary key
Foreign key
An attribute or set of attributes within one relation that matches a
candidate key of some (possibly same) relation
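The uniqueness and irreducibility properties above can be checked mechanically against sample data. The following is a minimal sketch in Python, not part of the prescribed material. The helper names (is_superkey, candidate_keys) and the sample Staff rows are assumptions made for illustration. Note that a sample can only show which attribute sets are consistent with being a key; the real keys follow from the meaning of the attributes, not from one set of rows.

```python
from itertools import combinations


def is_superkey(rows, attrs):
    """Return True if the values of attrs are distinct in every row (uniqueness)."""
    seen = set()
    for row in rows:
        value = tuple(row[a] for a in attrs)
        if value in seen:
            return False
        seen.add(value)
    return True


def candidate_keys(rows, all_attrs):
    """Return the irreducible superkeys of this sample, smallest first."""
    keys = []
    for size in range(1, len(all_attrs) + 1):
        for attrs in combinations(all_attrs, size):
            # A set that contains an already-found key is reducible, so skip it.
            if any(set(k) <= set(attrs) for k in keys):
                continue
            if is_superkey(rows, attrs):
                keys.append(attrs)
    return keys


# Hypothetical sample rows for a Staff relation.
staff = [
    {"staffNo": "SL21", "telNo": "555-0101", "branchNo": "B005"},
    {"staffNo": "SG37", "telNo": "555-0102", "branchNo": "B003"},
    {"staffNo": "SG14", "telNo": "555-0103", "branchNo": "B003"},
]

keys = candidate_keys(staff, ["staffNo", "telNo", "branchNo"])
print(keys)
```

In this sample, (staffNo, branchNo) is a super key but not a candidate key, because its proper subset staffNo is already unique. If staffNo were chosen as the primary key, telNo would be an alternate key.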
The two principal rules for the relational model are known as entity integrity
and referential integrity. Other types of integrity constraint are multiplicity and
general constraints. Before we define entity and referential integrity, it is
necessary to understand the concept of nulls (Connolly & Begg, 2015:161).
1.3.1 Null
A null represents a value for an attribute that is currently unknown or not
applicable for a tuple. Nulls are a way to deal with incomplete or exceptional
data. A null represents the absence of a value and is not the same as zero or
spaces, which are values (Connolly & Begg, 2015:161).
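The difference between a null and an empty value can be seen in a small experiment. This is a sketch only: it uses Python's built-in sqlite3 module as a stand-in for MySQL so that it runs without a server, and the Viewing table with its columns and rows is a hypothetical example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Viewing (clientNo TEXT, propertyNo TEXT, comment TEXT)")
conn.executemany(
    "INSERT INTO Viewing VALUES (?, ?, ?)",
    [
        ("CR56", "PA14", "too small"),
        ("CR56", "PG4", ""),      # an empty string is a value
        ("CR62", "PA14", None),   # None is stored as NULL: no value at all
    ],
)


def count(where):
    """Count the Viewing rows that satisfy the given WHERE condition."""
    return conn.execute(f"SELECT COUNT(*) FROM Viewing WHERE {where}").fetchone()[0]


null_rows = count("comment IS NULL")   # finds the row with no comment
empty_rows = count("comment = ''")     # finds the empty string, not the null
equals_null = count("comment = NULL")  # a comparison with NULL is never true
print(null_rows, empty_rows, equals_null)
```

The last query returns no rows even though one comment is null, which is why SQL provides the IS NULL test.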
Definition
The first integrity rule applies to the primary keys of base
relations. For the moment, we will define a base relation as a
relation that corresponds to an entity in the conceptual schema
(Connolly & Begg, 2015:162).
Entity integrity states that, in a base relation, no attribute of a primary key can
be null. By definition a primary key is a minimal identifier that is used to
identify tuples uniquely. This means that no subset of the primary key is
sufficient to provide unique identification of tuples. If we allow a null for any
part of a primary key, we are implying that not all the attributes are needed to
distinguish between tuples, which contradicts the definition of the primary key
(Connolly & Begg, 2015:162).
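The entity integrity rule can be illustrated with a composite primary key. The sketch below again uses Python's sqlite3 module as a stand-in for MySQL. The Viewing table and the helper try_insert are assumptions for illustration. NOT NULL is written out explicitly because SQLite, unlike MySQL, does not imply it for every primary key column.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE Viewing (
        clientNo   TEXT NOT NULL,
        propertyNo TEXT NOT NULL,
        PRIMARY KEY (clientNo, propertyNo)
    )
    """
)
conn.execute("INSERT INTO Viewing VALUES ('CR56', 'PA14')")


def try_insert(client_no, property_no):
    """Try to insert one row and report whether the DBMS accepted it."""
    try:
        conn.execute("INSERT INTO Viewing VALUES (?, ?)", (client_no, property_no))
        return "accepted"
    except sqlite3.IntegrityError as exc:
        return f"rejected: {exc}"


null_part = try_insert("CR56", None)    # part of the primary key is null
duplicate = try_insert("CR56", "PA14")  # the same key value already exists
new_row = try_insert("CR56", "PG4")     # a new, complete key value
print(null_part, duplicate, new_row, sep="\n")
```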
MySQL requires the field country_id to be defined and linked as a foreign
key in the table City in order to enforce this referential integrity
constraint.
The foreign key is defined in the City table as follows: the table in which
the field resides as a primary key (Country) is referenced, the corresponding
fields to be linked are selected, and the action on update or delete is specified.
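The City and Country relationship described above could be declared along the following lines. This is a sketch under stated assumptions: it runs on Python's sqlite3 module rather than on MySQL, the columns other than country_id are hypothetical, and the chosen actions (ON UPDATE CASCADE, ON DELETE RESTRICT) are only one possible choice.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks foreign keys only when enabled
conn.execute("CREATE TABLE Country (country_id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
conn.execute(
    """
    CREATE TABLE City (
        city_id    INTEGER PRIMARY KEY,
        name       TEXT NOT NULL,
        country_id INTEGER NOT NULL,
        FOREIGN KEY (country_id) REFERENCES Country (country_id)
            ON UPDATE CASCADE
            ON DELETE RESTRICT
    )
    """
)
conn.execute("INSERT INTO Country VALUES (1, 'South Africa')")
conn.execute("INSERT INTO City VALUES (10, 'Durban', 1)")


def attempt(sql):
    """Run one statement and report whether referential integrity allowed it."""
    try:
        conn.execute(sql)
        return "ok"
    except sqlite3.IntegrityError:
        return "rejected"


orphan_city = attempt("INSERT INTO City VALUES (11, 'Nowhere', 99)")  # no Country 99
delete_parent = attempt("DELETE FROM Country WHERE country_id = 1")   # Durban refers to it
update_parent = attempt("UPDATE Country SET country_id = 2 WHERE country_id = 1")
city_country = conn.execute("SELECT country_id FROM City WHERE city_id = 10").fetchone()[0]
print(orphan_city, delete_parent, update_parent, city_country)
```

The update of the parent key succeeds because the change is cascaded to the City row, whereas the delete is refused while a City still references the Country.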
These constraints can be defined in the CREATE and ALTER TABLE statements.
CHECK (searchCondition)
However, the ISO standard allows domains to be defined more explicitly using
the CREATE DOMAIN statement:
PRIMARY KEY(staffNo)
PRIMARY KEY(clientNo, propertyNo)
There can only be one PRIMARY KEY clause per table. However, it is still
possible to ensure uniqueness for alternate keys using UNIQUE:
UNIQUE(telNo)
(Connolly & Begg, 2015:241–242)
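The CHECK, PRIMARY KEY and UNIQUE clauses can be combined in one table definition. The following sketch uses Python's sqlite3 module in place of MySQL. The salary column, the rule salary > 0 and the sample rows are hypothetical. Be aware that MySQL versions before 8.0.16 parse CHECK clauses but do not enforce them, so results on an older server may differ from what this sketch shows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE Staff (
        staffNo TEXT NOT NULL,
        telNo   TEXT NOT NULL,
        salary  REAL NOT NULL CHECK (salary > 0),
        PRIMARY KEY (staffNo),
        UNIQUE (telNo)
    )
    """
)
conn.execute("INSERT INTO Staff VALUES ('SL21', '555-0101', 30000)")


def violation(row):
    """Insert a row; return None on success or the constraint error message."""
    try:
        conn.execute("INSERT INTO Staff VALUES (?, ?, ?)", row)
        return None
    except sqlite3.IntegrityError as exc:
        return str(exc)


duplicate_key = violation(("SL21", "555-0102", 12000))  # primary key already used
duplicate_tel = violation(("SG37", "555-0101", 12000))  # alternate key already used
bad_salary = violation(("SG14", "555-0103", -5))        # fails the CHECK condition
valid_row = violation(("SG5", "555-0104", 24000))       # satisfies every constraint
print(duplicate_key, duplicate_tel, bad_salary, valid_row, sep="\n")
```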
This can be used to enforce certain business rules such as filtering information
relevant only to specific users. Such information that is classified as
confidential can be made available to authorised users through the use of
Views.
The benefits of views, according to Connolly & Begg (2015: Section 4.4),
include the following:
Views provide a level of security
Views provide a mechanism to customise the appearance of the database
Views can present a consistent, unchanging picture of the structure of the
database
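The security and customisation benefits listed above can be sketched as follows, using Python's sqlite3 module as a stand-in for MySQL. The Staff table, its rows and the view name StaffB003 are hypothetical. The view shows one branch only and leaves out the salary column, and because it is evaluated on request it reflects later changes to the base table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Staff (staffNo TEXT PRIMARY KEY, name TEXT, branchNo TEXT, salary REAL)"
)
conn.executemany(
    "INSERT INTO Staff VALUES (?, ?, ?, ?)",
    [
        ("SL21", "John", "B005", 30000),
        ("SG37", "Ann", "B003", 12000),
        ("SG14", "David", "B003", 18000),
    ],
)

# The view exposes only branch B003 and hides the confidential salary column.
conn.execute(
    """
    CREATE VIEW StaffB003 AS
    SELECT staffNo, name FROM Staff WHERE branchNo = 'B003'
    """
)

cursor = conn.execute("SELECT * FROM StaffB003 ORDER BY staffNo")
view_columns = [column[0] for column in cursor.description]
view_rows = cursor.fetchall()

# A row added to the base table appears in the view the next time it is queried.
conn.execute("INSERT INTO Staff VALUES ('SG5', 'Susan', 'B003', 24000)")
rows_after = conn.execute("SELECT COUNT(*) FROM StaffB003").fetchone()[0]
print(view_columns, view_rows, rows_after)
```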
Useful website
For additional resources on Views, access the following link.
http://www.mysqltutorial.org/mysql-views-tutorial.aspx
1.5.2 Triggers
A trigger has the ability to enforce specified business rules as it is applicable to
certain events occurring in a table.
It defines an action that the database should take when a certain event occurs.
It can be used to enforce referential integrity constraints or audit data
changes.
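The two uses mentioned above, auditing data changes and enforcing a business rule, can be sketched with two triggers. The example runs on Python's sqlite3 module rather than MySQL, so the trigger syntax differs in detail from MySQL's (for instance, MySQL signals errors differently from RAISE). The tables, trigger names and the rule that a salary may not be reduced are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE Staff (staffNo TEXT PRIMARY KEY, salary REAL NOT NULL);
    CREATE TABLE SalaryAudit (staffNo TEXT, oldSalary REAL, newSalary REAL);

    -- Event: an UPDATE of salary on Staff. Action: record the change.
    CREATE TRIGGER staff_salary_audit
    AFTER UPDATE OF salary ON Staff
    FOR EACH ROW
    BEGIN
        INSERT INTO SalaryAudit VALUES (OLD.staffNo, OLD.salary, NEW.salary);
    END;

    -- Business rule: a salary may not be reduced.
    CREATE TRIGGER staff_salary_no_cut
    BEFORE UPDATE OF salary ON Staff
    FOR EACH ROW
    WHEN NEW.salary < OLD.salary
    BEGIN
        SELECT RAISE(ABORT, 'salary may not be reduced');
    END;

    INSERT INTO Staff VALUES ('SG37', 12000);
    """
)

conn.execute("UPDATE Staff SET salary = 13000 WHERE staffNo = 'SG37'")  # audited
try:
    conn.execute("UPDATE Staff SET salary = 9000 WHERE staffNo = 'SG37'")
    cut_result = "accepted"
except sqlite3.DatabaseError as exc:
    cut_result = str(exc)  # the second trigger aborted the update

audit_rows = conn.execute("SELECT * FROM SalaryAudit").fetchall()
salary_now = conn.execute("SELECT salary FROM Staff").fetchone()[0]
print(cut_result, audit_rows, salary_now)
```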
Cascading effects
Cannot be scheduled
Less portable
Useful website
For additional resources on Triggers, access the following link.
http://www.mysqltutorial.org/mysql-triggers.aspx
Highlighted below are the parameters to be input and output from the stored
procedure.
Useful website
For additional resources on Stored Procedures, access the
following link.
http://www.mysqltutorial.org/mysql-stored-procedure-tutorial.aspx
Concluding remarks
In this unit we discussed the relational integrity rules, such as the entity
and referential integrity rules. The unit also gave practical insight into the
implementation of these concepts.
1.6 Self-assessment
1. Use MySQL to create the DreamHome database. Use the relational schema
of the DreamHome case study (Connolly & Begg, 2015) specified in Section
17.1 – Figure 17.3.
Learning outcomes:
LO2: Compare and contrast database recovery, concurrency
control, security and data integrity measures for centralised
and distributed databases
Assessment criteria:
AC2.1: Discuss the scope of database security; compare and
contrast the types of threat that may affect a distributed
database system
AC2.2: Compare and contrast a range of computer-based controls
that are available as countermeasures to such threats.
AC2.3: Compare and contrast security measures associated with
database systems and the web.
AC2.4: Compare and contrast concurrency controls and examine
the protocols that can be used to prevent conflict
AC2.5: Compare and contrast database recovery options and
examine the techniques that can be used to ensure a
distributed database remains in a consistent state in the
presence of failures
Learning objectives
After studying this unit, you should be able to:
Discuss the scope of database security.
Identify computer-based threats and their countermeasures.
Identify the security measures associated with DBMS and the Web.
Distinguish between the concurrency controls and protocols that prevent conflict.
Distinguish database recovery options that can be used in a failure state.
Introduction
This unit considers database security and recovery. Security covers both the
DBMS and its environment. The unit illustrates security provision with MySQL.
It also examines the security problems that can arise in a Web environment
and presents some approaches to overcoming them.
2.1.1 Threat
A threat is any situation or event, whether intentional or unintentional, that
may adversely affect a system and consequently an organisation (Connolly &
Begg, 2015:609).
2.2.1 Authorisation
Authorisation is the granting of a right or privilege that enables a subject to
have legitimate access to a system or a system's object. Authentication is a
mechanism that determines whether a user is who he or she claims to be
(Connolly & Begg, 2015:612).
Mandatory Access Control (MAC) determines whether a user can read or write an
object based on rules that involve the security level of the object and the
clearance of the user. These rules ensure that sensitive data can never be
'passed on' to another user without the necessary clearance. The SQL standard
does not include support for MAC (Connolly & Begg, 2015:614).
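Rules of this kind, based on an object's security level and a user's clearance, can be sketched in a few lines of Python. The sketch assumes the Bell-LaPadula model ("no read up, no write down"), which the text above does not name. The level names and the functions can_read and can_write are hypothetical.

```python
# Hypothetical security levels, ordered from least to most sensitive.
LEVELS = {"unclassified": 0, "confidential": 1, "secret": 2, "top secret": 3}


def can_read(clearance, classification):
    """No read up: a user may read only objects at or below their clearance."""
    return LEVELS[clearance] >= LEVELS[classification]


def can_write(clearance, classification):
    """No write down: a user may write only objects at or above their clearance."""
    return LEVELS[clearance] <= LEVELS[classification]


print(can_read("secret", "confidential"))   # reading less sensitive data
print(can_read("confidential", "secret"))   # reading more sensitive data
print(can_write("secret", "confidential"))  # writing to a lower level
```

The write rule is what stops sensitive data from being passed on: a user cleared for secret data cannot copy it into an object that a user with a lower clearance is allowed to read.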
2.2.3 View
A view is the dynamic result of one or more relational operations operating on
the base relations to produce another relation. A view is a virtual relation that
does not actually exist in the database, but is produced upon request by a
particular user, at the time of request (Connolly & Begg, 2015:616).
Backup and recovery can be handled in two ways: through the graphical user
interface or from the command line.
Useful website
For additional resources on graphical user interface backup or
restoration, access the following link.
https://www.beastnode.com/portal/knowledgebase/48/MySQL-Workbench-Backup-and-Import-your-MySQL-Database.html
Useful website
For additional resources on command line backup or
restoration, access the following links.
http://dev.mysql.com/doc/refman/5.5/en/mysqldump.html
http://blog.winhost.com/using-mysqldump-to-backup-and-restore-your-mysql-databasetables/
2.2.5 Journaling
This is the process of keeping and maintaining a log file (or journal) of all
changes made to a database to enable effective recovery in the event of
failure. A DBMS should provide logging facilities, sometimes referred to as
journaling, which keep track of the current state of transactions and database
changes, to provide support for recovery procedures. The advantage of
journaling is that in the event of a failure, the database can be recovered to its
last known consistent state using a backup copy of the database and the
information contained in the log file (Connolly & Begg, 2015:617).
2.2.6 Integrity
This prevents data from becoming invalid, and hence giving misleading or
incorrect results (Connolly & Begg, 2015:617).
2.2.7 Encryption
This refers to the encoding of the data by a special algorithm that renders the
data unreadable by any program without the decryption key. Encryption also
protects data transmitted over communication lines. There are a number of
techniques for encoding data to conceal the information; some are termed
‘irreversible’ and others ‘reversible’ (Connolly & Begg, 2015:617–618).
Irreversible techniques, as the name implies, do not permit the original data to
be known. However, the data can be used to obtain valid statistical
information. Reversible techniques are more commonly used. To transmit data
securely over insecure networks requires the use of a cryptosystem, which
includes:
An encryption key to encrypt the data (plaintext)
An encryption algorithm that, with the encryption key, transforms the
plaintext into ciphertext
A decryption key to decrypt ciphertext
A decryption algorithm that, together with the decryption key, transforms
the ciphertext back into plaintext
One technique, called symmetric encryption, uses the same key for both
encryption and decryption, and relies on safe communication lines for
exchanging the key. Another type of cryptosystem uses different keys for
encryption and decryption, and is referred to as asymmetric encryption
(Connolly & Begg, 2015:617–618).
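As a rough illustration of the symmetric case, the following Python sketch uses one shared key for both encryption and decryption. The keystream construction (SHA-256 over the key and a counter, XORed with the data) and the function names are assumptions made for this example only. It is a teaching toy, not a secure cipher, and it is not how MySQL encrypts data.

```python
import hashlib


def keystream(key: bytes, length: int) -> bytes:
    """Derive `length` pseudo-random bytes from the key (toy construction)."""
    out = bytearray()
    counter = 0
    while len(out) < length:
        out.extend(hashlib.sha256(key + counter.to_bytes(8, "big")).digest())
        counter += 1
    return bytes(out[:length])


def xor_cipher(key: bytes, data: bytes) -> bytes:
    """Encrypt or decrypt: XOR with the same keystream is its own inverse."""
    return bytes(d ^ k for d, k in zip(data, keystream(key, len(data))))


# Hypothetical key and message; both parties must already share the key.
shared_key = b"key exchanged over a safe channel"
plaintext = b"balance=100"

ciphertext = xor_cipher(shared_key, plaintext)   # sender encrypts
recovered = xor_cipher(shared_key, ciphertext)   # receiver decrypts with the same key
```

Because the same key is used at both ends, the scheme is only as safe as the channel used to exchange that key, which is the weakness asymmetric encryption is designed to avoid.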
According to Connolly and Begg (2015: Section 20.2.7) and Chen, Lee, Gibson,
Katz and Patterson (1994), there are a number of different disk configurations,
called RAID levels:
RAID 0 Non-redundant
RAID 1 Mirrored
RAID 0+1 Non-redundant and mirrored
RAID 2 Memory-style error-correcting codes
RAID 3 Bit-interleaved parity
RAID 4 Block-interleaved parity
RAID 5 Block-interleaved distributed parity
RAID 6 P+Q redundancy
Connolly and Begg (2015:627) state that while transmitting information over
the Internet, one must ensure that:
It is inaccessible to anyone but the sender and receiver (privacy)
It is not changed during transmission (integrity)
The receiver can be sure that it came from the sender (authenticity)
The sender can be sure that the receiver is genuine (non-fabrication)
The sender cannot deny that he or she sent it (non-repudiation)
Loss of T2's update avoided by preventing T1 from reading balx until after
update.
Problem avoided by preventing T6 from reading balx and balz until after T5
has completed updates.
Connolly and Begg (2015: Section 22.2.3) outline some basic rules of locking,
as follows:
If a transaction has a shared lock on an item, it can read but not
update the item.
If a transaction has an exclusive lock on an item, it can both read and
update the item.
Reads cannot conflict, so more than one transaction can hold shared locks
simultaneously on the same item.
An exclusive lock gives a transaction exclusive access to that item.
Some systems allow a transaction to upgrade a shared (read) lock to an
exclusive lock, or downgrade an exclusive lock to a shared lock.
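The locking rules above can be sketched as a small lock table in Python. The class and method names are hypothetical, and a refused request simply returns False rather than queueing the transaction, so this is an illustration of the compatibility rules and not a description of how any particular DBMS implements locking.

```python
from collections import defaultdict


class LockTable:
    """Toy lock table: records which transactions hold which lock on each item."""

    def __init__(self):
        self.shared = defaultdict(set)  # item -> transactions holding a shared lock
        self.exclusive = {}             # item -> transaction holding the exclusive lock

    def request_shared(self, txn, item):
        holder = self.exclusive.get(item)
        if holder is not None and holder != txn:
            return False                # another transaction has exclusive access
        if holder == txn:
            return True                 # an exclusive lock already permits reading
        self.shared[item].add(txn)      # shared locks are compatible with each other
        return True

    def request_exclusive(self, txn, item):
        holder = self.exclusive.get(item)
        if holder is not None and holder != txn:
            return False
        if self.shared[item] - {txn}:
            return False                # other readers still hold shared locks
        self.shared[item].discard(txn)  # grant, or upgrade this txn's shared lock
        self.exclusive[item] = txn
        return True

    def release(self, txn, item):
        self.shared[item].discard(txn)
        if self.exclusive.get(item) == txn:
            del self.exclusive[item]


locks = LockTable()
results = [
    locks.request_shared("T1", "balx"),     # granted
    locks.request_shared("T2", "balx"),     # granted: reads do not conflict
    locks.request_exclusive("T1", "balx"),  # refused: T2 still reads balx
]
locks.release("T2", "balx")
results.append(locks.request_exclusive("T1", "balx"))  # upgrade now succeeds
results.append(locks.request_shared("T2", "balx"))     # refused: T1 holds exclusive lock
```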
Under two-phase locking (2PL), a transaction has two phases, according to
Connolly and Begg (2015: Section 22.2.3):
Growing phase: Acquires all locks but cannot release any locks.
Shrinking phase: Releases locks but cannot acquire any new locks.
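A minimal Python sketch of the two phases follows. The class name and the choice to raise an error on a violation are assumptions for illustration; the sketch only tracks which phase the transaction is in and does not manage lock conflicts between transactions.

```python
class TwoPhaseTransaction:
    """Tracks the growing and shrinking phases of one transaction."""

    def __init__(self, name):
        self.name = name
        self.held = set()
        self.shrinking = False  # False while the transaction is still growing

    def acquire(self, item):
        if self.shrinking:
            # Acquiring after any release would break the two-phase rule.
            raise RuntimeError(f"{self.name}: cannot lock {item} in shrinking phase")
        self.held.add(item)

    def release(self, item):
        self.shrinking = True   # the first release ends the growing phase
        self.held.discard(item)


t1 = TwoPhaseTransaction("T1")
t1.acquire("balx")
t1.acquire("baly")
t1.release("balx")              # shrinking phase begins here

try:
    t1.acquire("balz")
    violation = False
except RuntimeError:
    violation = True            # the late lock request is rejected
```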
Cascading rollback
If every transaction in a schedule follows 2PL, the schedule is serialisable.
However, problems can occur with the interpretation of when the locks can be
released (Connolly & Begg, 2015: Section 22.2.3).
Transactions conform to 2PL. T14 aborts. Since T15 is dependent on T14, T15
must also be rolled back. Since T16 is dependent on T15, it, too, must be
rolled back. This is called cascading rollback. To prevent this with 2PL, leave
the release of all the locks until the end of the transaction (Connolly & Begg,
2015: Section 22.2.3).
When a new index value (key and pointer) is being inserted into a leaf node,
then if the node is not full, insertion will not cause changes to higher-level
nodes (Connolly & Begg, 2015: Section 22.2.3).
This suggests that we only have to exclusively lock the leaf node in such a
case, and only exclusively lock higher-level nodes if the node is full and has to
be split (Connolly & Begg, 2015: Section 22.2.3).
Thus, Connolly and Begg (2015: Section 22.2.3) derive the following locking
strategy:
For searches, obtain shared locks on nodes, starting at the root and
proceeding downwards along the required path. Release the lock on the
node once a lock has been obtained on the child node.
For insertions, the conservative approach would be to obtain exclusive locks
on all nodes as we descend through the tree to the leaf node to be modified.
For a more optimistic approach, obtain shared locks on all nodes as we
descend to the leaf node to be modified, where we obtain an exclusive lock.
If the leaf node has to split, upgrade the shared lock on the parent to an
exclusive lock. If this node also has to split, continue to upgrade locks at
the next higher level.
Deadlock
This is an impasse that may result when two (or more) transactions are each
waiting for locks held by the other to be released (Connolly & Begg, 2015:
Section 22.2.4).
There is only one way to break a deadlock: abort one or more of the
transactions. A deadlock should be transparent to the user, so the DBMS should
restart the aborted transaction(s) (Connolly & Begg, 2015: Section 22.2.4).
Timeouts
A transaction that requests a lock will only wait for a system-defined period
of time. If the lock has not been granted within this period, the lock request
times out. In this case, the DBMS assumes that the transaction may be
deadlocked, even though it may not be, and it aborts and automatically
restarts the transaction (Connolly & Begg, 2015: Section 22.2.4).
Deadlock prevention
According to Connolly and Begg (2015: Section 22.2.4), in this technique,
the DBMS looks ahead to see if the transaction would cause deadlock, and
never allows a deadlock to occur. One could order transactions using
transaction timestamps:
o Wait-Die: only an older transaction can wait for a younger one;
otherwise the transaction is aborted (dies) and restarted with the same
timestamp.
o Wound-Wait: only a younger transaction can wait for an older one. If
the older transaction requests a lock held by a younger one, the
younger one is aborted (wounded).
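The two timestamp schemes can be written as two small decision functions. In this Python sketch the function names and the returned strings are hypothetical, and a smaller timestamp is assumed to mean an older transaction.

```python
def wait_die(requester_ts, holder_ts):
    """Wait-Die: an older requester waits; a younger requester dies."""
    if requester_ts < holder_ts:        # requester is older than the lock holder
        return "wait"
    return "abort requester"            # restarted later with the same timestamp


def wound_wait(requester_ts, holder_ts):
    """Wound-Wait: an older requester wounds the holder; a younger one waits."""
    if requester_ts < holder_ts:        # requester is older than the lock holder
        return "abort holder"
    return "wait"


# Transaction with timestamp 1 is older than the one with timestamp 2.
decisions = {
    "wait_die_old_requests": wait_die(1, 2),
    "wait_die_young_requests": wait_die(2, 1),
    "wound_wait_old_requests": wound_wait(1, 2),
    "wound_wait_young_requests": wound_wait(2, 1),
}
```

In both schemes it is always the younger transaction that is aborted, which is why an aborted transaction that keeps its original timestamp eventually becomes the oldest and cannot starve.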
A deadlock exists if, and only if, the wait-for graph (WFG) contains a cycle.
A WFG is created at regular intervals.
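Deadlock detection therefore reduces to finding a cycle in a directed graph. The following Python sketch uses a depth-first search; the representation of the WFG as a dictionary mapping each transaction to the set of transactions it is waiting for is an assumption made for this example, as are the transaction names.

```python
def has_deadlock(wait_for):
    """Return True if the wait-for graph contains a cycle.

    wait_for maps each transaction to the set of transactions it waits for.
    """
    WHITE, GREY, BLACK = 0, 1, 2        # unvisited, on current path, finished
    colour = {}

    def visit(txn):
        colour[txn] = GREY
        for waited_on in wait_for.get(txn, ()):
            state = colour.get(waited_on, WHITE)
            if state == GREY:           # edge back into the current path: a cycle
                return True
            if state == WHITE and visit(waited_on):
                return True
        colour[txn] = BLACK
        return False

    return any(colour.get(txn, WHITE) == WHITE and visit(txn) for txn in wait_for)


# T1 waits for T2 and T2 waits for T1: deadlock.
deadlocked = has_deadlock({"T1": {"T2"}, "T2": {"T1"}})
# T1 waits for T2, T2 waits for T3, T3 waits for nobody: no deadlock.
safe = has_deadlock({"T1": {"T2"}, "T2": {"T3"}})
```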
The following are some issues to consider when recovering from a detected
deadlock:
o The choice of a deadlock victim
o How far to roll a transaction back
o Avoiding starvation
(Connolly & Begg, 2015: Section 22.2.4)
2.5.3 Timestamping
Transactions are ordered globally so that older transactions, that is,
transactions with smaller timestamps, receive priority in the event of
conflict. Conflict is resolved by rolling back and restarting the transaction
(Connolly & Begg, 2015: Section 22.2.5).
Connolly and Begg (2015: Section 22.2.5) state that read/write proceeds only
if the last update on that data item was carried out by an older transaction.
Otherwise, the transaction requesting read/write is restarted and given a new
timestamp. Also, timestamps for data items can be:
Read-timestamp: timestamp of last transaction to read item
Write-timestamp: timestamp of last transaction to write item
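The read and write checks can be sketched in Python as follows. This follows basic timestamp ordering as it is usually presented; the class and function names are hypothetical, and restarting the transaction with a new timestamp is left to the caller.

```python
class Item:
    """A data item carrying the two timestamps described above."""

    def __init__(self):
        self.read_ts = 0    # timestamp of the last transaction to read the item
        self.write_ts = 0   # timestamp of the last transaction to write the item


def read(item, ts):
    if ts < item.write_ts:
        return "restart"    # a younger transaction has already written the item
    item.read_ts = max(item.read_ts, ts)
    return "ok"


def write(item, ts):
    if ts < item.read_ts or ts < item.write_ts:
        return "restart"    # a younger transaction has already read or written it
    item.write_ts = ts
    return "ok"


balx = Item()
outcomes = [
    write(balx, 5),   # first write by the transaction with timestamp 5
    read(balx, 3),    # older transaction arrives too late to read
    read(balx, 7),    # younger transaction may read
    write(balx, 6),   # too late: transaction 7 has already read the item
    write(balx, 8),   # youngest transaction may write
]
```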
Connolly and Begg (2015: Section 22.3.1) describe the different types of
failure that can affect database processing. Among the causes of failure are:
System crashes, resulting in loss of main memory
Media failures, resulting in loss of parts of secondary storage
Application software errors
Natural physical disasters
Carelessness or unintentional destruction of data or facilities
Sabotage
If the transaction had not committed at failure time, the recovery manager has
to undo (roll back) any effects of that transaction to preserve atomicity.
Partial undo takes place when only one transaction has to be undone. Global
undo takes place when all active transactions have to be undone (Connolly &
Begg, 2015: Section 22.3.2).
Example
DBMS starts at time t0, but fails at time tf. Assume that the data for
transactions T2 and T3 has been written to secondary storage. T1 and T6 have
to be undone. In the absence of any other information, the recovery manager
has to redo T2, T3, T4, and T5.
An identifier of the data item affected by the database action (insert, delete
and update operations)
A before-image of the data item
An after-image of the data item
Log management information
The log file may be duplexed or triplexed. The log file is sometimes split into
two separate random-access files. Because every update is logged, the log file
is a potential bottleneck, and the speed of writing to it can be critical in
determining overall performance (Connolly & Begg, 2015: Section 22.3.3).
MySQL creates several log files that store different information about
the activities occurring in the database. These log files have to be activated,
and cleared frequently as they consume much space. Below are the different
types of MySQL log files.
The above log files can be activated in MySQL Workbench in the Management
screen.
Useful website
For more details access the following link:
https://dev.mysql.com/doc/refman/5.0/en/server-logs.html
2.6.3.2 Checkpointing
A checkpoint is the point of synchronisation between the database and log file. All buffers
are force-written to secondary storage. A checkpoint record is created,
containing identifiers of all active transactions. When failure occurs, redo all
transactions that committed since the checkpoint and undo all transactions
active at the time of the crash (Connolly & Begg, 2015: Section 22.3.3).
In the previous example, with checkpoint at time tc, changes made by T2 and
T3 have been written to secondary storage. Thus, only redo T4 and T5, and
undo transactions T1 and T6 (Connolly & Begg, 2010).
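The effect of the checkpoint on the example can be reproduced with a short Python sketch. The start and commit times below are hypothetical values chosen only to be consistent with the example (the original timeline figure is not reproduced here), and the function name is an assumption.

```python
def plan_recovery(transactions, failure_time, checkpoint_time=None):
    """Split transactions into those to redo and those to undo after a failure.

    transactions maps a name to (start_time, commit_time); commit_time is None
    if the transaction had not committed when the failure occurred.
    """
    redo, undo = [], []
    for name in sorted(transactions):
        _start, commit = transactions[name]
        if commit is None or commit > failure_time:
            undo.append(name)           # still active at failure: roll back
        elif checkpoint_time is None or commit > checkpoint_time:
            redo.append(name)           # committed, but possibly not yet on disk
        # Committed before the checkpoint: already force-written, nothing to do.
    return redo, undo


# Hypothetical timeline: checkpoint tc = 50, failure tf = 100.
timeline = {
    "T1": (10, None), "T2": (5, 30), "T3": (20, 45),
    "T4": (40, 70), "T5": (60, 90), "T6": (80, None),
}
no_checkpoint = plan_recovery(timeline, failure_time=100)
with_checkpoint = plan_recovery(timeline, failure_time=100, checkpoint_time=50)
```

Without the checkpoint the recovery manager cannot tell whether the updates of T2 and T3 reached secondary storage, so it redoes them; with the checkpoint they can be skipped.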
the clustered index that organises the data to minimise I/O for primary key
lookups.
To maintain data integrity, InnoDB also supports FOREIGN KEY referential-
integrity constraints.
You can freely mix InnoDB tables with tables from other MySQL storage
engines, even within the same statement. For example, you can use a join
operation to combine data from InnoDB and MEMORY tables in a single
query.
InnoDB has been designed for CPU efficiency and maximum performance
when processing large data volumes.
Source: http://dev.mysql.com/doc/refman/5.0/en/innodb-storage-engine.html
[Accessed: 03/05/2015]
Useful website
For additional resources on MySQL Checkpointing, access the
following links.
https://dev.mysql.com/doc/refman/5.5/en/innodb-checkpoints.html
https://dev.mysql.com/doc/refman/5.5/en/innodb-storage-engine.html
essential that log records are written before writing to the database, using the
write-ahead log protocol. If there is no ‘transaction commit’ record in the log,
then that transaction was active at failure and must be undone. Undo
operations are performed in reverse order to the order in which they were
written to the log (Connolly & Begg, 2015: Section 22.3.4).
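The undo step can be sketched in Python as a backwards scan of the log that restores before-images. The log record layout used here (tuples tagged "update" or "commit") and the sample values are assumptions for illustration; a real log also carries the management information described earlier.

```python
def undo_uncommitted(database, log):
    """Restore before-images of transactions that have no commit record.

    log is a list of records in the order they were written:
      ("update", txn, item, before_image, after_image) or ("commit", txn)
    """
    committed = {record[1] for record in log if record[0] == "commit"}
    for record in reversed(log):        # undo in reverse order of logging
        if record[0] == "update" and record[1] not in committed:
            _, _txn, item, before_image, _after_image = record
            database[item] = before_image
    return database


# Hypothetical state at the time of the crash: T2 committed, T1 did not.
log = [
    ("update", "T1", "balx", 100, 90),
    ("update", "T2", "baly", 50, 60),
    ("update", "T1", "balx", 90, 80),
    ("commit", "T2"),
]
restored = undo_uncommitted({"balx": 80, "baly": 60}, log)
```

Scanning the log backwards matters here: T1 updated balx twice, and only the reverse order leaves the item at its original value of 100 rather than the intermediate value of 90.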
Concluding remarks
In this unit, we examined concurrency control, database recovery, the protocols
that prevent conflicts in databases and the techniques used to ensure that
a database remains consistent in the event of failures.
We also discussed the scope of database security, the different types of
computer-based threats, and the security measures associated with DBMS and
the Web.
2.7 Self-assessment
6. Describe, with examples, the types of problems that can occur in a multi-
user environment when concurrent access to the database is allowed.
7. What is a timestamp? How do timestamp-based protocols for concurrency
control differ from locking-based protocols?
8. Discuss the difference between pessimistic and optimistic concurrency
control.
9. On the MySQL database you have created, enable expire_logs_days for
the database's log files and set the value to 7 days.
10. Checkpointing in MySQL is implemented by the InnoDB storage engine. On the
MySQL database you have created, change the InnoDB log file size to 32M.
11. Review and research the following article:
https://dev.mysql.com/doc/refman/5.5/en/storage-engines.html.
Learning outcomes:
LO3: Integrate a database with a software application or website.
Assessment criteria:
AC3.1: Produce an optimised logical and physical design for a
database of advanced complexity.
AC3.2: Develop and build the relational database.
AC3.3: Design, develop and build a third-party application that
interfaces with the database.
Learning objectives
After studying this unit, you should be able to:
Produce a logical and physical database design
Develop and build a relational database
Build and integrate a database to an application
Introduction
This unit considers database design and the integration of a database with an
application or website. The application or website is assumed to be developed
in a language of your own choice. Note that application or website design and
development is not covered in this module; it is to be carried out using
expertise gained from languages taught in other modules. If you have no prior
experience, you are encouraged to research application or website design and
development.
relations and foreign key attributes. Also, document any new primary or
alternate keys that have been formed as a result of the process of deriving
relations (Connolly & Begg, 2015: Appendix D).
centralised approach then Step 2.6 is omitted. If, however, the database has
multiple user views that are being managed using the view integration
approach, then Steps 2.1 to 2.5 are repeated for the required number of data
models, each of which represents different user views of the database system.
In Step 2.6 these data models are merged. Typical tasks associated with the
process of merging are as follows:
1. Review the names and contents of entities/relations and their candidate
keys.
2. Review the names and contents of relationships/foreign keys.
3. Merge entities/relations from the local data models.
4. Include (without merging) entities/relations unique to each local data
model.
5. Merge relationships/foreign keys from the local data models.
6. Include (without merging) relationships/foreign keys unique to each local
data model.
7. Check for missing entities/relations and relationships/foreign keys.
8. Check foreign keys.
9. Check integrity constraints.
10. Draw the global ER/relation diagram.
11. Update the documentation. Validate the relations created from the global
logical data model using the technique of normalisation and ensure that
they support the required transactions, if necessary.
(Connolly & Begg, 2015: Appendix D)
Concluding remarks
The database design process is composed of three main phases: conceptual,
logical and physical database design. After a database is built, it can be
integrated with an application system.
3.10 Self-assessment
Glossary
Access An example of a database development tool, produced by Microsoft
Add SQL keyword: command to add items to a composite object
Additive rule An inference rule for functional dependencies. Also called the Union Rule. If X → Y and X → Z then X → YZ. Contrast with the projective rule
Aggregate query An SQL query that returns some function of a collection of rows, rather than the rows themselves. See count, avg, min, max, sum
Alter SQL keyword: command to change an object specification
And SQL keyword: used as a Boolean operator to construct the where clause of a query
Anomaly An inconsistency in a database
Any SQL keyword. Used with nested queries to compare an attribute/attributes with the rows returned by the nested query, returning true if at least one row satisfies the comparison
As SQL keyword. An optional keyword used to specify aliases for attributes or tables in a query
Atomic data types The most simple forms of data, Boolean, string, integer, and so forth
Atomicity The property of a transaction that guarantees that either all or none of the changes made by the transaction are written to the database
Attribute The name of a column of a table, indicating the meaning of the data in that column
Authorisation SQL keyword used to specify the owner of a schema
Avg SQL keyword. Used to perform an aggregate query that returns the average value of an attribute
B+ tree A tree-based structure used for storing data in a database, with extra links to facilitate sequential access to data
Base relation A relation stored in a database, with data that is not calculated from data in other relations
Candidate key A synonym for key, in a relation schema with more than one key
Cardinality The number of elements of the relation (that is, rows in the table)
Cartesian Product Set of all ordered pairs (or triples, etc.) of elements from two (or three, etc.) sets. A natural join of two relations with no common attributes, whose tuples are all possible combinations of the tuples of the original relations
Cascade SQL keyword. Indicates behaviour of the RDBMS when an object is modified (deleted or changed) when there are other objects dependent on it
Cascaded SQL keyword. Used to indicate that a ‘check option’ added to a view should also apply to the views used to define it (if any)
Char SQL keyword. Synonym for character
Character SQL keyword: Domain of textual data
Check SQL keyword. Used in the creation of tables to specify complex conditions the data must satisfy
Check option SQL keyword. Used to ensure that modifications made to a view result in rows that still belong to the view
Column SQL keyword: used in add, alter or drop to denote a column (attribute) of a table
Commit The action that causes all of the changes made by a particular transaction to be reliably written to the database files
Composite The ‘whole’ in a whole-part relationship (aggregation)
Conceptual data models Any data model in which data is described independently of the logical model used to organise the data, instead relating the data to real-world concepts
Concurrency The capacity of a system to handle many users simultaneously
Impedance mismatch The fact that an SQL query returns whole blocks of data, but high-level general-purpose languages generally can only handle single items of data one at a time, and the problem of using the two approaches together
In SQL keyword. Used with nested queries. Attr in (Query) is equivalent to Attr = any (Query)
Index A structure (usually a tree structure) allowing quick access to data stored in the database via a key
Index SQL keyword. Used to create or drop an index for a database. Syntax: ‘create [unique] index IndexName on TableName(AttributeList)’ or ‘drop index IndexName’
Inner join SQL keyword: used to join two tables before selecting from them
Insert SQL keyword. Denotes the privilege of being able to add data to a table or view
Insert into SQL keyword. Used to add data to a table
Integer SQL keyword. Domain of integers
Integrity constraint A property that must be satisfied by all correct database instances. See Predicate, intra-relational constraint, inter-relational constraint
Intersect SQL keyword. Used to find the intersection of the output of two select statements (queries)
Intersection The intersection of two relations r1 and r2 is the set of tuples belonging to both r1 and r2 (contrast with union or difference)
Interval SQL keyword, domain of time intervals
Into SQL keyword. Part of the syntax of ‘fetch’ and of ‘execute’
Is SQL keyword: used in the where clause of a query: ‘is null’ or ‘is not null’
Isolation The property of a transaction that guarantees that the changes made by a transaction are isolated from the rest of the system until after the transaction has committed
Join SQL keyword: used to join two tables before selecting from them
Key A minimal superkey. That is, a set of attributes A on a relation r, such that there are no two distinct tuples t1 and t2 of r with t1[A] = t2[A]… and A does not contain any proper subset for which this statement still holds
Key constraint An (intra-relational) integrity constraint ensuring that a selected set of attributes forms a (super)key
Left outer join An outer join r1 LEFT r2 where dangling tuples from r1 are padded with blanks and inserted into the join
Locking A method for safely protecting data from being changed by two or more users (processes/threads) at the same time
Logical data model Any data model where a particular method of organisation is used to organise data
Logical Schema A description of a database according to the appropriate logical data model
Metadata Data about the structure of data
Microsoft Access An example of a database development tool, produced by Microsoft
Min SQL keyword. Used to perform an aggregate query that returns the minimum value of an attribute
Multiplicity Characteristic of a role. It indicates how many objects of one type fulfil the role for the object at the other end of the association
Natural join An operator combining tuples of two relations r1 and r2 on sets of attributes X1 and X2
Nested loop A ‘join method’ where the attributes of one table are looped through once for each tuple in the other
Nested query A select statement (SQL query) used as part of the where clause of another query, and used as a source of data against which to compare attributes
Normalisation The process of changing a database design to comply with the various normal forms
Normalisation algorithm An algorithm for taking an unnormalised relation and putting it into a higher normal form
Not exists SQL keyword. Used with nested queries. Not exists (Query) returns true if Query returns no rows at all
Not in SQL keyword. Used with nested queries. Attr not in (Query) is equivalent to Attr <> all (Query)
Not null SQL keyword. Constraint that the given attribute may not be null
Null SQL keyword indicating a null value
Null value A special value a tuple can assume on an attribute, denoting an absence of information
Numeric SQL keyword: domain of exact numbers, either integral or with a given number of decimal places
Object-oriented A computing programming paradigm which defines the computing problem to be solved as a set of objects which are members of various object classes each with its own set of data manipulation methods
Order by SQL keyword: Used to sort the output of a query
Outer join A natural join augmented with tuples derived from tuples of r1 or r2 for which no matching tuple in the other relation exists
Package Used to group together classes which are similar or related in some way, to ease the software development process
Persistent The lifespan of a database extends beyond that of the program using it
Pipelining A memory-saving technique of performing several operations tuple by tuple, and so not storing intermediate tables
PL/SQL An extension of SQL marketed by Oracle
Predicate A function associating a value True or False with an instance of a database
Primary key A key (in the second sense) that is constrained to not contain null values
Privilege A ‘permission’ to do something on some component of a database
Procedure SQL keyword. Used to define a procedure. Under standard SQL, a procedure may only contain a single SQL statement. Many DBMSs relax this restriction
Projection An operator that takes a relation and returns a new relation whose attributes are a subset of the original
Query A function mapping instances of a given database schema into relations on a given set of attributes
Query Language A language in which queries may be expressed
RAID Redundant Array of Inexpensive Disks. Hard-drive system with multiple disks allowing for various levels of reliability and recoverability
References SQL keyword: specifying a referential constraint
Referential constraint (also: foreign key constraint) A constraint ensuring, for a set of attributes A of a relation r1, and a corresponding set of attributes B of r2, where B is a key (typically the primary key) for r2, that for every tuple t1 of r1, there exists a tuple t2 of r2 for which t1[A] = t2[B]
Relation A subset of a Cartesian Product
Relation instance A relation, in the second sense
Relational data model A data model using tables (relations) to organise data
Relation schema The name of the relation R, and a set X of names of the attributes. Normally denoted R(X)
Relative SQL keyword. Used in a ‘fetch’ statement to move a given number of rows forwards or backwards in the query
Revoke SQL keyword. Used to remove privileges from users
Right outer join An outer join r1 RIGHT r2 where dangling tuples from r2 are padded
Bibliography
Barghouti, N.S. & Kaiser, G. 1991. Concurrency control in advanced database
applications. ACM computing surveys, 23(3).
Castano, S.; Fugini, M.; Martella, G. & Samarati, P. 1995. Database security.
Reading, Mass.: Addison-Wesley.
Chen, P.M. & Patterson, D.A. 1990. Maximising performance in a striped disk
array. In: Proceedings of 17th annual international symposium on computer
architecture.
Chen, P.M.; Lee, E.K.; Gibson, G.A.; Katz, R.H. & Patterson, D.A. 1994. RAID:
high-performance, reliable secondary storage. ACM computing surveys, 26(2).
http://blog.winhost.com
http://dev.mysql.com/doc/refman/5.5/en/index.html
http://tpc.org
http://www.abanet.org/scitech/ec/isc/dsg-tutorial.html
http://www.computerprivacy.org/who/
http://www.cve.mitre.org
http://www.mysqltutorial.org/
https://www.beastnode.com