
Advanced Database Systems
ITDA310
Compiled by Nyasha Magutsa

Quality assured by Ndai Mapaso

Edited by Isobel Coetzee

Version 1.0

NQF Level 7

Credit value: 12

January 2017
CTI EDUCATION GROUP


TABLE OF CONTENTS

INTRODUCTION .......................................................................................................... 1
Module aim ........................................................................................................... 1
Module abstract .................................................................................................... 1
Learning outcomes and assessment criteria ......................................................... 2
Summary of learning outcomes and assessment criteria ...................................... 3
Module content ..................................................................................................... 4
Lectures................................................................................................................ 5
Class exercises and activities ............................................................................... 5
Information resources .......................................................................................... 5
Prescribed textbook ............................................................................................. 6
Recommended information sources ...................................................................... 6
Books ................................................................................................................. 6
Websites ................................................................................................................ 7
Software ............................................................................................................... 7
Using this Study Guide ......................................................................................... 8
Purpose ................................................................................................................ 8
Structure .............................................................................................................. 9
Individual units .................................................................................................... 9
Glossary ............................................................................................................. 10
The use of icons .................................................................................................. 10
Alignment to prescribed textbook ...................................................................... 10
Concluding remarks ............................................................................................ 10
UNIT 1: PRINCIPLES, FUNCTIONS AND APPLICATIONS OF A DDMS ......................... 11
Learning objectives ............................................................................................ 11
Introduction ....................................................................................................... 11
1.1 Entity-Relationship (ER) modelling ......................................................... 11
1.2 Database relations .................................................................................. 21
1.2.1 Properties of relations .................................................................................21
1.2.2 Relational keys ...........................................................................................21
1.2.3 Representing relational database schemas .....................................................22
1.3 Integrity constraints ............................................................................... 22
1.3.1 Null ...........................................................................................................22
1.3.2 Entity integrity ...........................................................................................23
1.3.3 Referential integrity ....................................................................................24
1.3.4 General constraints .....................................................................................26
1.4 Integrity enhancement feature ............................................................... 27
1.4.1 Required data.............................................................................................27
1.4.2 Domain constraints .....................................................................................28
1.4.3 Entity integrity ...........................................................................................29
1.4.4 Referential integrity ....................................................................................29
1.4.5 General constraints .....................................................................................29
1.5 Views, triggers and stored procedures ................................................... 30
1.5.1 Views ........................................................................................................30
1.5.2 Triggers .....................................................................................................31
1.5.3 Stored procedures ......................................................................................33
Concluding remarks ............................................................................................ 35
1.6 Self-assessment ..................................................................................... 35
UNIT 2: DATABASE SECURITY, CONCURRENCY CONTROL AND RECOVERY ................ 37
Learning objectives ............................................................................................ 37
Introduction ....................................................................................................... 37
2.1 Database security ................................................................................... 38
2.1.1 Threat .......................................................................................................38
2.2 Countermeasures: computer-based controls .......................................... 39
2.2.1 Authorisation .............................................................................................40
2.2.2 Access control ............................................................................................40
2.2.3 View .........................................................................................................40
2.2.4 Backup and recovery ...................................................................................40
2.2.5 Journaling ..................................................................................................42
2.2.6 Integrity ....................................................................................................42
2.2.7 Encryption .................................................................................................42
2.2.8 Redundant Array of Independent Disks (RAID) ...............................................42
2.3 Security in MySQL DBMS ......................................................................... 43
2.3.1 Securing a MySQL database using a password ................................................43
2.3.2 Administrative roles ....................................................................................44
2.3.3 Global privileges .........................................................................................44
2.3.4 Setting the Insert, Select and Update privileges .............................................45
2.3.5 Log On dialog box .......................................................................................46
2.4 DBMS and Web security .......................................................................... 46
2.4.1 How Secure Electronic Transactions (SET) works ............................................47
2.5 Concurrency control................................................................................ 47
2.5.1 Potential problems caused by concurrency .....................................................47
2.5.2 Concurrency control techniques ....................................................................49
2.5.3 Timestamping ............................................................................................55
2.5.4 Multiversion timestamp ordering ...................................................................56
2.5.5 Optimistic techniques ..................................................................................56
2.6 Database recovery .................................................................................. 57
2.6.1 The need for recovery .................................................................................57
2.6.2 Transaction and recovery .............................................................................57
2.6.3 Recovery facilities .......................................................................................58
2.6.4 Recovery techniques ...................................................................................61
Concluding remarks ............................................................................................ 62
2.7 Self-assessment ..................................................................................... 62
UNIT 3: INTEGRATE A DATABASE WITH AN APPLICATION ....................................... 65
Learning objectives ............................................................................................ 65
Introduction ....................................................................................................... 65
3.1 Step 1: Build conceptual data model ....................................................... 66
Step 1.1: Identify entity types .................................................................................66
Step 1.2: Identify relationship types .........................................................................66
Step 1.3: Identify and associate attributes with entity or relationship types ...................66
Step 1.4: Determine attribute domains .....................................................................67
Step 1.5: Determine candidate, primary, and alternate key attributes ...........................67
Step 1.6: Consider use of enhanced modelling concepts (optional step) ........................67
Step 1.7: Check model for redundancy ......................................................................67
Step 1.8: Validate conceptual model against user transactions .....................................67
Step 1.9: Review conceptual data model with user .....................................................67
3.2 Step 2: Build and validate logical data model ......................................... 67
Step 2.1: Derive relations for logical data model ........................................................67
Step 2.2: Validate relations using normalisation .........................................................69
Step 2.3: Validate relations against user transactions .................................................69
Step 2.4: Check integrity constraints ........................................................................69
Step 2.5: Review logical data model with user ...........................................................69
Step 2.6: Merge logical data models into global model ................................................69
Step 2.7: Check for future growth ............................................................................70
3.3 Step 3: Translate logical data model for target DBMS ............................. 70
Step 3.1: Design base relations ................................................................................70
Step 3.2: Design representations of derived data .......................................................70
Step 3.3: Design general constraints.........................................................................71
3.4 Step 4: Design file organisations and indexes......................................... 71
Step 4.1: Analyse transactions .................................................................................71
Step 4.2: Choose file organisations ...........................................................................71
Step 4.3: Choose indexes ........................................................................................71
Step 4.4: Estimate disk space requirements...............................................................71
3.5 Step 5: Design user views ....................................................................... 71
3.6 Step 6: Design security mechanisms ....................................................... 71
3.7 Step 7: Consider the introduction of controlled redundancy ................... 72
3.8 Step 8: Monitor and tune the operational system ................................... 72
3.9 Step 9: Build and integrate an application/website ................................ 72
Concluding remarks ............................................................................................ 75
3.10 Self-assessment ..................................................................................... 76
GLOSSARY ................................................................................................................ 77
BIBLIOGRAPHY ........................................................................................................ 85
Introduction Page 1

Introduction
The primary aim of this module is to provide you with a deeper understanding
of relational databases and a basic understanding of distributed databases.

The database approach arose because, in earlier file-based systems:

 The definition of data was embedded in application programs, rather than stored separately and independently.
 There was no control over access and manipulation of data beyond that imposed by application programs.

The solution to these problems was the database management system (DBMS).


At the centre of a DBMS is the database itself. A database is a shared
collection of logically related data (and a description of this data), designed to
meet the information needs of an organisation. The system catalogue (metadata)
provides a description of the data, enabling program-data independence.
Logically related data comprises the entities, attributes and relationships of an
organisation's information (Connolly & Begg, 2015:63).
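The role of the system catalogue can be illustrated with a few lines of code. The sketch below is an assumption-laden stand-in: it uses Python's built-in sqlite3 module rather than MySQL (which exposes its catalogue through the information_schema database instead), and the student table is hypothetical:

```python
import sqlite3

# In-memory database for illustration; the 'student' table is hypothetical.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE student (id INTEGER PRIMARY KEY, name TEXT)")

# The system catalogue stores a description of the data (metadata)
# separately from any application program -- program-data independence.
name, ddl = con.execute(
    "SELECT name, sql FROM sqlite_master WHERE type = 'table'"
).fetchone()
print(name)  # student
print(ddl)   # the DDL recorded in the catalogue
```

In MySQL, the equivalent query would target the information_schema.TABLES and information_schema.COLUMNS catalogue tables.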

This module only addresses the design and development of a MySQL database.
As a final deliverable, the student is expected to design and develop an
application or website that integrates with the MySQL database, using a
language in which they are proficient. That development language is not
addressed in this module, so the student must supplement it by their own
research. The module demonstrates the integration of MySQL with MS Visual
Studio; the student should research how their chosen application or website
development language integrates with MySQL. This research is encouraged
early in the module, as it is key to the final deliverable.

The following are some of the domain areas in which a database and an
application/website can be developed:
 Health record management
 Student record management
 Library record management
 Online shopping website
 Online reservation website

Module aim
The primary aim of this module is to provide students with an understanding of
relational and distributed databases in relation to applications.

Module abstract
The scope of this module covers the practical application of a database system
with third-party applications. The outcome is to demonstrate expertise in
integrating a database system with a web-based or a standalone application.
The module ultimately exhibits the student's gained expertise in the
information technology sphere by applying knowledge in a practical setting.
The scope promotes the creativity of students and launches them into an
innovation mode that prepares them for the career ahead. Today's world
embraces technically innovative students who can offer innovative solutions to
the challenges facing industries, modernise current processes to improve
efficiency and transform the ordinary into the extraordinary. This module
requires students to exhibit this skill by integrating a database system with an
application or website.

Learning outcomes and assessment criteria


On successful completion of this module, you will:

1. Demonstrate a thorough understanding of principles, functioning and
   applications of distributed database management systems.
2. Compare and contrast database recovery, concurrency control, security
   and data integrity measures for centralised and distributed databases.
3. Integrate a database and a software application or website.

© CTI Education Group



The following table outlines the assessment criteria that are aligned to the
learning outcomes.

Summary of learning outcomes and assessment criteria

Learning outcome 1: Demonstrate a thorough understanding of principles,
functioning and applications of distributed database management systems.
Assessment criteria to pass – you can:
1.1 Demonstrate how to use entity-relationship (ER) modelling in database
    design as well as the basic concepts associated with the ER model.
1.2 Demonstrate how to use relational integrity rules, including entity and
    referential integrity, in a distributed database management system.

Learning outcome 2: Compare and contrast database recovery, concurrency
control, security and data integrity measures for centralised and distributed
databases.
Assessment criteria to pass – you can:
2.1 Discuss the scope of database security; compare and contrast the types
    of threat that may affect a distributed database system.
2.2 Compare and contrast a range of computer-based controls that are
    available as countermeasures to such threats.
2.3 Compare and contrast security measures associated with database
    systems and the web.
2.4 Compare and contrast concurrency controls and examine the protocols
    that can be used to prevent conflict.
2.5 Compare and contrast database recovery options and examine the
    techniques that can be used to ensure a distributed database remains in
    a consistent state in the presence of failures.

Learning outcome 3: Integrate a database with a software application or
website.
Assessment criteria to pass – you can:
3.1 Produce an optimised logical and physical design for a database of
    advanced complexity.
3.2 Develop and build the relational database.
3.3 Design, develop and build a third-party application that interfaces with
    the database.

These outcomes are covered in the module content and are assessed in the
form of a written assignment and a practical. If you comply with and achieve
all the pass criteria related to the outcomes, you will pass this module.

Learning and assessment may be performed across modules, at module level
or at outcome level. Evidence may be required at outcome level, although
opportunities exist for covering more than one outcome in an assignment.


Module content
1. Demonstrate a thorough understanding of principles, functioning and
   applications of distributed database management systems

Entity-relationship (ER) modelling in database design
 Creating entities; Primary and foreign keys; Determining relationships;
  Hierarchies; Inheritance; Reference data

Basic concepts associated with the ER model
 Recursive relationships; Mandatory relationships; Optional relationships

Using relational integrity rules, including entity and referential integrity, in a
distributed database management system
 Business rules; Enforcing business rules; Using triggers; Using
  validation rules; Locks; Transactions; Views

2. Compare and contrast database recovery, concurrency control, security
   and data integrity measures for centralised and distributed databases

Database security and types of threats in a distributed database
management system
 Security; Database access; User privileges and access levels; Roles;
  Types of threats in a DDMS environment

Computer-based controls available as countermeasures to threats in a
distributed database management system
 Proactive measures to counter threats; Physical and logical design
  countermeasure considerations

Security measures associated with database systems and the web
 Security integration measures at various levels

Concurrency controls and protocols that can be used to prevent conflict
 Concurrency controls to prevent conflict; Concurrency protocols to
  prevent conflict

Database recovery options
 Database recovery models; Data backup options for recovery; Database
  design for recovery; Application design for recovery; Infrastructure
  design for recovery

3. Integrate a database and a software application or website

Optimised logical and physical design for a database of advanced complexity
 Optimised logical database design; Optimised physical database design

Develop and build the relational database
 Practical – develop and build a relational database using all the
  concepts covered in this module

Design, develop and build a third-party application that interfaces with the
database
 Practical – integration of applications
 Practical – develop and build a third-party application that integrates
  with the database

Lectures
Each week has four compulsory lecture hours for all students. It is
recommended that the lecture hours be divided into two sessions of two
hours each, but this may vary depending on the campus.

Each week has a lecture schedule which indicates the approximate time that
should be allocated to each activity. The week's work schedule has also been
divided into two lessons.

Class exercises and activities


Students will be required to complete a number of exercises and activities in
class. These activities and exercises may also contribute to obtaining pass,
merit or distinction criteria; it is therefore important that students are present
in class so that they do not forfeit the opportunity to be exposed to such
exercises and activities.

Activity sheets that are handed in should be kept by the lecturer so that they
can be used as proof of criteria that were met, if necessary.

Information resources
You should have access to a resource centre or library with a wide range of
relevant resources. Resources can include textbooks, e-books, newspaper
articles, journal articles, organisational publications, databases, etc. You can
access a range of academic journals in electronic format via EBSCOhost. You
will have to ask a campus librarian to assist you with accessing EBSCOhost.


Prescribed textbook
There is no prescribed book for this module.

Recommended information sources


Books
Bell, D. & Grimson, J. 1992. Distributed database systems. Harlow: Addison-Wesley.

Bernstein, P.A.; Hadzilacos, V. & Goodman, N. 1988. Concurrency control and recovery in database systems. Reading, MA: Addison-Wesley.

Castano, S.; Fugini, M.; Martella, G. & Samarati, P. 1995. Database security. Reading, MA: Addison-Wesley.

Chin, F. & Ozsoyoglu, G. 1981. Statistical database design. ACM Trans. Database Systems, 6(1):113-139.

Codd, E.F. 1982. The 1981 ACM Turing Award lecture: relational database: a practical foundation for productivity. Comm. ACM, 25(2):109-117.

Connolly, T. & Begg, C. 2015. Database systems: a practical approach to design, implementation and management. 6th edition, Global Edition. Harlow: Pearson Education Limited.

Connolly, T.; Begg, C. & Holowczak, R. 2008. Business database systems. New York: Addison-Wesley.

Davies Jr., J.C. 1973. Recovery semantics for a DB/DC system. Proc. ACM Annual Conf., 136-141.

Freytag, J.C.; Maier, D. & Vossen, G. 1994. Query processing for advanced database systems. San Mateo, CA: Morgan Kaufmann.

Gray, J.N. 1981. The transaction concept: virtues and limitations. Proc. Int. Conf. Very Large Data Bases, 144-154.

Knapp, E. 1987. Deadlock detection in distributed databases. ACM Computing Surv., 19(4):303-328.

Patrick, J.J. 2002. SQL fundamentals. 2nd edition. Upper Saddle River, NJ: Prentice Hall.

Rob, P. & Coronel, C. 2012. Database systems: design, implementation and management. 10th edition. Boston: Course Technology, Thomson Learning.

Schmidt, J. & Swenson, J. 1975. On the semantics of the relational model. Proc. ACM SIGMOD Int. Conf. on Management of Data, 9-36.

Stonebraker, M. 1979. Concurrency control and consistency of multiple copies of data in distributed INGRES. IEEE Trans. Software Engineering, 5(3):180-194.

Wertz, C.J. 1993. Relational database design: a practitioner's guide. New York: CRC Press.

Yu, C. 1997. Principles of database query processing for advanced applications. San Francisco, CA: Morgan Kaufmann.

Websites
http://blog.winhost.com

http://www.abanet.org/scitech/ec/isc/dsg-tutorial.html

https://www.beastnode.com

http://www.computerprivacy.org/who/

http://www.cve.mitre.org

http://dev.mysql.com/doc/refman/5.5/en/index.html

http://www.mysqltutorial.org/

http://tpc.org

Note
 Web pages provide access to a further range of Internet information sources.
 Lecturers may download the web-related material for students to access offline.
 Students must use this resource with care, justifying the use of information gathered.

Software
MySQL 5.6 database software or higher is to be used.
Follow these steps:
 Download MySQL for Windows from http://dev.mysql.com/downloads
 Install MySQL with an administrator account:
o Install MySQL on Windows using the Microsoft Software Installer package:
 Download and start the MySQL Installation Wizard
 Click Custom Installation > OK
o Install additional optional MySQL components:
 ODBC driver
 Connector/NET driver
 Disable antivirus scanning on the main MySQL data directory (datadir)
 Disable antivirus scanning on the temporary MySQL data directory (tmpdir)

MySQL has both a graphical user interface (GUI) and a command-line
interface (CLI).

Using this Study Guide


The Study Guide is a source of information for this module. The module is
research-based, so alternative sources should also be consulted.

The module outline must be read in conjunction with the study guide and
prescribed textbook (if applicable). This document will be the first port of call
in understanding what will be assessed and which assessments form part of
the module.

The purpose of the module outline is to highlight:
 The learning outcomes and assessment criteria that need to be met to pass
the module
 The assessment required to be completed for the module
 The additional resources required for the module
 The topics that will be focused on for the module

The purpose of the Study Guide is to facilitate your learning and help you to
master the content of the module. It helps you to structure your learning and
manage your time, and it provides outcomes and activities to help you master
the stated outcomes.

The Study Guide has been carefully designed to optimise your study time and
maximise your learning, so that your learning experience is as meaningful and
successful as possible. To deepen your learning and enhance your chances of
success, it is important that you read the Study Guide attentively and follow all
the instructions carefully. Pay special attention to the course outcomes at the
beginning of the Study Guide and at the beginning of each unit.

It is essential that you complete the exercises and other learning activities in
the Study Guide as your course assessments (practical and assignment) will be
based on the assumption that you have completed these activities.

The Study Guide should be read in conjunction with different research sources.

Purpose
The purpose of the Study Guide is to facilitate the learning process, helping
you to structure your learning and master the content of the module. The
alternative research material covers most areas in detail.


Where applicable, we give more simplified explanations in the Study Guide. It
is important for you to work through the research material and the Study
Guide attentively and to follow all the instructions set out in the Study Guide.
In this way, you should be able to deepen your learning and enhance your
chances of success.

Structure
The Study Guide is structured as follows:

Introduction
Unit 1: Principles, functions and applications of a DDMS
Unit 2: Database security, concurrency control and recovery
Unit 3: Integrate a database with an application
Glossary
Bibliography

Individual units
The individual units in the Study Guide are structured in the same way and
each unit contains the following features, which should enhance your learning
process:

Unit title: Each unit title is based on the title and content of a specific
outcome or assessment criterion (criteria) as discussed in the unit.

Learning outcomes and assessment criteria: The unit title is followed by an
outline of the learning outcomes and assessment criteria, which will guide
your learning process. It is important for you to become familiar with these,
because they represent the overall purpose of the module as well as the end
product of what you should have learnt in the unit.

Learning objectives: Learning objectives, which follow the learning outcomes
and assessment criteria, are statements that define the expected goal of the
unit in terms of the specific knowledge and skills that you should acquire as a
result of mastering the unit content. Learning objectives clarify, organise and
prioritise learning, and they help you to evaluate your own progress, thereby
taking responsibility for your learning.

Introduction: The unit opens with an introduction that identifies the key
concepts of the unit.

Content: The content of each unit contains the theoretical foundation of the
module and is based on the work of experts in the field. The theory is
illustrated by means of relevant examples.

Concluding remarks: The concluding remarks at the end of each unit provide
a brief summary of the unit as well as an indication of what you can expect in
the following unit.

Self-assessment: The unit ends off with a number of theoretical
self-assessment questions that test your knowledge of the content of the unit.


Glossary
As you can see, we include a glossary at the end of the Study Guide. Please
refer to it as often as necessary in order to familiarise yourself with the exact
meaning of terms and concepts involved in Advanced Database Systems.

The use of icons


Icons are used to highlight (emphasise) particular sections or points in the
Study Guide, to draw your attention to important aspects of the work, or to
highlight activities. The following icons are used in the Study Guide:

Definition
This icon appears when definitions of a particular term or concept
are given in the text.
Example
This icon points to a section in the text where relevant examples
for a particular topic (theme) or concept are provided.
Learning outcome alignment
This icon is used to indicate how individual units in the Study Guide
are aligned to a specific outcome and its assessment criteria.
Test your knowledge
This icon appears at the end of each unit in the Study Guide,
indicating that you are required to answer self-assessment
questions to test your knowledge of the content of the foregoing
unit.
Alignment to prescribed textbook
This module does not have a prescribed textbook.
Concluding remarks
At this point, you should be familiar with the module design and structure as
well as with the use of the Study Guide.
Unit 1: Principles, functions and applications of a DDMS
Unit 1 is aligned with the following learning outcomes and assessment criteria:
Learning outcomes:
LO1: Demonstrate a thorough understanding of principles,
functioning and applications of distributed database
management systems

Assessment criteria:
AC1.1: Demonstrate how to use entity-relationship (ER) modelling in
database design as well as the basic concepts associated
with the ER model
AC1.2: Demonstrate how to use relational integrity rules, including
entity and referential integrity in a distributed database
management system

Learning objectives
After studying this unit, you should be able to:
 Understand basic entity-relationship model concepts
 Understand and design an entity-relationship model
 Discuss the relational integrity rules, including entity integrity and
referential integrity

Introduction
This unit introduces the concepts behind the relational model, the most popular
data model at present, and the one most often chosen for standard business
applications. The relational integrity rules, entity integrity, views and
referential integrity are discussed.

1.1 Entity-Relationship (ER) modelling
This section introduces the basic concepts of the ER model: entities,
relationships and attributes. There will also be an illustration of how basic ER
concepts are represented graphically, in an ER diagram using UML. We
differentiate between weak and strong entities and discuss how attributes
normally associated with entities can be assigned to relationships. You will also
come to understand the structural constraints associated with relationships.

Below is an explanation of how to create ER models using alternative notations.


Figure 1 – Chen notation for ER modelling


Figure 1 cont.
Source: Connolly & Begg (2015:C-2–C-3)


The following diagram, Figure 2, has been derived from the DreamHome Case Study from Connolly and Begg (2015: Section 11.4). It shows the Chen notation.

The scenario depicted in the diagram below has been summarised as follows:
 A Supervisor who is a Staff member manages a Branch.
o In this relationship, the Branch's existence is mandatory in order for the Supervisor to be able to manage it.
 A Supervisor who is a Staff member supervises many Supervisees, who are also Staff members.
 A Branch has many Staff members.
o In this relationship, the Branch's and the Staff member's existence is mandatory in order for the relationship to exist.
 A Staff member registers Clients.
 A Branch registers Clients.
o In this relationship, the Branch's and the Client's existence is mandatory in order for the registration to take place.
 A Client states their Preference.
o In this relationship, the Client's and the Preference's existence is mandatory in order for the relationship to exist.
Figure 2 – ERD using Chen notation
Source: Connolly & Begg (2015:C-4)
Crow’s foot notation of ER modelling
Crow's foot notation differs from Chen's notation in the following ways:
 Chen's notation depicts a conceptual model, whereas Crow's foot notation depicts an implementation-oriented model.
 Chen's notation depicts a relationship as a diamond, whereas Crow's foot notation depicts it as a line.
 Chen's notation tends to consume space, as it depicts an entity and its attributes as separate shapes, whereas Crow's foot notation depicts all of these in a single entity diagram.
 Crow's foot notation can only express cardinalities of 0, 1 and N (many), whereas Chen's notation can express more precise cardinalities.
 Chen's notation can be confusing, whereas Crow's foot notation is easy to understand.
Chen's notation is no longer dominant, as many commercial modelling tools now use Crow's foot notation.

Figure 3 – Crow’s foot notation for ER modelling
Source: Connolly & Begg (2015:C-4)

Figure 3 cont.
Source: Connolly & Begg (2015:C-5)

The following diagram, Figure 4, has been derived from the DreamHome Case Study from Connolly and Begg (2015: Section 11.4). It shows the equivalent Crow's foot notation of Figure 2.

Figure 4 – ERD using Crow’s foot notation
Source: Connolly & Begg (2015:C-6)

The following has to be noted when constructing an ERD using Crow's foot notation:
 The name of an entity is normally a singular noun.
 The first letter of an entity name should be uppercase.
 The relationship name should describe its function, preferably using a verb or a short phrase including a verb.
 The first letter of each word in a relationship name should be capitalised.
 The relationship should be named so that it reads sensibly in one direction.

The following diagram, Figure 5, has been derived from the DreamHome Case Study from Connolly and Begg (2015: Section 11.4). It shows an enhanced ERD based on Figure 2.


Figure 5 – Enhanced ERD
Source: Connolly & Begg (2015:444)


The definitions below are according to Connolly and Begg (2015: Section 12) and Chen (1976). The DreamHome Case Study from Connolly and Begg (2015: Section 11.4) can be referred to for all of the concepts below.

 An entity type is a group of objects with the same properties, which are identified by the enterprise as having an independent existence. An entity occurrence is a uniquely identifiable object of an entity type.
o For example, in Figure 5, entities with a physical existence include Supervisor, Client and Owner.

 A relationship type is a set of meaningful associations among entity types.
o For example, in Figure 5, the relationship Owns associates the Owner and PropertyForRent entities.

 A relationship occurrence is a uniquely identifiable association, which includes one occurrence from each participating entity type.
o For example, in Figure 5, a relationship occurrence is Branch Has Staff.

 The degree of a relationship type is the number of participating entity types in a relationship.
o For example, in Figure 5, a ternary relationship involves three entity types: a Staff member Registers a Client at a Branch.

 A recursive relationship is a relationship type in which the same entity type participates more than once in different roles.
o For example, in Figure 4, a Staff member (Supervisor) supervises a Staff member (Supervisee).

 An attribute is a property of an entity or a relationship type.
o For example, in Figure 5, a Client entity has a clientNo attribute.

 An attribute domain is the set of allowable values for one or more attributes.
o For example, in Figure 5, branchNo may only take values from the defined set (Brn001, Brn002, Brn003); any branchNo outside this set is invalid.

 A simple attribute is composed of a single component with an independent existence.

 A composite attribute is composed of multiple components, each with an independent existence.

 A single-valued attribute holds a single value for each occurrence of an entity type.

 A multi-valued attribute holds multiple values for each occurrence of an entity type.

 A derived attribute represents a value that is derivable from the value of a related attribute or set of attributes, not necessarily in the same entity.

 A candidate key is the minimal set of attributes that uniquely identifies each occurrence of an entity type.

 A primary key is the candidate key that is selected to uniquely identify each occurrence of an entity type.
o For example, in Figure 5, branchNo is the primary key for the Branch entity.

 A composite key is a candidate key that consists of two or more attributes.

 A strong entity type is not existence-dependent on some other entity type. A weak entity type is existence-dependent on some other entity type.

 Multiplicity is the number (or range) of possible occurrences of an entity type that may relate to a single occurrence of an associated entity type through a particular relationship.

 Multiplicity for a complex relationship is the number (or range) of possible occurrences of an entity type in an n-ary relationship when the other (n−1) values are fixed.

 Cardinality describes the maximum number of possible relationship occurrences for an entity participating in a given relationship type.

 Participation determines whether all or only some entity occurrences participate in a given relationship.

 A fan trap exists where a model represents a relationship between entity types, but the pathway between certain entity occurrences is ambiguous.

 A chasm trap exists where a model suggests the existence of a relationship between entity types, but the pathway does not exist between certain entity occurrences.


1.2 Database relations

Definition
We can define a relation schema as a named relation defined by a set of attribute and domain name pairs. A relational database schema is a set of relation schemas, each with a distinct name (Connolly & Begg, 2015:156).

1.2.1 Properties of relations
According to Connolly and Begg (2015:156–157), a relation has the following
properties:
 The relation name is distinct from all other relation names in the relational
schema.
 Each cell of a relation contains exactly one atomic (single) value.
 Each attribute has a distinct name.
 Values of an attribute are all from the same domain.
 Each tuple is distinct; there are no duplicate tuples.
 The order of attributes has no significance.
 The order of tuples has no significance, theoretically.

1.2.2 Relational keys
There are no duplicate tuples within a relation. Therefore, we need to be able to identify one or more attributes (called relational keys) that uniquely identify each tuple in a relation (Connolly & Begg, 2015:158).

In this section, we explain the terminology used for relational keys by Connolly and Begg (2015:158–159):

Super key
 An attribute or set of attributes that uniquely identifies a tuple within a relation

Candidate key
 A super key (K) such that no proper subset of K is a super key within the relation
 In each tuple of a relation R, the values of K uniquely identify that tuple (uniqueness)
 No proper subset of K has the uniqueness property (irreducibility)

Primary key
 The candidate key selected to identify tuples uniquely within the relation

Alternate keys
 Candidate keys that are not selected to be the primary key

Foreign key
 An attribute or set of attributes within one relation that matches a candidate key of some (possibly the same) relation


1.2.3 Representing relational database schemas
The common convention for representing a relation schema is to give the name of the relation, followed by the attribute names in parentheses. Normally, the primary key is underlined. The conceptual model, or conceptual schema, is the set of all such schemas for the database.

A relational database consists of any number of normalised relations. The relational schema for part of the DreamHome case study (Connolly & Begg, 2015:536–537) is:

Staff (staffNo, fName, lName, position, sex, DOB, supervisorStaffNo)
PropertyForRent (propertyNo, street, city, postcode, type, rooms, rent,
ownerNo, staffNo)
Client (clientNo, fName, lName, telNo, prefType, maxRent, eMail,
staffNo)
PrivateOwner (ownerNo, fName, lName, address, telNo)
BusinessOwner (ownerNo, bName, bType, contactName, address, telNo)
Viewing (clientNo, propertyNo, dateView, comment)
Lease (leaseNo, paymentMethod, depositPaid, rentStart,
rentFinish, clientNo, propertyNo)

1.3 Integrity constraints
A data model has two other parts: a manipulative part, defining the types of operation that are allowed on the data, and a set of integrity constraints, which ensure that the data is accurate. Because every attribute has an associated domain, there are constraints (domain constraints) that form restrictions on the set of values allowed for the attributes of relations. In addition, there are two important integrity rules, which are constraints or restrictions that apply to all instances of the database (Connolly & Begg, 2015:161).

The two principal rules for the relational model are known as entity integrity
and referential integrity. Other types of integrity constraint are multiplicity and
general constraints. Before we define entity and referential integrity, it is
necessary to understand the concept of nulls (Connolly & Begg, 2015:161).

1.3.1 Null
A null represents a value for an attribute that is currently unknown or not applicable for a tuple. It deals with incomplete or exceptional data and represents the absence of a value; it is not the same as zero or spaces, which are values (Connolly & Begg, 2015:161).

Below is an example of how the null integrity constraint is defined in MySQL: the "Not Null" checkbox is left unchecked for an attribute, and "NULL" is specified as the default value.


Figure 6 – Defining Null Integrity constraint in MySQL
Source: Nyasha Magutsa
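The behaviour described above can also be demonstrated in a runnable form. The sketch below uses SQLite via Python's built-in sqlite3 module purely for illustration (the table and values are invented for the example); the point it makes, that NULL is distinct from an empty string or zero, holds in MySQL as well.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# telNo carries no NOT NULL constraint, so it accepts nulls
conn.execute("CREATE TABLE Client (clientNo TEXT, telNo TEXT)")

# NULL represents 'unknown or not applicable'; '' is an ordinary value
conn.execute("INSERT INTO Client VALUES ('CR76', NULL)")
conn.execute("INSERT INTO Client VALUES ('CR74', '')")

stored = conn.execute("SELECT telNo FROM Client WHERE clientNo = 'CR76'").fetchone()[0]
empty = conn.execute("SELECT telNo FROM Client WHERE clientNo = 'CR74'").fetchone()[0]
print(stored is None, empty == "")  # → True True: NULL maps to Python None
```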

1.3.2 Entity integrity

Definition
The first integrity rule applies to the primary keys of base
relations. For the moment, we will define a base relation as a
relation that corresponds to an entity in the conceptual schema
(Connolly & Begg, 2015:162).

Entity integrity states that, in a base relation, no attribute of a primary key can be null. By definition, a primary key is a minimal identifier that is used to identify tuples uniquely. This means that no subset of the primary key is sufficient to provide unique identification of tuples. If we allow a null for any part of a primary key, we are implying that not all the attributes are needed to distinguish between tuples, which contradicts the definition of the primary key (Connolly & Begg, 2015:162).

Below is an example of how the entity integrity constraint is defined in MySQL. When the "Primary" checkbox is checked for an attribute, the "Not Null" checkbox is automatically checked by the system, which enforces the rule that a primary key may not be null.


Figure 7 – Defining Entity Integrity constraint in MySQL
Source: Nyasha Magutsa
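The same rule can be exercised in code. The illustrative sketch below uses Python's sqlite3 module; note that, unlike MySQL, SQLite does not automatically make a non-integer primary key NOT NULL, so the constraint is spelled out explicitly here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# In MySQL a PRIMARY KEY column is implicitly NOT NULL; SQLite needs it stated
conn.execute("CREATE TABLE Staff (staffNo TEXT PRIMARY KEY NOT NULL, fName TEXT)")

try:
    conn.execute("INSERT INTO Staff VALUES (NULL, 'John')")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True  # entity integrity: no part of a primary key may be null
print(rejected)  # → True
```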

1.3.3 Referential integrity
The second integrity rule applies to foreign keys. If a foreign key exists in a
relation, either the foreign key value must match a candidate key value of
some tuple in its home relation, or the foreign key value must be wholly null
(Connolly & Begg, 2015:162).

Below is an example of how the referential integrity constraint is defined in MySQL. The field has to be a primary key in a table: for example, the field country_id is a primary key in the table Country.


Figure 8 – Defining Referential Integrity constraint in MySQL – Define primary key
Source: Nyasha Magutsa

This field country_id is then defined in another table City.

Figure 9 – Defining Referential Integrity constraint in MySQL – Define field in another table
Source: Nyasha Magutsa


MySQL will require the field country_id to be defined and linked as a foreign
key in the table City in order for the system to implement this Referential
integrity constraint.

Below is how the foreign key is defined in the City table. The table in which the field resides as a primary key (Country) is referenced, the corresponding fields to be linked are selected, and the action on update or delete is specified.

Figure 10 – Defining Referential Integrity constraint in MySQL – Link the field as Foreign key in the other table
Source: Nyasha Magutsa
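The Country/City relationship above can be sketched in runnable form. This illustration uses Python's sqlite3 module (where foreign-key checking must be switched on explicitly); MySQL's InnoDB engine enforces the same rule by default.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite needs FK checking enabled

conn.execute("CREATE TABLE Country (country_id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("""CREATE TABLE City (
    city_id INTEGER PRIMARY KEY,
    name TEXT,
    country_id INTEGER REFERENCES Country(country_id))""")

conn.execute("INSERT INTO Country VALUES (1, 'South Africa')")
conn.execute("INSERT INTO City VALUES (10, 'Pretoria', 1)")  # parent exists: accepted

try:
    conn.execute("INSERT INTO City VALUES (11, 'Harare', 99)")  # no Country 99
    rejected = False
except sqlite3.IntegrityError:
    rejected = True  # referential integrity: the FK must match a parent row
print(rejected)  # → True
```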

1.3.4 General constraints
General constraints are additional rules, specified by users or database administrators, that define or constrain some aspect of the enterprise. It is also possible for users to specify additional constraints that the data must satisfy (Connolly & Begg, 2015:163). These constraints can be implemented using triggers.


Figure 11 – Defining General Integrity constraint in MySQL – Specific postal code values are accepted
Source: Nyasha Magutsa
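A trigger-based general constraint can be sketched as follows. The "postcodes must be 4 characters" rule is a hypothetical example invented for illustration, and SQLite (via Python's sqlite3 module) is used in place of MySQL; a MySQL trigger would raise the error with SIGNAL instead of RAISE.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Branch (branchNo TEXT PRIMARY KEY, postcode TEXT)")

# General constraint enforced by a trigger: reject postcodes that are not
# exactly 4 characters long (hypothetical business rule)
conn.execute("""
CREATE TRIGGER check_postcode BEFORE INSERT ON Branch
WHEN length(NEW.postcode) <> 4
BEGIN
    SELECT RAISE(ABORT, 'invalid postcode');
END""")

conn.execute("INSERT INTO Branch VALUES ('B005', '0181')")  # satisfies the rule
try:
    conn.execute("INSERT INTO Branch VALUES ('B007', '12')")
    rejected = False
except sqlite3.DatabaseError:  # raised as an IntegrityError in practice
    rejected = True
print(rejected)  # → True
```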

1.4 Integrity enhancement feature
We will be considering five types of integrity constraints, as identified by
Connolly and Begg (2015: Section 7.2):
 Required data
 Domain constraints
 Entity integrity
 Referential integrity
 General constraints

These constraints can be defined in the CREATE and ALTER TABLE statements.

1.4.1 Required data
Some columns must contain a valid value; they are not allowed to contain
nulls. A null is distinct from blank or zero, and is used to represent data that is
either not available, missing or not applicable. The ISO standard provides the
NOT NULL column specifier in the CREATE and ALTER TABLE statements to
provide this type of constraint. When NOT NULL is specified, the system rejects
any attempt to insert a null in the column. If NULL is specified, the system
accepts nulls. The ISO default is NULL. For example, to specify that the column
position of the Staff table cannot be null, we define the column as:


position VARCHAR(10) NOT NULL
(Connolly & Begg, 2015:240)
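The effect of that column definition can be sketched with a short script. SQLite (via Python's sqlite3 module) stands in for MySQL here; the NOT NULL behaviour shown is standard SQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# position carries the NOT NULL specifier, as in the Staff example above
conn.execute("CREATE TABLE Staff (staffNo TEXT PRIMARY KEY, position VARCHAR(10) NOT NULL)")

conn.execute("INSERT INTO Staff VALUES ('SG37', 'Assistant')")  # accepted

try:
    conn.execute("INSERT INTO Staff VALUES ('SG14', NULL)")  # null position
    rejected = False
except sqlite3.IntegrityError:
    rejected = True  # required data: the system rejects the null

count = conn.execute("SELECT COUNT(*) FROM Staff").fetchone()[0]
print(rejected, count)  # → True 1
```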

1.4.2 Domain constraints
Every column has a domain, which is a set of legal values. For example, a gender domain has male and female as legal values. The ISO standard provides two mechanisms for specifying domains in CREATE and ALTER TABLE statements. The first is the CHECK clause, which allows a constraint to be defined on a column or the entire table. The format of the CHECK clause is:

CHECK (searchCondition)

However, the ISO standard allows domains to be defined more explicitly using
the CREATE DOMAIN statement:

CREATE DOMAIN DomainName [AS] dataType
[DEFAULT defaultOption]
[CHECK (searchCondition)]

A domain is given a name, DomainName, a data type, an optional default value, and an optional CHECK constraint. Thus, we could define a domain sex as:

CREATE DOMAIN SexType AS CHAR
DEFAULT 'M'
CHECK (VALUE IN ('M', 'F'));

This definition creates a domain SexType that consists of a single character with a value of either 'M' or 'F'. When defining sex, we can now use the domain name SexType in place of the data type CHAR:

sex SexType NOT NULL

searchCondition can involve a table lookup:

CREATE DOMAIN BranchNo AS CHAR(4)
CHECK (VALUE IN (SELECT branchNo FROM Branch));

Domains can be removed using DROP DOMAIN:
DROP DOMAIN DomainName [RESTRICT | CASCADE]
(Connolly & Begg, 2015:240–241)
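The SexType domain can be approximated in a runnable sketch. SQLite (used here via Python's sqlite3 module) does not support CREATE DOMAIN, so the same effect is expressed as an inline CHECK constraint with a DEFAULT, which is also how MySQL versions without domain support typically handle it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Inline CHECK + DEFAULT standing in for the SexType domain defined above
conn.execute("""CREATE TABLE Staff (
    staffNo TEXT PRIMARY KEY,
    sex CHAR(1) DEFAULT 'M' CHECK (sex IN ('M', 'F')))""")

conn.execute("INSERT INTO Staff (staffNo, sex) VALUES ('SG37', 'F')")  # in domain
conn.execute("INSERT INTO Staff (staffNo) VALUES ('SL21')")  # default applies

try:
    conn.execute("INSERT INTO Staff (staffNo, sex) VALUES ('SG14', 'X')")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True  # 'X' lies outside the domain of legal values

default = conn.execute("SELECT sex FROM Staff WHERE staffNo = 'SL21'").fetchone()[0]
print(rejected, default)  # → True M
```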

CTI Education Group ©


Unit 1: Principles, functions and applications of a DDMS Page 29

1.4.3 Entity integrity
The primary key (PK) of a table must contain a unique, non-null value for each row. The ISO standard supports entity integrity with the PRIMARY KEY clause in the CREATE and ALTER TABLE statements:

PRIMARY KEY(staffNo)
PRIMARY KEY(clientNo, propertyNo)

There can only be one PRIMARY KEY clause per table. However, it is still
possible to ensure uniqueness for alternate keys using UNIQUE:

UNIQUE(telNo)
(Connolly & Begg, 2015:241–242)
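The composite PRIMARY KEY(clientNo, propertyNo) example can be exercised as follows, again using SQLite via Python's sqlite3 module as an illustrative stand-in for MySQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Composite primary key over (clientNo, propertyNo), as in the Viewing relation
conn.execute("""CREATE TABLE Viewing (
    clientNo TEXT NOT NULL,
    propertyNo TEXT NOT NULL,
    dateView TEXT,
    PRIMARY KEY (clientNo, propertyNo))""")

conn.execute("INSERT INTO Viewing VALUES ('CR76', 'PG4', '2015-04-20')")
try:
    # the same (clientNo, propertyNo) pair again violates the composite key
    conn.execute("INSERT INTO Viewing VALUES ('CR76', 'PG4', '2015-05-01')")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True
print(rejected)  # → True
```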

1.4.4 Referential integrity
A foreign key (FK) is a column or set of columns that links each row in a child table containing an FK to the row of a parent table containing a matching PK. Referential integrity means that, if the FK contains a value, that value must refer to an existing row in the parent table. The ISO standard supports the definition of FKs with the FOREIGN KEY clause in the CREATE and ALTER TABLE statements:

FOREIGN KEY (branchNo) REFERENCES Branch
(Connolly & Begg, 2015:242–243)

Any INSERT or UPDATE attempting to create an FK value in a child table without a matching candidate key (CK) value in the parent is rejected. The action taken on an attempt to update or delete a CK value in a parent table that has matching rows in a child table depends on the referential action specified using the ON UPDATE and ON DELETE sub-clauses:
 CASCADE: Delete the row from the parent and delete the matching rows in the child, and so on, in a cascading manner
 SET NULL: Delete the row from the parent and set the FK column(s) in the child to NULL. Only valid if the FK columns are not declared as NOT NULL
 SET DEFAULT: Delete the row from the parent and set each component of the FK in the child to the specified default. Only valid if a DEFAULT is specified for the FK columns
 NO ACTION: Reject the delete from the parent. This is the default
(Connolly & Begg, 2015:242–243)
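The CASCADE referential action can be demonstrated with a short sketch (SQLite via Python's sqlite3 module, foreign keys switched on; the ON DELETE CASCADE syntax is the same in MySQL).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")

conn.execute("CREATE TABLE Branch (branchNo TEXT PRIMARY KEY)")
conn.execute("""CREATE TABLE Staff (
    staffNo TEXT PRIMARY KEY,
    branchNo TEXT REFERENCES Branch(branchNo) ON DELETE CASCADE)""")

conn.execute("INSERT INTO Branch VALUES ('B005')")
conn.execute("INSERT INTO Staff VALUES ('SG37', 'B005')")

# Deleting the parent row cascades the delete to the matching child rows
conn.execute("DELETE FROM Branch WHERE branchNo = 'B005'")
remaining = conn.execute("SELECT COUNT(*) FROM Staff").fetchone()[0]
print(remaining)  # → 0
```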

1.4.5 General constraints
General constraints can use the CHECK and UNIQUE clauses in CREATE and ALTER TABLE statements. In addition, similar to the CHECK clause, the ISO standard provides the CREATE ASSERTION statement:

CREATE ASSERTION AssertionName
CHECK (searchCondition)
(Connolly & Begg, 2015:243)


1.5 Views, triggers and stored procedures
1.5.1 Views
A view is a mechanism that allows a user to retrieve specific information from a database. A user can selectively view a subset of the information in the database in a specific format.

Views can be used to enforce certain business rules, such as filtering information so that it is relevant only to specific users. Information that is classified as confidential can be made available to authorised users through the use of views.

Its benefits, according to Connolly and Begg (2015: Section 4.4), include the following:
 Views provide a level of security
 Views provide a mechanism to customise the appearance of the database
 Views can present a consistent, unchanging picture of the structure of the database

The figures below illustrate the creation of a view in MySQL.

Figure 12 – Creation of View
Source: Nyasha Magutsa


Figure 13 – Creation of View
Source: Nyasha Magutsa

Useful website
For additional resources on Views, access the following link.
http://www.mysqltutorial.org/mysql-views-tutorial.aspx
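The security benefit of a view can be sketched in code. The example below (SQLite via Python's sqlite3 module; table, column and view names are invented for the illustration) hides a confidential salary column behind a view.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Staff (staffNo TEXT, fName TEXT, salary REAL)")
conn.executemany("INSERT INTO Staff VALUES (?, ?, ?)",
                 [("SG37", "Ann", 12000), ("SL21", "John", 30000)])

# The view exposes only the non-confidential columns: a simple security measure
conn.execute("CREATE VIEW StaffPublic AS SELECT staffNo, fName FROM Staff")

rows = conn.execute("SELECT * FROM StaffPublic ORDER BY staffNo").fetchall()
print(rows)  # salary is not visible through the view
```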

1.5.2 Triggers
A trigger can enforce specified business rules in response to certain events occurring on a table.

It defines an action that the database should take when a certain event occurs.
It can be used to enforce referential integrity constraints or audit data
changes.

Triggers are applicable to the following events: INSERT, UPDATE and DELETE. For each of these events, a trigger can implement a course of action either BEFORE or AFTER the event.

Advantages of triggers include:
 Eliminating redundant code
 Simplifying modifications
 Increased security
 Improved integrity

Disadvantages of triggers include:
 Performance overheads


 Cascading effects
 Cannot be scheduled
 Less portable

The figures below illustrate the creation of a trigger in MySQL. First, select the timing/event for which the trigger will be created.

Figure 14 – Select a Timing/Event
Source: Nyasha Magutsa


Figure 15 – Create the Trigger using SQL code
Source: Nyasha Magutsa

Useful website
For additional resources on Triggers, access the following link.
http://www.mysqltutorial.org/mysql-triggers.aspx
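An AFTER UPDATE trigger that audits data changes can be sketched as follows. The example uses SQLite via Python's sqlite3 module (table names are invented for the illustration); MySQL's trigger syntax is close, but wraps the body in BEGIN ... END with a DELIMITER change.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Staff (staffNo TEXT PRIMARY KEY, salary REAL)")
conn.execute("CREATE TABLE Audit (staffNo TEXT, oldSalary REAL, newSalary REAL)")

# AFTER UPDATE trigger: record every salary change in the Audit table
conn.execute("""
CREATE TRIGGER audit_salary AFTER UPDATE OF salary ON Staff
BEGIN
    INSERT INTO Audit VALUES (OLD.staffNo, OLD.salary, NEW.salary);
END""")

conn.execute("INSERT INTO Staff VALUES ('SG37', 12000)")
conn.execute("UPDATE Staff SET salary = 13000 WHERE staffNo = 'SG37'")

audit = conn.execute("SELECT * FROM Audit").fetchall()
print(audit)  # → [('SG37', 12000.0, 13000.0)]
```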

1.5.3 Stored procedures
A stored procedure is a named segment of code that can accept parameters as input and can return output. Stored procedures usually contain code that is frequently used, and hence offer code reusability.

Advantages of stored procedures include:
 Increased performance
 Fewer instructions sent to the database, as all commands are bundled under the single name of the stored procedure
 Reusable code

Disadvantages of stored procedures include:
 Many stored procedures being executed concurrently can increase memory
use substantially.
 Construction of complex business logic can be difficult.
 It is not easy to construct and debug stored procedures.


The figures below illustrate the creation of a stored procedure in MySQL.

Figure 16 – Create stored procedure
Source: Nyasha Magutsa

Highlighted below are the input and output parameters of the stored procedure.

Figure 17 – Write stored procedure SQL code
Source: Nyasha Magutsa


Useful website
For additional resources on Stored Procedures, access the
following link.
http://www.mysqltutorial.org/mysql-stored-procedure-tutorial.aspx

Concluding remarks
In this unit we discussed the relational integrity rules, such as entity and
referential integrity rules. Also a practical insight on the implementation of
these concepts is highlighted.

1.6 Self-assessment

Test your knowledge

1. Use MySQL to create the DreamHome database. Use the relational schema
of the DreamHome case study (Connolly & Begg, 2015) specified in Section
17.1 – Figure 17.3.

In your database, make sure that the following are included:
1.1 Primary and foreign keys
1.2 Business rules
a. General constraints
b. Domain constraints
1.3 Views
1.4 Triggers
1.5 Stored procedures


Unit 2: Database security, concurrency control and recovery

Unit 2 is aligned with the following learning outcomes and assessment criteria:

Learning outcomes:
LO2: Compare and contrast database recovery, concurrency
control, security and data integrity measures for centralised
and distributed databases

Assessment criteria:
AC2.1: Discuss the scope of database security; compare and
contrast the types of threat that may affect a distributed
database system
AC2.2: Compare and contrast a range of computer-based controls
that are available as countermeasures to such threats
AC2.3: Compare and contrast security measures associated with
database systems and the web
AC2.4: Compare and contrast concurrency controls and examine
the protocols that can be used to prevent conflict
AC2.5: Compare and contrast database recovery options and
examine the techniques that can be used to ensure a
distributed database remains in a consistent state in the
presence of failures

Learning objectives
After studying this unit, you should be able to:
 Discuss the scope of database security.
 Identify computer-based threats and their countermeasures.
 Identify the security measures associated with DBMS and the Web.
 Distinguish concurrent controls and protocols that prevent conflict.
 Distinguish database recovery options that can be used in a failure state.

Introduction
This unit considers database security and recovery. Security considers both the
DBMS and its environment. It illustrates security provision with MySQL. The
unit also examines the security problems that can arise in a Web environment
and presents some approaches to overcoming them.


In this unit, you will learn:
 The scope of database security
 Why database security is a serious concern for an organisation
 The type of threats that can affect a database system
 How to protect a computer system using computer-based controls
 The security measures provided by MySQL
 Approaches for securing a DBMS on the Web
 Concurrency control
 Deadlock and how it can be resolved
 How timestamping can ensure serialisability
 Optimistic concurrency control
 Recovery control
 Some causes of database failure
 The purpose of a transaction log file
 The purpose of checkpointing
 How to recover following a database failure

2.1 Database security
Data is a valuable resource that must be strictly controlled and managed, as
with any corporate resource. Part or all of the corporate data may have
strategic importance and, therefore, needs to be kept secure and confidential.
Database security refers to the mechanisms that protect the database against
intentional or accidental threats. Security considerations do not only apply to
the data held in a database. Breaches of security may affect other parts of the
system, which may in turn affect the database (Connolly & Begg, 2015:607; Castano et al., 1995).

Connolly and Begg (2015:608) and Castano et al. (1995) consider database security in relation to the following situations:
 Theft and fraud
 Loss of confidentiality (secrecy)
 Loss of privacy
 Loss of integrity
 Loss of availability

2.1.1 Threat
This refers to any situation or event, whether intentional or unintentional, that
will adversely affect a system and consequently an organisation (Connolly &
Begg, 2015:609).


Figure 18 – Summary of threats to computer systems
Source: Connolly & Begg (2015:611)

2.2 Countermeasures: computer-based controls
According to Connolly and Begg (2015: Section 20.2), countermeasures to threats on computer systems range from physical controls to administrative procedures, and include:
 Authorisation
 Access controls
 Views
 Backup and recovery
 Integrity
 Encryption
 RAID technology


2.2.1 Authorisation
Authorisation is the granting of a right or privilege that enables a subject to have legitimate access to a system or a system's object. A related mechanism, authentication, determines whether a user is who he or she claims to be (Connolly & Begg, 2015:612).

2.2.2 Access control
Access control is based on the granting and revoking of privileges. A privilege allows a user to create or access (that is, read, write or modify) some database object (such as a relation, view or index) or to run certain DBMS utilities. Privileges are granted to users to accomplish the tasks required for their jobs. Most DBMSs provide an approach called discretionary access control (DAC). The SQL standard supports DAC through the GRANT and REVOKE commands. The GRANT command gives privileges to users, and the REVOKE command takes away privileges (Connolly & Begg, 2015: Section 20.2.2).

DAC, while effective, has certain weaknesses. In particular, an unauthorised user can trick an authorised user into disclosing sensitive data. An additional approach is therefore required, called mandatory access control (MAC). MAC, which is based on system-wide policies, cannot be changed by individual users. Each database object is assigned a security class and each user is assigned a clearance for a security class; in addition, rules are imposed on the reading and writing of database objects by users (Connolly & Begg, 2015:614).

MAC determines whether a user can read or write an object based on rules that involve the security level of the object and the clearance of the user. These rules ensure that sensitive data can never be 'passed on' to another user without the necessary clearance. The SQL standard does not include support for MAC (Connolly & Begg, 2015:614).

2.2.3 View
A view is the dynamic result of one or more relational operations operating on
the base relations to produce another relation. A view is a virtual relation that
does not actually exist in the database, but is produced upon request by a
particular user, at the time of request (Connolly & Begg, 2015:616).

2.2.4 Backup and recovery


This is the process of periodically taking a copy of the database and log file
(and possibly programs) to offline storage media. A DBMS should provide
backup facilities to assist with recovery of a database following failure
(Connolly & Begg, 2015:616–617).

Backup and recovery can be handled using two options: the graphical user
interface and the command line.


Graphical user interface backup

The Data Export and Data Import/Restore utilities are used to back up or
restore a MySQL database.

Figure 19 – Creating a backup/restoring a backup file


Source: Nyasha Magutsa

Useful website
For additional resources on graphical user interface backup and
restoration, access the following link:
https://www.beastnode.com/portal/knowledgebase/48/MySQL-
Workbench-Backup-and-Import-your-MySQL-Database.html

Command line backup


Below are the commands to back up and restore a MySQL database using the
mysqldump utility:

backup:
mysqldump -u root -p[root_password] [database_name] > dumpfilename.sql

restore:
mysql -u root -p[root_password] [database_name] < dumpfilename.sql

Useful website
For additional resources on Command Line backup or
restoration, access the following links.
http://dev.mysql.com/doc/refman/5.5/en/mysqldump.html
http://blog.winhost.com/using-mysqldump-to-backup-and-
restore-your-mysql-databasetables/


2.2.5 Journaling
This is the process of keeping and maintaining a log file (or journal) of all
changes made to a database to enable effective recovery in the event of
failure. A DBMS should provide logging facilities, sometimes referred to as
journaling, which keep track of the current state of transactions and database
changes, to provide support for recovery procedures. The advantage of
journaling is that in the event of a failure, the database can be recovered to its
known consistent state using a backup copy of the database and the
information contained in the log file (Connolly & Begg, 2015:617).

2.2.6 Integrity
This prevents data from becoming invalid, and hence giving misleading or
incorrect results (Connolly & Begg, 2015:617).

2.2.7 Encryption
This refers to the encoding of data by a special algorithm that renders the
data unreadable by any program without the decryption key. Encryption also
protects data transmitted over communication lines. There are a number of
techniques for encoding data to conceal the information; some are termed
'irreversible' and others 'reversible' (Connolly & Begg, 2015:617–618).

Irreversible techniques, as the name implies, do not permit the original data
to be known; however, the data can be used to obtain valid statistical
information. Reversible techniques are more commonly used. Transmitting data
securely over insecure networks requires the use of a cryptosystem, which
includes:
• An encryption key to encrypt the data (plaintext)
• An encryption algorithm that, with the encryption key, transforms the
plaintext into ciphertext
• A decryption key to decrypt the ciphertext
• A decryption algorithm that, together with the decryption key, transforms
the ciphertext back into plaintext

One technique, called symmetric encryption, uses the same key for both
encryption and decryption, and relies on safe communication lines for
exchanging the key. Another type of cryptosystem uses different keys for
encryption and decryption, and is referred to as asymmetric encryption
(Connolly & Begg, 2015:617–618).
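A symmetric cryptosystem can be illustrated with a deliberately simple toy cipher: the same key both turns plaintext into ciphertext and recovers it. The XOR cipher below is for illustration only and offers no real security; production systems use algorithms such as AES.

```python
# Toy symmetric cryptosystem: one shared key, one reversible algorithm.
# Applying the XOR transformation twice with the same key restores the
# plaintext. NOT secure -- purely an illustration of "reversible" encryption.

def xor_cipher(data: bytes, key: bytes) -> bytes:
    # XOR each byte with the (repeated) key; the operation is its own inverse
    return bytes(b ^ key[i % len(key)] for i, b in enumerate(data))

plaintext = b"balance=100"
key = b"secret"
ciphertext = xor_cipher(plaintext, key)   # encryption
recovered = xor_cipher(ciphertext, key)   # decryption with the SAME key
print(recovered == plaintext)   # True
print(ciphertext != plaintext)  # True: the stored/transmitted form is concealed
```

An asymmetric cryptosystem differs only in that the decryption key is not the encryption key, so the encryption key can be published.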

2.2.8 Redundant Array of Independent Disks (RAID)


The hardware that the DBMS is running on must be fault-tolerant, meaning that
the DBMS should continue to operate even if one of the hardware components
fails. This requires redundant components that can be seamlessly integrated
into the working system whenever one or more components fail (Connolly &
Begg, 2015: Section 20.2.7).


The main hardware components that should be fault-tolerant include disk


drives, disk controllers, the CPU, power supplies and cooling fans. Disk drives
are the most vulnerable components with the shortest times between failures
of any of the hardware components (Connolly & Begg, 2015: Section 20.2.7).

One solution is to provide a large disk array comprising an arrangement of


several independent disks that are organised to improve reliability and, at the
same time, increase performance (Connolly & Begg, 2015: Section 20.2.7).

Performance is increased through data striping: the data is segmented into
equal-sized partitions (the striping unit), which are transparently
distributed across multiple disks. Reliability is improved through storing
redundant information across the disks using a parity scheme or an
error-correcting scheme (Connolly & Begg, 2015: Section 20.2.7; Chen &
Patterson, 1990).

According to Connolly and Begg (2015: Section 20.2.7) and Chen, Lee, Gibson,
Katz and Patterson (1994), there are a number of different disk
configurations, called RAID levels:
• RAID 0: Non-redundant
• RAID 1: Mirrored
• RAID 0+1: Non-redundant and mirrored
• RAID 2: Memory-style error-correcting codes
• RAID 3: Bit-interleaved parity
• RAID 4: Block-interleaved parity
• RAID 5: Block-interleaved distributed parity
• RAID 6: P+Q redundancy
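The parity idea behind the block-interleaved levels can be shown in a few lines: the parity block is the XOR of the data blocks, so any single lost block can be rebuilt from the survivors. This is a simplified sketch in the spirit of RAID 4/5, not a real storage implementation.

```python
# Sketch of block-interleaved parity: data blocks are striped across disks
# and a parity block (the XOR of all data blocks) lets us reconstruct any
# single failed block. Simplified illustration, not real RAID firmware.

def parity(blocks):
    result = blocks[0]
    for b in blocks[1:]:
        result = bytes(x ^ y for x, y in zip(result, b))
    return result

stripe = [b"AAAA", b"BBBB", b"CCCC"]   # data blocks on three disks
p = parity(stripe)                     # parity block on a fourth disk

# Disk holding stripe[1] fails: rebuild it from the survivors plus parity
reconstructed = parity([stripe[0], stripe[2], p])
print(reconstructed == stripe[1])  # True
```

RAID 5 differs only in rotating which disk holds the parity block for each stripe, avoiding a dedicated parity-disk bottleneck.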

2.3 Security in MySQL DBMS


MySQL provides methods for securing a database. These include:
• Setting a password for opening a database (system security)
• User-level security, which can be used to limit the parts of the database
that a user can read or update (data security)

2.3.1 Securing a MySQL database using a password


MySQL can be secured via the use of a username and password. Below is an
illustration of how it is set up.


Figure 20 – Securing a MySQL database using a password


Source: Nyasha Magutsa

2.3.2 Administrative roles


The defined users are assigned different roles. Below is an illustration of how it
is set up.

Figure 21 – Administrative roles


Source: Nyasha Magutsa

2.3.3 Global privileges


A selected role has associated global privileges; however, more global
privileges can be added to a role. Below is an illustration of how it is set
up.


Figure 22 – Global privileges


Source: Nyasha Magutsa

2.3.4 Setting the Insert, Select and Update privileges


A user profile can have associated schema privileges. Below is an illustration of
how it is set up.

Figure 23 – Setting the Insert, Select and Update privileges


Source: Nyasha Magutsa


2.3.5 Log On dialog box


Below is an illustration of the log on dialog box.

Figure 24 – Log on dialog box


Source: Nyasha Magutsa

2.4 DBMS and Web security


Internet communication relies on TCP/IP as the underlying protocol. However,
TCP/IP and HTTP were not designed with security in mind. Without special
software, all Internet traffic travels 'in the clear' and anyone who monitors
traffic can read it (Connolly & Begg, 2015:627).

Connolly and Begg (2015:627) state that while transmitting information over
the Internet, one must ensure that:
• It is inaccessible to anyone but the sender and receiver (privacy)
• It is not changed during transmission (integrity)
• The receiver can be sure that it came from the sender (authenticity)
• The sender can be sure that the receiver is genuine (non-fabrication)
• The sender cannot deny that he or she sent it (non-repudiation)

Some measures or approaches required for securing DBMSs on the Web, as
recommended by Connolly and Begg (2015: Section 20.5), include the following:
• Proxy servers
• Firewalls
• Message digest algorithms and digital signatures
• Digital certificates
• Kerberos
• Secure Sockets Layer (SSL) and Secure HTTP (S-HTTP)
• Secure Electronic Transactions (SET) and Secure Transaction Technology
(SST)
• Java security
• ActiveX security


2.4.1 How Secure Electronic Transactions (SET) works

Figure 25 – How SET works


Source: Connolly & Begg (2015:632)

2.5 Concurrency control


This is the process of managing simultaneous operations on the database
without having them interfere with one another. It prevents interference when
two or more users are accessing a database simultaneously and at least one is
updating data. Although two transactions may be correct in themselves,
interleaving of operations may produce an incorrect result (Connolly & Begg,
2015: Section 22.2; Barghouti & Kaiser, 1991).

2.5.1 Potential problems caused by concurrency


According to Connolly and Begg (2015: Section 22.2) and Barghouti and Kaiser
(1991), some potential problems caused by concurrency are:
• Lost update problem
• Uncommitted dependency problem
• Inconsistent analysis problem


2.5.1.1 The lost-update problem


A successfully completed update is overridden by another user. For example:
• T1 is withdrawing £10 from an account with balance balx, initially £100.
• T2 is depositing £100 into the same account.
• Serially, the final balance would be £190.

Figure 26 – The lost-update problem pseudocode


Source: Connolly & Begg (2015:673)

• Loss of T2's update is avoided by preventing T1 from reading balx until
after T2's update has completed.
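The interleaving above can be simulated in a few lines: both transactions read balx (initially 100), T2 writes and commits its deposit, and T1 then writes a value based on its stale read, so T2's update is lost. This is an illustrative simulation, not DBMS code.

```python
# Simulation of the lost-update interleaving: T2's committed deposit is
# overwritten by T1's write, which was computed from a stale read.
# Serially the final balance would be 190.

bal_x = 100

t1_read = bal_x          # T1: read(bal_x) -> 100
t2_read = bal_x          # T2: read(bal_x) -> 100 (interleaved)
bal_x = t2_read + 100    # T2: write(bal_x) = 200, then commit
bal_x = t1_read - 10     # T1: write(bal_x) = 90 -- overwrites T2's update

print(bal_x)  # 90: T2's deposit of 100 has been lost
```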

2.5.1.2 The uncommitted dependency problem


This occurs when one transaction can see the intermediate results of another
transaction before it has committed.
• T4 updates balx to £200 but then aborts, so balx should revert to its
original value of £100.
• T3 has read the new value of balx (£200) and uses this value as the basis
of a £10 reduction, giving a new balance of £190 instead of £90.

Figure 27 – The uncommitted dependency problem pseudocode


Source: Connolly & Begg (2015:674)

• The problem is avoided by preventing T3 from reading balx until after T4
commits or aborts.


2.5.1.3 The inconsistent analysis problem


This occurs when a transaction reads several values, but a second transaction
updates some of them during the execution of the first. (The related 'dirty
read' anomaly is the uncommitted dependency problem above; reading an item
twice and obtaining different values is known as an 'unrepeatable read'.)
• T6 is totalling the balances of account x (£100), account y (£50) and
account z (£25).
• In the meantime, T5 has transferred £10 from balx to balz, so T6 now
obtains the wrong result (£10 too high).

Figure 28 – The inconsistent analysis problem pseudocode


Source: Connolly & Begg (2015:674)

• The problem is avoided by preventing T6 from reading balx and balz until
after T5 has completed its updates.

2.5.2 Concurrency control techniques


According to Connolly and Begg (2015:682), there are two basic concurrency
control techniques:
• Locking
• Timestamping

Both are conservative approaches: they delay transactions in case they
conflict with other transactions. Optimistic methods, by contrast, assume
that conflict is rare and only check for conflicts at commit.

2.5.2.1 Locking

Here, a transaction uses locks to deny access to other transactions and so
prevent incorrect updates. Locking is the most widely used approach to ensure
serialisability (Connolly & Begg, 2015: Section 22.2.3).


Generally, a transaction must claim a shared (read) or exclusive (write) lock
on a data item before a read or write. A lock prevents another transaction
from modifying the item, or even reading it in the case of an exclusive
(write) lock (Connolly & Begg, 2015: Section 22.2.3).

Connolly and Begg (2015: Section 22.2.3) outline some basic rules of locking,
as follows:
• If a transaction has a shared lock on an item, it can read but not update
the item.
• If a transaction has an exclusive lock on an item, it can both read and
update the item.
• Reads cannot conflict, so more than one transaction can hold shared locks
simultaneously on the same item.
• An exclusive lock gives a transaction exclusive access to that item.
• Some systems allow a transaction to upgrade a shared lock to an exclusive
lock, or downgrade an exclusive lock to a shared lock.

Two-phase locking (2PL)


A transaction follows the 2PL protocol if all of its locking operations
precede its first unlock operation (Connolly & Begg, 2015: Section 22.2.3).

According to Connolly and Begg (2015: Section 22.2.3), the two phases of a
transaction are:
• Growing phase: the transaction acquires locks but cannot release any.
• Shrinking phase: the transaction releases locks but cannot acquire any new
ones.
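The shared/exclusive compatibility rules and the two-phase discipline can be sketched with a minimal in-memory lock table. This is an illustrative model, not a real DBMS lock manager; the class and names are invented, and a real system would block waiting transactions rather than simply refuse them.

```python
# Minimal sketch of shared/exclusive locking under a two-phase discipline:
# releasing any lock starts the shrinking phase, after which further lock
# requests are refused. Conflicting requests return False (caller must
# wait or abort); a real lock manager would queue them.

class Transaction2PL:
    def __init__(self, lock_table):
        self.lock_table = lock_table   # item -> (mode, set of holders)
        self.shrinking = False

    def lock(self, item, mode):        # mode: "S" (shared) or "X" (exclusive)
        if self.shrinking:
            raise RuntimeError("2PL violated: lock requested after an unlock")
        entry = self.lock_table.get(item)
        if entry is None:
            self.lock_table[item] = (mode, {self})
            return True
        if entry[0] == "S" and mode == "S":
            entry[1].add(self)         # shared locks are compatible
            return True
        return False                   # conflict with an existing lock

    def unlock(self, item):
        self.shrinking = True          # the shrinking phase begins
        entry = self.lock_table.get(item)
        if entry:
            entry[1].discard(self)
            if not entry[1]:
                del self.lock_table[item]

locks = {}
t1, t2 = Transaction2PL(locks), Transaction2PL(locks)
print(t1.lock("bal_x", "S"))   # True: shared lock granted
print(t2.lock("bal_x", "S"))   # True: shared locks do not conflict
print(t2.lock("bal_x", "X"))   # False: exclusive conflicts with t1's shared lock
t1.unlock("bal_x")             # t1 enters its shrinking phase
```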

Figure 29 – Preventing the lost update problem using 2PL pseudocode


Source: Connolly & Begg (2015:685)


Figure 30 – Preventing the uncommitted dependency problem using 2PL pseudocode


Source: Connolly & Begg (2015:686)

Figure 31 – Preventing the inconsistent analysis problem using 2PL pseudocode


Source: Connolly & Begg (2015:686)


Cascading rollback
If every transaction in a schedule follows 2PL, the schedule is serialisable.
However, problems can occur with the interpretation of when the locks can be
released (Connolly & Begg, 2015: Section 22.2.3).

Figure 32 – Cascading rollback using 2PL pseudocode


Source: Connolly & Begg (2015:687)

All three transactions conform to 2PL. T14 aborts. Since T15 is dependent on
T14, T15 must also be rolled back; since T16 is dependent on T15, it, too,
must be rolled back. This is called cascading rollback. To prevent it under
2PL, the release of all locks is left until the end of the transaction
(Connolly & Begg, 2015: Section 22.2.3).

Concurrency control with index structures


Concurrency control with index structures can be managed by treating each
page of the index as a data item and applying 2PL. However, as indexes are
frequently accessed, particularly at the higher levels, this may lead to high
lock contention (Connolly & Begg, 2015: Section 22.2.3).


When a new index value (a key and pointer) is being inserted into a leaf node
that is not full, the insertion will not cause changes to higher-level nodes
(Connolly & Begg, 2015: Section 22.2.3).

This suggests that we only have to exclusively lock the leaf node in such a
case, and only exclusively lock higher-level nodes if the node is full and has to
be split (Connolly & Begg, 2015: Section 22.2.3).

Thus, Connolly and Begg (2015: Section 22.2.3) derive the following locking
strategy:
• For searches, obtain shared locks on nodes, starting at the root and
proceeding downwards along the required path. Release the lock on a node
once a lock has been obtained on its child node.
• For insertions, the conservative approach would be to obtain exclusive
locks on all nodes as we descend the tree to the leaf node to be modified.
• For a more optimistic approach, obtain shared locks on all nodes as we
descend to the leaf node to be modified, where we obtain an exclusive lock.
If the leaf node has to split, upgrade the shared lock on the parent to an
exclusive lock. If this node also has to split, continue to upgrade locks at
the next higher level.

Deadlock
This is an impasse that may result when two (or more) transactions are each
waiting for locks held by the other to be released (Connolly & Begg, 2015:
Section 22.2.4).

Figure 33 – Deadlock between two transactions pseudocode


Source: Connolly & Begg (2015:689)


The only way to break a deadlock is to abort one or more of the transactions.
A deadlock should be transparent to the user, so the DBMS should restart the
transaction(s) (Connolly & Begg, 2015: Section 22.2.4).

The three general techniques for handling deadlock, as stipulated by Connolly
and Begg (2015: Section 22.2.4), are:
• Timeouts
• Deadlock prevention
• Deadlock detection and recovery

• Timeouts
A transaction that requests a lock will wait for only a system-defined period
of time. If the lock has not been granted within this period, the lock
request times out. In this case, the DBMS assumes that the transaction may be
deadlocked, even though it may not be, and it aborts and automatically
restarts the transaction (Connolly & Begg, 2015: Section 22.2.4).

• Deadlock prevention
According to Connolly and Begg (2015: Section 22.2.4), in this technique the
DBMS looks ahead to see whether a transaction would cause deadlock, and never
allows a deadlock to occur. Transactions can be ordered using transaction
timestamps:
o Wait-Die: only an older transaction can wait for a younger one;
otherwise the transaction is aborted (dies) and restarted with the same
timestamp.
o Wound-Wait: only a younger transaction can wait for an older one. If an
older transaction requests a lock held by a younger one, the younger one
is aborted (wounded).
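The two prevention rules reduce to a simple decision given the requester's and the holder's timestamps (a smaller timestamp means an older transaction). The sketch below is illustrative; the function names and return strings are invented for the example.

```python
# Sketch of the two timestamp-based prevention rules. Smaller timestamp =
# older transaction. Given a lock requester and the current lock holder,
# each rule decides who waits and who is aborted.

def wait_die(requester_ts, holder_ts):
    # Only an older transaction may wait for a younger one; a younger
    # requester is aborted (dies) and restarted with the SAME timestamp.
    return "wait" if requester_ts < holder_ts else "abort requester"

def wound_wait(requester_ts, holder_ts):
    # Only a younger transaction may wait for an older one; an older
    # requester preempts (wounds) the younger holder.
    return "abort holder" if requester_ts < holder_ts else "wait"

print(wait_die(1, 5))    # 'wait'            (older requester waits)
print(wait_die(5, 1))    # 'abort requester' (younger requester dies)
print(wound_wait(1, 5))  # 'abort holder'    (older requester wounds holder)
print(wound_wait(5, 1))  # 'wait'            (younger requester waits)
```

Note that under both rules it is always the younger transaction that is aborted, which prevents any transaction from being restarted indefinitely.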

• Deadlock detection and recovery
Here, Connolly and Begg (2015: Section 22.2.4) emphasise that the DBMS
allows a deadlock to occur, but recognises it and breaks it. Detection is
usually handled by the construction of a wait-for graph (WFG) showing
transaction dependencies:
o Create a node for each transaction.
o Create an edge Ti -> Tj if Ti is waiting to lock an item locked by Tj.

A deadlock exists if, and only if, the WFG contains a cycle. A WFG is
created at regular intervals.

The following are some issues to consider when recovering from a detected
deadlock:
o The choice of a deadlock victim
o How far to roll a transaction back
o Avoiding starvation
(Connolly & Begg, 2015: Section 22.2.4)
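The cycle test on a wait-for graph is a standard depth-first search. The sketch below is illustrative, with the graph represented as a plain adjacency dictionary; a DBMS would build the graph from its lock table.

```python
# Sketch of deadlock detection: one node per transaction, an edge
# Ti -> Tj when Ti waits for a lock held by Tj. A deadlock exists if and
# only if the wait-for graph contains a cycle, found here by DFS with a
# recursion stack.

def has_cycle(wfg):
    visited, on_stack = set(), set()

    def dfs(node):
        visited.add(node)
        on_stack.add(node)
        for nxt in wfg.get(node, []):
            if nxt in on_stack or (nxt not in visited and dfs(nxt)):
                return True
        on_stack.discard(node)
        return False

    return any(dfs(n) for n in wfg if n not in visited)

# T17 waits for T18 and T18 waits for T17: deadlock
print(has_cycle({"T17": ["T18"], "T18": ["T17"]}))  # True
# T1 waits for T2 only: no cycle, no deadlock
print(has_cycle({"T1": ["T2"], "T2": []}))          # False
```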


2.5.3 Timestamping
Transactions are ordered globally so that older transactions, those with
smaller timestamps, receive priority in the event of conflict. Conflict is
resolved by rolling back and restarting the offending transaction (Connolly &
Begg, 2015: Section 22.2.5).

A timestamp is a unique identifier, created by the DBMS, that indicates the
relative starting time of a transaction. It can be generated by using the
system clock at the time the transaction starts, or by incrementing a logical
counter every time a new transaction starts (Connolly & Begg, 2015: Section
22.2.5).

Connolly and Begg (2015: Section 22.2.5) state that a read/write proceeds
only if the last update on that data item was carried out by an older
transaction. Otherwise, the transaction requesting the read/write is
restarted and given a new timestamp. In addition, each data item carries two
timestamps:
• Read-timestamp: the timestamp of the last transaction to read the item
• Write-timestamp: the timestamp of the last transaction to write the item

2.5.3.1 Timestamping: Read (x)

Consider a transaction T with timestamp ts(T) that asks to read an item x
that has already been updated by a younger (later) transaction, that is,
ts(T) < write_timestamp(x):
• An earlier transaction is trying to read a value of an item that has been
updated by a later transaction.
• The earlier transaction is too late to read the previous, now outdated,
value, and any other values it has acquired are likely to be inconsistent
with the updated value of the data item.
• The transaction must be aborted and restarted with a new timestamp.
Otherwise, the read proceeds.
(Connolly & Begg, 2015: Section 22.2.5)

2.5.3.2 Timestamping: Write (x)

Consider a transaction T with timestamp ts(T) that asks to write an item x:
• If ts(T) < read_timestamp(x), x has already been read by a younger
transaction, which is already using the current value of the item, so it
would be an error to update it now. The system rolls back the transaction
and restarts it using a later timestamp.
• If ts(T) < write_timestamp(x), x has already been written by a younger
transaction. This occurs when a transaction is late in doing its write and
a younger transaction has already written a new value. In this case the
write can safely be ignored: the 'ignore obsolete write' rule.
• Otherwise, the operation is accepted and executed.
(Connolly & Begg, 2015: Section 22.2.5)
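The read and write checks above can be condensed into two small functions over an item's read- and write-timestamps. This is an illustrative sketch of basic timestamp ordering; the dictionary representation and return strings are invented for the example.

```python
# Sketch of basic timestamp-ordering checks. Each data item carries a
# read-timestamp and a write-timestamp; a conflicting late operation forces
# a restart, and an obsolete write is silently ignored.

def check_read(ts, item):
    if ts < item["write_ts"]:
        return "restart"      # x was updated by a younger transaction
    item["read_ts"] = max(item["read_ts"], ts)
    return "ok"

def check_write(ts, item):
    if ts < item["read_ts"]:
        return "restart"      # a younger transaction has already read x
    if ts < item["write_ts"]:
        return "ignore"       # obsolete write: safely ignored
    item["write_ts"] = ts
    return "ok"

x = {"read_ts": 0, "write_ts": 5}
print(check_read(3, x))   # 'restart': x was written at ts 5, after T started
print(check_read(7, x))   # 'ok': the read proceeds, read_ts becomes 7
print(check_write(6, x))  # 'restart': ts 6 < read_ts 7
print(check_write(8, x))  # 'ok': write_ts becomes 8
```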

2.5.4 Multiversion timestamp ordering


Versioning of data can be used to increase concurrency. The basic
timestamp-ordering protocol assumes that only one version of a data item
exists, and so only one transaction can access the data item at a time.
Multiversion timestamp ordering, in contrast, allows multiple transactions to
read and write different versions of the same data item, and ensures that
each transaction sees a consistent set of versions for all the data items it
accesses (Connolly & Begg, 2015: Section 22.2.6).

In multiversion concurrency control, each write operation creates a new
version of the data item while retaining the old version. When a transaction
attempts to read a data item, the system selects one version that ensures
serialisability. Versions can be deleted once they are no longer required
(Connolly & Begg, 2015: Section 22.2.6).
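The version-selection idea can be shown with a small sketch: each write appends a version tagged with its writer's timestamp, and a reader with timestamp ts sees the newest version written at or before ts. The class is invented for illustration and omits the protocol's abort rules.

```python
# Sketch of multiversion reads: old versions are retained, so readers never
# block writers. A reader with timestamp ts is served the newest version
# whose write-timestamp is <= ts.

class MultiversionItem:
    def __init__(self, initial):
        self.versions = [(0, initial)]      # list of (write_timestamp, value)

    def write(self, ts, value):
        self.versions.append((ts, value))   # the old version is retained
        self.versions.sort()

    def read(self, ts):
        # select the newest version with write_timestamp <= ts
        candidates = [v for wts, v in self.versions if wts <= ts]
        return candidates[-1]

bal_x = MultiversionItem(100)
bal_x.write(10, 200)
print(bal_x.read(5))    # 100: a transaction with ts 5 sees the older version
print(bal_x.read(15))   # 200: a later transaction sees the new version
```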

2.5.5 Optimistic techniques


These are based on the assumption that conflict is rare and that it is more
efficient to let transactions proceed without delays to ensure serialisability. At
commit, a check is made to determine whether a conflict has occurred. If there
is a conflict, the transaction must be rolled back and restarted. This potentially
allows greater concurrency than traditional protocols do. The three phases of
this technique are Read, Validation and Write (Connolly & Begg, 2015: Section
22.2.7).

2.5.5.1 Read phase


This phase extends from the start of the transaction until immediately before
the commit. The transaction reads values from the database and stores them in
local variables. Updates are applied to a local copy of the data (Connolly &
Begg, 2015: Section 22.2.7).

2.5.5.2 Validation phase


The Validation phase follows the Read phase. For a read-only transaction, it
checks that the data read are still current values. If there is no interference,
the transaction is committed; otherwise it is aborted and restarted. For an
update transaction, it checks that the transaction leaves the database in a
consistent state, with serialisability maintained (Connolly & Begg, 2015:
Section 22.2.7).


2.5.5.3 Write phase


This phase follows a successful Validation phase for update transactions.
Updates made to the local copy are applied to the database (Connolly & Begg,
2015: Section 22.2.7).
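The three phases can be sketched with read-set/write-set bookkeeping: reads record the version they saw, writes go to a local copy, and commit validates those versions before applying the write-set. This is an illustrative model of optimistic concurrency control, not a real DBMS; the version-numbering scheme is an assumption of the example.

```python
# Sketch of optimistic concurrency control. db maps item -> (value, version).
# Read phase: remember versions read; updates go to a local write-set only.
# Validation phase: abort if any item read has changed. Write phase: apply.

db = {"bal_x": (100, 1)}

def begin():
    return {"read_set": {}, "write_set": {}}

def read(txn, item):
    value, version = db[item]
    txn["read_set"][item] = version    # remember the version we read
    return value

def write(txn, item, value):
    txn["write_set"][item] = value     # local copy only, db is untouched

def commit(txn):
    # Validation: every item read must still be at the version we saw
    for item, version in txn["read_set"].items():
        if db[item][1] != version:
            return False               # conflict: caller must restart
    # Write phase: apply local updates and bump versions
    for item, value in txn["write_set"].items():
        db[item] = (value, db[item][1] + 1)
    return True

t1 = begin()
write(t1, "bal_x", read(t1, "bal_x") - 10)    # T1 withdraws 10
t2 = begin()
write(t2, "bal_x", read(t2, "bal_x") + 100)   # T2 deposits 100
print(commit(t2))       # True: t2 validates first and commits
print(commit(t1))       # False: t1's read is now stale, so it must restart
print(db["bal_x"][0])   # 200
```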

2.6 Database recovery


According to Connolly and Begg (2015: Section 22.3), database recovery is the
process of restoring the database to its correct state in the event of a failure.

2.6.1 The need for recovery


There are two types of storage: volatile (main memory) and non-volatile.
Volatile storage does not survive system crashes. Stable storage represents
information that has been replicated in several non-volatile storage media with
independent failure modes (Connolly & Begg, 2015: Section 22.3.1; Bernstein,
Hadzilacos & Goodman, 1987; Bernstein, Hadzilacos & Goodman, 1988).

Connolly and Begg (2015: Section 22.3.1) discuss the different types of
failure that can affect database processing. Among the causes of failure are:
• System crashes, resulting in loss of main memory
• Media failures, resulting in loss of parts of secondary storage
• Application software errors
• Natural physical disasters
• Carelessness or unintentional destruction of data or facilities
• Sabotage

2.6.2 Transaction and recovery


Transactions represent the basic unit of recovery. The recovery manager is
responsible for atomicity and durability. If failure occurs between a
transaction committing and the database buffers being flushed to secondary
storage, then, to ensure durability, the recovery manager has to redo
(rollforward) the transaction's updates (Connolly & Begg, 2015: Section
22.3.2).

If the transaction had not committed at failure time, the recovery manager has
to undo (rollback) any effects of that transaction for atomicity. Partial undo
takes place when only one transaction has to be undone. Global undo takes
place when all transactions have to be undone (Connolly & Begg, 2015:
Section 22.3.2).


Example

Figure 34 – Example of UNDO/REDO


Source: Connolly & Begg (2015:703)

The DBMS starts at time t0 but fails at time tf. Transactions T1 and T6 were
still active at the time of failure and have to be undone. In the absence of
any other information, the recovery manager has to redo T2, T3, T4 and T5,
since it cannot tell whether their committed updates reached secondary
storage.

2.6.3 Recovery facilities


Connolly and Begg (2015: Section 22.3.3) state that a DBMS should provide the
following facilities to assist with recovery:
• A backup mechanism, which makes periodic backup copies of the database
• Logging facilities, which keep track of the current state of transactions
and database changes
• A checkpoint facility, which enables updates to the database that are in
progress to be made permanent
• A recovery manager, which allows the DBMS to restore the database to a
consistent state following a failure

2.6.3.1 Log file


The log file contains information about all updates to the database: transaction
records and checkpoint records. It is often used for other purposes (for
example, auditing) (Connolly & Begg, 2015: Section 22.3.3).

According to Connolly and Begg (2015: Section 22.3.3), transaction records
contain:
• A transaction identifier
• The type of log record (transaction start, insert, update, delete, abort,
commit)
• An identifier of the data item affected by the database action (insert,
delete and update operations)
• A before-image of the data item
• An after-image of the data item
• Log management information

The log file may be duplexed or triplexed, and is sometimes split into two
separate random-access files. Because every database change is recorded in
it, the log file is a potential bottleneck and is critical in determining
overall performance (Connolly & Begg, 2015: Section 22.3.3).

MySQL creates several log files that store different information regarding
the activities occurring in the database. These log files have to be
activated, and cleared frequently, as they consume considerable space. Below
are the different types of MySQL log files.

Table 1 – Log types


Log Type Information Written to Log
Error log Problems encountered starting, running, or stopping mysqld
General query log Established client connections and statements received from clients
Binary log Statements that change data (also used for replication)
Relay log Data changes received from a replication master server
Slow query log Queries that took more than long_query_time seconds to execute
Source: https://dev.mysql.com/doc/refman/5.0/en/server-logs.html
[Accessed: 03/05/2015]

The above log files can be activated in MySQL Workbench in the Management
screen.

Figure 35 – Log File settings


Source: Nyasha Magutsa


Useful website
For more details access the following link:
https://dev.mysql.com/doc/refman/5.0/en/server-logs.html

2.6.3.2 Checkpointing
A checkpoint is the point of synchronisation between the database and the log
file. All buffers are force-written to secondary storage, and a checkpoint
record is created containing the identifiers of all active transactions. When
a failure occurs, all transactions that committed since the checkpoint are
redone, and all transactions active at the time of the crash are undone
(Connolly & Begg, 2015: Section 22.3.3).

In the previous example, with a checkpoint at time tc, the changes made by T2
and T3 have been written to secondary storage. Thus, only T4 and T5 need to
be redone, while transactions T1 and T6 are undone (Connolly & Begg, 2010).
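How the checkpoint narrows the recovery work can be sketched as a small set computation over the log's transaction history. The function and its parameter names are invented for illustration; the data mirrors the earlier UNDO/REDO example.

```python
# Sketch of checkpoint-based recovery: redo committed transactions whose
# updates may not have reached disk (those not flushed at the checkpoint);
# undo every transaction still active at the crash.

def recovery_sets(committed, active_at_crash, flushed_at_checkpoint):
    redo = [t for t in committed if t not in flushed_at_checkpoint]
    undo = list(active_at_crash)
    return redo, undo

redo, undo = recovery_sets(
    committed=["T2", "T3", "T4", "T5"],
    active_at_crash=["T1", "T6"],
    flushed_at_checkpoint=["T2", "T3"],   # written to disk at checkpoint tc
)
print(redo)  # ['T4', 'T5']: committed after the checkpoint, so redone
print(undo)  # ['T1', 'T6']: active at the crash, so undone
```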

MySQL implements checkpointing in the InnoDB storage engine, a
high-reliability and high-performance storage engine for MySQL.

Figure 36 – InnoDB options


Source: Nyasha Magutsa

Key advantages of InnoDB:
• Its design follows the ACID model, with transactions featuring commit,
rollback and crash-recovery capabilities to protect user data.
• Row-level locking (without escalation to coarser-granularity locks) and
Oracle-style consistent reads increase multi-user concurrency and
performance.
• InnoDB tables arrange your data on disk to optimise common queries based
on primary keys. Each InnoDB table has a primary key index, called the
clustered index, that organises the data to minimise I/O for primary key
lookups.
• To maintain data integrity, InnoDB also supports FOREIGN KEY
referential-integrity constraints.
• You can freely mix InnoDB tables with tables from other MySQL storage
engines, even within the same statement. For example, you can use a join
operation to combine data from InnoDB and MEMORY tables in a single query.
• InnoDB has been designed for CPU efficiency and maximum performance when
processing large data volumes.
Source: http://dev.mysql.com/doc/refman/5.0/en/innodb-storage-engine.html
[Accessed: 03/05/2015]

Useful website
For additional resources on MySQL Checkpointing, access the
following links.
https://dev.mysql.com/doc/refman/5.5/en/innodb-
checkpoints.html
https://dev.mysql.com/doc/refman/5.5/en/innodb-storage-
engine.html

2.6.4 Recovery techniques


If the database has been damaged, the last backup copy of the database must
be restored and the updates of committed transactions reapplied using the log
file. If the database is only inconsistent, the changes that caused the
inconsistency must be undone; some transactions may also need to be redone to
ensure that their updates reach secondary storage. In this case the backup
copy is not needed: the database can be restored to a consistent state using
the before- and after-images in the log file (Connolly & Begg, 2015: Section
22.3.4).

There are three main recovery techniques:
• Deferred update
• Immediate update
• Shadow paging

2.6.4.1 Deferred update


Updates are not written to the database until after a transaction has reached
its commit point. If a transaction fails before commit, it will not have modified
the database and so no undoing of changes is required. It may be necessary to
redo updates of committed transactions, as their effect may not have reached
the database (Connolly & Begg, 2015: Section 22.3.4).
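The redo-only nature of deferred update can be sketched in a few lines of Python. This is an illustrative toy, not the internals of any real DBMS, and the log record format is an assumption for the example: because no update touched the database before commit, recovery simply reapplies the after-images of committed transactions and ignores everything else.

```python
# Toy sketch of deferred-update recovery (illustrative only).
# Log records: ("start", tid), ("write", tid, item, after_image), ("commit", tid).
# Uncommitted transactions never modified the database, so no undo is needed.

def recover_deferred(log):
    committed = {rec[1] for rec in log if rec[0] == "commit"}
    database = {}
    for rec in log:
        if rec[0] == "write" and rec[1] in committed:
            item, after_image = rec[2], rec[3]
            database[item] = after_image  # redo: reapply the after-image
    return database

log = [
    ("start", "T1"), ("write", "T1", "x", 5), ("commit", "T1"),
    ("start", "T2"), ("write", "T2", "y", 9),  # T2 active at failure: ignored
]
print(recover_deferred(log))  # {'x': 5}
```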

2.6.4.2 Immediate update


Updates are applied to the database as they occur. There is the need to redo
updates of committed transactions following a failure. You may need to undo
effects of transactions that had not committed at the time of failure. It is
essential that log records are written before writing to the database, using the
Write-ahead log protocol. If there is no 'transaction commit' record in the log,
then that transaction was active at failure and must be undone. Undo
operations are performed in reverse order to the order in which they were
written to the log (Connolly & Begg, 2015: Section 22.3.4).
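The redo-then-undo logic of immediate update can likewise be sketched in Python. Again this is an illustrative toy with an assumed log format: each write record carries both a before-image and an after-image, committed work is redone in forward log order, and uncommitted work is undone in reverse log order, exactly as described above.

```python
# Toy sketch of immediate-update recovery (illustrative only).
# Log records: ("start", tid), ("write", tid, item, before, after), ("commit", tid).
# The write-ahead log protocol guarantees every record reached the log first.

def recover_immediate(log, database):
    committed = {rec[1] for rec in log if rec[0] == "commit"}
    # REDO committed transactions, in forward log order.
    for rec in log:
        if rec[0] == "write" and rec[1] in committed:
            database[rec[2]] = rec[4]          # reapply the after-image
    # UNDO uncommitted transactions, in reverse log order.
    for rec in reversed(log):
        if rec[0] == "write" and rec[1] not in committed:
            database[rec[2]] = rec[3]          # restore the before-image
    return database

db = {"x": 1, "y": 2}
log = [
    ("start", "T1"), ("write", "T1", "x", 1, 5), ("commit", "T1"),
    ("start", "T2"), ("write", "T2", "y", 2, 9),   # active at failure: undone
]
print(recover_immediate(log, db))  # {'x': 5, 'y': 2}
```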

2.6.4.3 Shadow paging


This scheme maintains two page tables during the life of a transaction: a
current page table and a shadow page table. When the transaction starts, the
two page tables are the same. The shadow page table is never changed
thereafter and is used to restore the database in the event of failure. During
the transaction, the current page table records all updates to the database.
When the transaction completes, the current page table becomes the shadow
page table (Connolly & Begg, 2015: Section 22.3.4).
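The copy-on-write mechanics of shadow paging can be sketched as a small Python class. This is an illustrative toy with assumed names, not a real storage manager: updates allocate fresh physical pages and move only the current page table, abort reinstates the shadow table, and commit atomically makes the current table the new shadow table.

```python
# Toy sketch of shadow paging (illustrative only).
class ShadowPagedDB:
    def __init__(self, pages):
        self.store = dict(pages)              # physical page id -> contents
        self.shadow = {p: p for p in pages}   # logical page -> physical page
        self.current = dict(self.shadow)      # starts identical to shadow
        self.next_id = max(pages) + 1

    def write(self, page, contents):
        new_phys = self.next_id               # copy-on-write: fresh page
        self.next_id += 1
        self.store[new_phys] = contents
        self.current[page] = new_phys         # only the current table moves

    def read(self, page):
        return self.store[self.current[page]]

    def commit(self):
        self.shadow = dict(self.current)      # current table becomes shadow

    def abort(self):
        self.current = dict(self.shadow)      # shadow table restores state

db = ShadowPagedDB({0: "a", 1: "b"})
db.write(0, "a2")
db.abort()            # failure before commit: the shadow copy wins
print(db.read(0))     # a
db.write(1, "b2")
db.commit()
print(db.read(1))     # b2
```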

Concluding remarks
In this unit, we examined concurrency control, database recovery, protocols
that can prevent conflicts in databases and the techniques used to ensure that
a database remains consistent in the event of failures.

We also discussed the scope of database security and the different types of
computer threats, as well as the various security measures associated with
DBMSs and the Web.

2.7 Self-assessment

Test your knowledge

1. Explain the purpose and scope of database security.


2. List the main types of threats that could affect a database system. Also
describe the controls that you would use to counteract each of them.
3. Describe the security measures provided by MySQL DBMS.
4. Describe the approaches for securing DBMSs on the Web.
5. Explain the following in terms of providing security for a database:
5.1 Authorisation
5.2 Access controls
5.3 Views
5.4 Backup and recovery
5.5 Integrity
5.6 Encryption
5.7 RAID technology
6. Describe, with examples, the types of problems that can occur in a multi-
user environment when concurrent access to the database is allowed.
7. What is a timestamp? How do timestamp-based protocols for concurrency
control differ from locking-based protocols?
8. Discuss the difference between pessimistic and optimistic concurrency
control.
9. On the MySQL database you have created, enable the expire_logs_days
option for the database's binary logs and set the value to 7 days.
10. Checkpointing in MySQL is implemented using the InnoDB storage engine.
On the MySQL database you have created, change the InnoDB log file size
to 32M.
11. Review and research the following article:
https://dev.mysql.com/doc/refman/5.5/en/storage-engines.html.

Recommend a storage engine you can use in your project database.
Unit 3: Integrate a database with an application


Unit 3 is aligned with the following learning outcomes and
assessment criteria:

Learning outcomes:
LO3: Integrate a database with a software application or website.

Assessment criteria:
AC3.1: Produce an optimised logical and physical design for a
database of advanced complexity.
AC3.2: Develop and build the relational database.
AC3.3: Design, develop and build a third-party application that
interfaces with the database.

Learning objectives
After studying this unit, you should be able to:
• Produce a logical and physical database design
• Develop and build a relational database
• Build and integrate a database with an application

Introduction
This unit considers database design and integration with an application or
website. The application or website is assumed to be developed in a language
of your own choice. Note that application or website design and development
is not covered in this module; it is to be done using expertise gained from
languages taught in other modules. Research is also emphasised for
application or website design and development if no prior experience is
available.

In this unit, the emphasis is on the design and development methodology of a
database. The student is to use this methodology to produce a relational
database. The developed relational database is then integrated with a
third-party application or website. There are several application software
languages that can integrate with MySQL; MS Visual Studio has been used as
an example in this unit. Students are expected to research and demonstrate
how other languages of their own choice can integrate with MySQL.
In this unit you will learn:


• That database design is composed of three main phases: conceptual,
logical and physical database design
• The steps involved in the main phases of the database design methodology
• How to integrate your MySQL database with an application

In this guide, we present a database design methodology for relational
databases. This methodology is made up of three main phases: conceptual,
logical and physical database design. In this unit, we summarise the steps
involved in these phases for those readers who are already familiar with
database design.

3.1 Step 1: Build conceptual data model


The first step in conceptual database design is to build a conceptual data
model of the data requirements of the enterprise. A conceptual data model
comprises entity types, relationship types, attributes and attribute domains,
primary keys and alternate keys, and integrity constraints. The conceptual data
model is supported by documentation, including a data dictionary, which is
produced throughout the development of the model. We detail the types of
supporting documentation that may be produced as we go through the various
tasks that form this step (Connolly & Begg, 2015: Appendix D).

Step 1.1: Identify entity types


The first step in building a local conceptual data model is to define the main
objects that the users are interested in. One method of identifying entities is to
examine the users' requirements specification. From this specification we
identify nouns or noun phrases that are mentioned. We also look for major
objects such as people, places or concepts of interest, excluding those nouns
that are merely qualities of other objects. Document entity types (Connolly &
Begg, 2015: Appendix D).

Step 1.2: Identify relationship types


Identify the important relationships that exist between the entity types that
have been identified. Use Entity–Relationship (ER) modelling to visualise the
entity and relationships. Determine the multiplicity constraints of relationship
types. Check for fan and chasm traps. Document relationship types (Connolly
& Begg, 2015: Appendix D).

Step 1.3: Identify and associate attributes with entity or relationship types


Associate attributes with the appropriate entity or relationship types. Identify
simple/composite attributes, single-valued/multi-valued attributes and derived
attributes. Document attributes (Connolly & Begg, 2015: Appendix D).
Step 1.4: Determine attribute domains


Determine domains for the attributes in the conceptual model. Document
attribute domains (Connolly & Begg, 2015: Appendix D).

Step 1.5: Determine candidate, primary, and alternate key attributes

Identify the candidate key/s for each entity and, if there is more than one
candidate key, choose one to be the primary key. Document primary and
alternate keys for each strong entity (Connolly & Begg, 2015: Appendix D).

Step 1.6: Consider use of enhanced modelling concepts (optional step)

Consider the use of enhanced modelling concepts, such as
specialisation/generalisation, aggregation and composition (Connolly & Begg,
2010; Batini et al., 1992).

Step 1.7: Check model for redundancy


Check for the presence of any redundancy in the model. Specifically re-
examine one-to-one (1:1) relationships, remove redundant relationships, and
consider time dimension (Connolly & Begg, 2015: Appendix D).

Step 1.8: Validate conceptual model against user transactions


Ensure that the conceptual model supports the required transactions. Two
possible approaches are: describing the transactions and using transaction
pathways (Connolly & Begg, 2015: Appendix D).

Step 1.9: Review conceptual data model with user


Review the conceptual data model with the user to ensure that the model is a
'true' representation of the data requirements of the enterprise (Connolly &
Begg, 2015: Appendix D).

3.2 Step 2: Build and validate logical data model


Build a logical data model from the conceptual data model and then validate
this model to ensure it is structurally correct (using the technique of
normalisation) and to ensure it supports the required transactions (Connolly &
Begg, 2015: Appendix D).

Step 2.1: Derive relations for logical data model


Create relations from the conceptual data model to represent the entities,
relationships, and attributes that have been identified. Table 2 summarises
how to map entities, relationships and attributes to relations. Document
relations and foreign key attributes. Also, document any new primary or
alternate keys that have been formed as a result of the process of deriving
relations (Connolly & Begg, 2015: Appendix D).

Table 2 – Summary of how to map entities and relationships to relations

Source: Connolly & Begg (2015:D-3)
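The core of this mapping is that a one-to-many (1:*) relationship is represented by posting the primary key of the parent entity into the child relation as a foreign key. The sketch below illustrates this using SQLite from Python's standard library so it is self-contained; the Branch/Staff table and column names are assumptions for the example, not part of your project schema.

```python
# Illustrative mapping of two entities and a 1:* relationship to relations,
# with the parent's primary key posted into the child as a foreign key.
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("PRAGMA foreign_keys = ON")
con.execute("""
    CREATE TABLE Branch (
        branchNo TEXT PRIMARY KEY,     -- primary key of the parent entity
        street   TEXT NOT NULL
    )""")
con.execute("""
    CREATE TABLE Staff (
        staffNo  TEXT PRIMARY KEY,
        name     TEXT NOT NULL,
        branchNo TEXT NOT NULL,        -- posted foreign key represents 1:*
        FOREIGN KEY (branchNo) REFERENCES Branch (branchNo)
    )""")
con.execute("INSERT INTO Branch VALUES ('B005', '22 Deer Rd')")
con.execute("INSERT INTO Staff VALUES ('SL21', 'John White', 'B005')")

# A join on the posted key recovers the relationship between the relations.
row = con.execute("""
    SELECT s.name, b.street FROM Staff s
    JOIN Branch b ON s.branchNo = b.branchNo""").fetchone()
print(row)  # ('John White', '22 Deer Rd')
```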
Table 3 – Guidelines for the representation of a superclass/subclass
relationship based on the participation and disjoint constraints

Source: Connolly & Begg (2015:D-4)

Step 2.2: Validate relations using normalisation


Validate the relations in the logical data model using the technique of
normalisation. The objective of this step is to ensure that each relation is in at
least Third Normal Form (3NF) (Connolly & Begg, 2015: Appendix D).
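A normalisation step can be checked mechanically: a decomposition is only acceptable if the natural join of the projections reproduces the original relation (the lossless-join property). The sketch below illustrates this in plain Python for a relation with the transitive dependency staffNo → branchNo → branchAddress; the relation and attribute names are assumptions for the example.

```python
# Illustrative lossless-join check for a 3NF decomposition.
# StaffBranch(staffNo, branchNo, branchAddress) has a transitive dependency,
# so it is split into Staff(staffNo, branchNo) and Branch(branchNo, address).

staff_branch = [
    ("SL21", "B005", "22 Deer Rd"),
    ("SG37", "B003", "163 Main St"),
    ("SG14", "B003", "163 Main St"),   # repeated address: the redundancy 3NF removes
]

# Project onto the two decomposed schemas.
staff = {(s, b) for s, b, _ in staff_branch}
branch = {(b, a) for _, b, a in staff_branch}

# Natural join of the projections on branchNo.
rejoined = {(s, b, a) for s, b in staff for b2, a in branch if b == b2}
print(rejoined == set(staff_branch))  # True: the decomposition is lossless
```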

Step 2.3: Validate relations against user transactions


Ensure that the relations in the logical data model support the required
transactions (Connolly & Begg, 2015: Appendix D).

Step 2.4: Check integrity constraints


Identify the integrity constraints, which include specifying the required data,
attribute domain constraints, multiplicity, entity integrity, referential integrity
and general constraints. Document all integrity constraints (Connolly & Begg,
2015: Appendix D).

Step 2.5: Review logical data model with user


Ensure that the users consider the logical data model to be a true
representation of the data requirements of the enterprise (Connolly & Begg,
2015: Appendix D).

Step 2.6: Merge logical data models into global model


The methodology for Step 2 is presented so that it is applicable to the design
of simple and complex database systems alike. For example, to create a
database with a single user view, or with multiple user views that are managed
using the centralised approach, Step 2.6 is omitted. If, however, the database
has multiple user views that are being managed using the view integration
approach, then Steps 2.1 to 2.5 are repeated for the required number of data
models, each of which represents a different user view of the database
system. In Step 2.6 these data models are merged. Typical tasks associated
with the process of merging are as follows:
1. Review the names and contents of entities/relations and their candidate
keys.
2. Review the names and contents of relationships/foreign keys.
3. Merge entities/relations from the local data models.
4. Include (without merging) entities/relations unique to each local data
model.
5. Merge relationships/foreign keys from the local data models.
6. Include (without merging) relationships/foreign keys unique to each local
data model.
7. Check for missing entities/relations and relationships/foreign keys.
8. Check foreign keys.
9. Check integrity constraints.
10. Draw the global ER/relation diagram.
11. Update the documentation. Validate the relations created from the global
logical data model using the technique of normalisation and ensure that
they support the required transactions, if necessary.
(Connolly & Begg, 2015: Appendix D)

Step 2.7: Check for future growth


Determine whether there are any significant changes likely in the near future,
and assess whether the logical data model can accommodate these changes
(Connolly & Begg, 2015: Appendix D).

3.3 Step 3: Translate logical data model for target DBMS


Produce a relational database schema that can be implemented in the target
DBMS from the logical data model (Connolly & Begg, 2015: Appendix D).

Step 3.1: Design base relations


Decide how to represent the base relations that have been identified in the
logical data model in the target DBMS. Document design of base relations
(Connolly & Begg, 2015: Appendix D).

Step 3.2: Design representations of derived data


Decide how to represent any derived data present in the logical data model in
the target DBMS. Document design of derived data (Connolly & Begg, 2015:
Appendix D).
Step 3.3: Design general constraints


Design the general constraints for the target DBMS. Document design of
general constraints (Connolly & Begg, 2015: Appendix D).

3.4 Step 4: Design file organisations and indexes


Determine the optimal file organisations to store the base relations and the
indexes that are required to achieve acceptable performance; that is, the way
in which relations and tuples will be held on secondary storage (Connolly &
Begg, 2015: Appendix D).

Step 4.1: Analyse transactions


Understand the functionality of the transactions that will run on the database
and analyse the important transactions (Connolly & Begg, 2015: Appendix D).

Step 4.2: Choose file organisations


Determine an efficient file organisation for each base relation (Connolly &
Begg, 2015: Appendix D).

Step 4.3: Choose indexes


Determine whether adding indexes will improve the performance of the system
(Connolly & Begg, 2015: Appendix D).
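The effect of adding an index can be observed directly from the optimiser's query plan. The sketch below uses SQLite from Python's standard library so it is self-contained (MySQL exposes the same idea through its own EXPLAIN statement); the table and index names are assumptions for the example.

```python
# Illustrative effect of an index on the chosen query strategy.
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE Staff (staffNo TEXT PRIMARY KEY, lName TEXT)")
con.executemany("INSERT INTO Staff VALUES (?, ?)",
                [(f"S{i}", f"Name{i}") for i in range(1000)])

def plan(sql):
    # EXPLAIN QUERY PLAN rows carry the strategy in their last column.
    return " ".join(r[-1] for r in con.execute("EXPLAIN QUERY PLAN " + sql))

query = "SELECT * FROM Staff WHERE lName = 'Name500'"
print(plan(query))   # e.g. 'SCAN Staff': a full table scan

con.execute("CREATE INDEX idx_staff_lname ON Staff (lName)")
print(plan(query))   # e.g. 'SEARCH Staff USING INDEX idx_staff_lname (lName=?)'
```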

Step 4.4: Estimate disk space requirements


Estimate the amount of disk space that will be required by the database
(Connolly & Begg, 2015: Appendix D).

3.5 Step 5: Design user views


Design the user views that were identified during the requirements collection
and analysis stage of the relational database system development life cycle.
Document design of user views (Connolly & Begg, 2015: Appendix D).
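A user view is typically implemented as a database view over the base relations, exposing only what that class of users should see. The sketch below illustrates this using SQLite from Python's standard library so it is self-contained; the table, view, and column names are assumptions for the example.

```python
# Illustrative user view: general staff see names but not salaries.
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("""CREATE TABLE Staff (
    staffNo TEXT PRIMARY KEY, name TEXT, salary REAL)""")
con.execute("INSERT INTO Staff VALUES ('SL21', 'John White', 30000)")

# The view acts as the external schema seen by this class of users.
con.execute("""CREATE VIEW StaffPublic AS
               SELECT staffNo, name FROM Staff""")

cols = [d[0] for d in con.execute("SELECT * FROM StaffPublic").description]
print(cols)  # ['staffNo', 'name'] -- salary is not visible through the view
```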

3.6 Step 6: Design security mechanisms


Design the security measures for the database system as specified by the
users. Document design of security measures (Connolly & Begg, 2015:
Appendix D).
3.7 Step 7: Consider the introduction of controlled redundancy

Determine whether introducing redundancy in a controlled manner by relaxing
the normalisation rules will improve the performance of the system. For
example, consider duplicating attributes or joining relations together.
Document introduction of redundancy (Connolly & Begg, 2015: Appendix D).
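Duplicating an attribute trades faster reads for the obligation to keep the copies consistent. The sketch below illustrates one way of doing so, a trigger that propagates updates to the duplicate, using SQLite from Python's standard library so it is self-contained; the table and column names are assumptions for the example.

```python
# Illustrative controlled redundancy: the branch street is duplicated into
# Staff so the common "staff with address" query needs no join, and a trigger
# keeps the duplicate consistent when the master copy changes.
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE Branch (branchNo TEXT PRIMARY KEY, street TEXT)")
con.execute("""CREATE TABLE Staff (
    staffNo TEXT PRIMARY KEY, branchNo TEXT,
    branchStreet TEXT)""")               # duplicated (denormalised) attribute
con.execute("INSERT INTO Branch VALUES ('B005', '22 Deer Rd')")
con.execute("INSERT INTO Staff VALUES ('SL21', 'B005', '22 Deer Rd')")

con.execute("""CREATE TRIGGER sync_street AFTER UPDATE OF street ON Branch
               BEGIN
                 UPDATE Staff SET branchStreet = NEW.street
                 WHERE branchNo = NEW.branchNo;
               END""")

con.execute("UPDATE Branch SET street = '24 Deer Rd' WHERE branchNo = 'B005'")
row = con.execute(
    "SELECT branchStreet FROM Staff WHERE staffNo = 'SL21'").fetchone()
print(row[0])  # 24 Deer Rd
```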

3.8 Step 8: Monitor and tune the operational system


Monitor the operational system and improve the performance of the system to
correct inappropriate design decisions or reflect changing requirements
(Connolly & Begg, 2015: Appendix D).

3.9 Step 9: Build and integrate an application/website


There are many applications that can be integrated with the MySQL database.
The integration process tends to be application dependent: the application
references the MySQL database's connection parameters for connectivity.
Below is an example of how an MS Visual Studio application integrates and
connects to a MySQL database.

Figure 37 – Creating Database connection


Source: Nyasha Magutsa
Figure 38 – Selecting Data Source


Source: Nyasha Magutsa

Figure 39 – Selecting MySQL database


Source: Nyasha Magutsa
Figure 40 – Enter connection details and database selection


Source: Nyasha Magutsa

Figure 41 – Test Connection


Source: Nyasha Magutsa
Figure 42 – Confirm connection


Source: Nyasha Magutsa

Figure 43 – Navigating connected data source/database


Source: Nyasha Magutsa
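Alongside the wizard-driven Visual Studio route shown above, the same integration pattern can be sketched in code: collect the connection parameters, open a connection, and issue parameterised queries through it. The sketch below uses SQLite from Python's standard library so it is self-contained and runnable; the MySQL driver shown in the comment is an assumed example of what you would use instead against a real MySQL server.

```python
# Minimal sketch of application-side database integration.
import sqlite3

# With MySQL you would use a driver such as MySQL Connector and pass the
# same parameters the Visual Studio wizard collects (assumed example):
#   import mysql.connector
#   con = mysql.connector.connect(host="localhost", user="app",
#                                 password="secret", database="projectdb")
con = sqlite3.connect(":memory:")

con.execute("CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT)")
con.execute("INSERT INTO customer (name) VALUES (?)", ("Ada",))
con.commit()

# The application issues parameterised queries through the connection.
name = con.execute("SELECT name FROM customer WHERE id = ?", (1,)).fetchone()[0]
print(name)  # Ada
```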

Concluding remarks
The database design process is composed of three main phases: conceptual,
logical and physical database design. After a database is built, it can be
integrated with an application system.
3.10 Self-assessment

Test your knowledge

1. Research the different integration options that can be used to link a
database to an application (e.g. integrating a MySQL database with a Java
application).
Glossary
Access: An example of a database development tool, produced by Microsoft
Add: SQL keyword; command to add items to a composite object
Additive rule: An inference rule for functional dependencies, also called the Union rule. If X → Y and X → Z, then X → YZ. Contrast with the projective rule
Aggregate query: An SQL query that returns some function of a collection of rows, rather than the rows themselves. See count, avg, min, max, sum
Alter: SQL keyword; command to change an object specification
And: SQL keyword; used as a Boolean operator to construct the where clause of a query
Anomaly: An inconsistency in a database
Any: SQL keyword. Used with nested queries to compare an attribute/attributes with the rows returned by the nested query, returning true if at least one row satisfies the comparison
As: SQL keyword. An optional keyword used to specify aliases for attributes or tables in a query
Atomic data types: The most simple forms of data: Boolean, string, integer, and so forth
Atomicity: The property of a transaction that guarantees that either all or none of the changes made by the transaction are written to the database
Attribute: The name of a column of a table, indicating the meaning of the data in that column
Authorisation: SQL keyword used to specify the owner of a schema
Avg: SQL keyword. Used to perform an aggregate query that returns the average value of an attribute
B+ tree: A tree-based structure used for storing data in a database, with extra links to facilitate sequential access to data
Base relation: A relation stored in a database, with data that is not calculated from data in other relations
Candidate key: A synonym for key, in a relation schema with more than one key
Cardinality: The number of elements of the relation (that is, rows in the table)
Cartesian Product: The set of all ordered pairs (or triples, etc.) of elements from two (or three, etc.) sets. A natural join of two relations with no common attributes, whose tuples are all possible combinations of the tuples of the original relations
Cascade: SQL keyword. Indicates the behaviour of the RDBMS when an object is modified (deleted or changed) while there are other objects dependent on it
Cascaded: SQL keyword. Used to indicate that a 'check option' added to a view should also apply to the views used to define it (if any)
Char: SQL keyword. Synonym for character
Character: SQL keyword; domain of textual data
Check: SQL keyword. Used in the creation of tables to specify complex conditions the data must satisfy
Check option: SQL keyword. Used to ensure that modifications made to a view result in rows that still belong to the view
Column: SQL keyword; used in add, alter or drop to denote a column (attribute) of a table
Commit: The action that causes all of the changes made by a particular transaction to be reliably written to the database files
Composite: The 'whole' in a whole-part relationship (aggregation)
Conceptual data model: Any data model in which data is described independently of the logical model used to organise the data, instead relating the data to real-world concepts
Concurrency: The capacity of a system to handle many users simultaneously
Consistency: The property of a transaction that guarantees that the state of the database both before and after execution of the transaction remains consistent (i.e. free of any data integrity errors), whether or not the transaction commits or is rolled back
Constraint: SQL keyword. Used in add, alter or drop to denote a constraint on a table
Cost model: A formula or algorithm for estimating the 'cost' of a particular form of a query, usually taking into account expected memory usage and time
Count: SQL keyword. Used in an aggregate query to count the rows returned
Create: SQL keyword. Used to create an object in SQL
Cursor: A mechanism in SQL for allowing a set of tuples to be manipulated one by one
Data Definition Language: Used to define the logical, external and physical schemas and access rights
Data dictionary: A collection of information stored in the DBMS about what objects exist
Data independence: The property of a DBMS that allows users and programs to refer to data at a level of abstraction that ignores the actual implementation of the DB
Data Manipulation Language: Used to query and update data in the database
Data model: A combination of constructs used to organise data
Data type: A basic data type such as text, number, date, time, Boolean, enumerated type, and so on
Database: A collection of data, used to represent information of interest to an information system
Database administrator: Responsible for the design, control and administration of a DB
Date: SQL keyword. Domain of date values
Day: SQL keyword. Used to specify interval attributes
DBMS: A software system able to manage large, shared, persistent collections of data while ensuring reliability and privacy
Deadlock: A situation in which resources (i.e. locks) are held by two or more connections that are each needed by the other connections, so that they are stuck in an infinite wait loop
Decimal: SQL keyword. Synonym for numeric
Declarativeness: The existence in a system of a high-level query language
Declare: SQL keyword. Used to declare cursors
Decomposition: A collection D = {R1, R2, …, Rk} of relation schemas that together contain all the attributes of a larger relation schema R
Default: SQL keyword. Used to specify the default value of an attribute or domain
Deferred: SQL keyword. Used with 'set constraints' to specify that a constraint should only be checked after a full transaction is completed
Degree: The number of terms in the Cartesian Product
Delete: SQL keyword. Denotes the privilege of being able to delete rows from a table or view
Delete From: SQL keyword. Used to remove rows from a table
Denormalisation: The process of transforming a database schema into one satisfying only a lower normal form, usually by storing joins of tables directly instead of as views, for performance reasons
Dependency: A dependency between two packages exists if one package references elements of the other
Derived attribute: An attribute of a class which may be derived from other attributes or from the nature of associations between objects of the class and other objects
Derived relation: A relation which is calculated from other relations in the database. See also base relation, materialised view, virtual relation, view
Discretionary access control (DAC): A type of security access control that grants or restricts object access via an access policy determined by an object's owner group and/or subjects
Distinct: SQL keyword. Used in an aggregate query to summarise distinct non-null values of an attribute
Distributed database: A database in which the data is distributed among multiple computers or devices (nodes), allowing multiple computers to simultaneously access data residing on separate nodes
Domain: SQL keyword specifying that an operation acts on a domain object
Domain constraint (also value constraint): A form of tuple constraint which specifies allowable values of a particular attribute (e.g. Mark must be between 0 and 100)
Domain (of a relation): One of the sets used to form the Cartesian Product of which the relation is a subset. That is, the type of data that appears in a column of a table
Drop: SQL keyword. Command to delete an object
Durability: The property of a transaction in which the DBMS guarantees that all committed transactions will survive any kind of system failure
Encryption: The encoding of data so that it cannot be understood by a human reader
Entity-Relationship model: An example of a conceptual data model
Equijoin: A Theta join where the tuples of the Cartesian Product are selected according to a number of equalities between attributes
Equivalent: Two sets E and F of functional dependencies are equivalent if their closures are equal
Execute: SQL keyword. Used to execute a previously prepared SQL command
Exists: SQL keyword. Used with nested queries. Exists (Query) returns true if Query returns at least one row
Float: SQL keyword. Domain of floating point values
For: SQL keyword. Part of the syntax for 'cursor' declaration
Foreign key: SQL keyword. Used to define a referential constraint in SQL
Foreign key attribute: An attribute (a poor choice) which is a piece of text or similar basic data type which actually refers to a complex object
Foreign key constraint: A constraint ensuring, for a set of attributes A of a relation r1 and a corresponding set of attributes B of r2, where B is a key (usually the primary key) for r2, that for every tuple t1 of r1 there exists a tuple t2 of r2 for which t1[A] = t2[B]
From: SQL keyword. Used to construct queries. Specifies what tables the attributes are selected from
Function: SQL-3 keyword used to manipulate (create, etc.) functions
Grant: SQL keyword. Used to give a privilege on a resource to a user
Group by: SQL keyword. Used to modify an aggregate query to partition the rows according to the values of given attributes before doing the calculations required by the aggregate query
Hash join: A 'join method' where a hash function is used to identify matching tuples in the two tables
Having: SQL keyword. Similar to 'where', it specifies a Boolean condition that must be satisfied by the rows finally returned from an aggregate query with a 'group by' clause
Immediate: SQL keyword. Used with 'set constraints' to indicate that a given constraint should be immediately checked after every step of a transaction, not merely when the whole transaction is completed
Impedance mismatch: The fact that an SQL query returns whole blocks of data, but high-level general-purpose languages generally can only handle single items of data one at a time; and the problem of using the two approaches together
In: SQL keyword. Used with nested queries. Attribute in (Query) is equivalent to Attr = any (Query)
Index: A structure (usually a tree structure) allowing quick access to data stored in the database via a key
Index: SQL keyword. Used to create or drop an index for a database. Syntax: 'create [unique] index IndexName on TableName(AttributeList)' or 'drop index IndexName'
Inner join: SQL keyword; used to join two tables before selecting from them
Insert: SQL keyword. Denotes the privilege of being able to add data to a table or view
Insert into: SQL keyword. Used to add data to a table
Integer: SQL keyword. Domain of integers
Integrity constraint: A property that must be satisfied by all correct database instances. See predicate, intra-relational constraint, inter-relational constraint
Intersect: SQL keyword. Used to find the intersection of the output of two select statements (queries)
Intersection: The intersection of two relations r1 and r2 is the set of tuples belonging to both r1 and r2 (contrast with union or difference)
Interval: SQL keyword; domain of time intervals
Into: SQL keyword. Part of the syntax of 'fetch'; also part of the syntax of 'execute'
Is: SQL keyword; used in the where clause of a query: 'is null' or 'is not null'
Isolation: The property of a transaction that guarantees that the changes made by a transaction are isolated from the rest of the system until after the transaction has committed
Join: SQL keyword; used to join two tables before selecting from them
Key: A minimal superkey. That is, a set of attributes A on a relation r such that there are no two distinct tuples t1 and t2 of r with t1[A] = t2[A], and A does not contain any proper subset for which this statement still holds
Key constraint: An (intra-relational) integrity constraint ensuring that a selected set of attributes forms a (super)key
Left outer join: An outer join r1 LEFT r2 where dangling tuples from r1 are padded with blanks and inserted into the join
Locking: A method for safely protecting data from being changed by two or more users (processes/threads) at the same time
Logical data model: Any data model where a particular method of organisation is used to organise data
Logical schema: A description of a database according to the appropriate logical data model
Metadata: Data about the structure of data
Microsoft Access: An example of a database development tool, produced by Microsoft
Min: SQL keyword. Used to perform an aggregate query that returns the minimum value of an attribute
Multiplicity: A characteristic of a role. It indicates how many objects of one type fulfil the role for the object at the other end of the association
Natural join: An operator combining tuples of two relations r1 and r2 on sets of attributes X1 and X2
Nested loop: A 'join method' where the attributes of one table are looped through once for each tuple in the other
Nested query: A select statement (SQL query) used as part of the where clause of another query, and used as a source of data against which to compare attributes
The process of changing a database design to comply with the various
Normalisation
normal forms
Normalisation An algorithm for taking an unnormalised relation and putting it into a
algorithm higher normal form
SQL keyword. Used with nested queries. Not exists (Query) returns true
Not exists
if Query returns no rows at all
SQL keyword. Used with nested queries. Attr not in (Query) is
Not in
equivalent to Attr <> all (Query)
Not null SQL keyword. Constraint that the given attribute may not be null
Null SQL keyword indicating a null value
A special value a tuple can assume on an attribute, denoting an absence
Null value
of information
SQL keyword: domain of exact numbers, either integral or with a given
Numeric
number of decimal places
A computing programming paradigm which defines the computing
Object-oriented problem to be solved as a set of objects which are members of various
object classes each with its own set of data manipulation methods
Order by SQL keyword: Used to sort the output of a query
A natural join augmented with tuples derived from tuples of r1 or r2 for
Outer join
which no matching tuple in the other relation exists
Package: Used to group together classes which are similar or related in some way, to ease the software development process.
Persistent: The lifespan of a database extends beyond that of the program using it.
Pipelining: A memory-saving technique of performing several operations tuple by tuple, and so not storing intermediate tables.
PL/SQL: An extension of SQL marketed by Oracle.
Predicate: A function associating a value True or False with an instance of a database.
Primary key: A key (in the second sense) that is constrained not to contain null values.
Privilege: A 'permission' to do something on some component of a database.
Procedure: SQL keyword. Used to define a procedure. Under standard SQL, a procedure may only contain a single SQL statement; many DBMSs relax this restriction.
Projection: An operator that takes a relation and returns a new relation whose attributes are a subset of the original.
Query: A function mapping instances of a given database schema into relations on a given set of attributes.
Query language: A language in which queries may be expressed.
RAID: Redundant Array of Inexpensive Disks. A hard-drive system with multiple disks allowing for various levels of reliability and recoverability.
References: SQL keyword: specifies a referential constraint.
Referential constraint (also: foreign key constraint): A constraint ensuring, for a set of attributes A of a relation r1 and a corresponding set of attributes B of r2 (where B is a key, typically the primary key, of r2), that for every tuple t1 of r1 there exists a tuple t2 of r2 for which t1[A] = t2[B].
Relation: A subset of a Cartesian product.
Relation instance: A relation, in the second sense.
Relational data model: A data model using tables (relations) to organise data.
Relation schema: The name of the relation R, and a set X of names of the attributes. Normally denoted R(X).
Relative: SQL keyword. Used in a 'fetch' statement to move a given number of rows forwards or backwards in the query.
Revoke: SQL keyword. Used to remove privileges from users.
Right outer join: An outer join r1 RIGHT r2 where dangling tuples from r2 are padded with nulls and inserted into the join.
Schema (of a relation or table): Its heading (or name), followed by (in brackets) the names of its attributes.
Select: SQL keyword. Used to construct a query. Specifies which attributes to select.
Semantics: The semantics of a schema gives its meaning, that is, how the tables and attributes correspond to real-world things.
Set: SQL keyword. Used with 'update' to modify the data in a table.
Set constraints: SQL keyword. Used to specify whether a given constraint should be checked every time an operation is performed on the database, or only at the end of a 'transaction'.
Set default: SQL keyword. Indicates behaviour of the RDBMS when an object is modified (deleted or changed) when there are other objects dependent on it: the dependent foreign key values are set to their declared default values.
Set null: SQL keyword. Indicates behaviour of the RDBMS when an object is modified (deleted or changed) when there are other objects dependent on it: the dependent foreign key values are set to null.
SQL: A standard language, the 'structured query language', incorporating DDL and DML features, used to manipulate databases.
SQL exception: An exception thrown by the executeXXX methods of a JDBC Statement if there are problems with the statement.
Subquery: A query inside a query.
Subselect: Another term for subquery.
Sum: SQL keyword. Used to perform an aggregate query that adds up the values of an attribute.
Table: SQL keyword specifying that an operation acts on a table object.
Theta-join: A selection applied to a Cartesian product.
Time: SQL keyword. Domain of time values.
Timestamp: SQL keyword. Domain of date+time values.
Transaction: A set of logically related database modifications that are written to the database as a unit.
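A sketch of this all-or-nothing behaviour, using Python's built-in sqlite3 module and a hypothetical account table: the second statement fails, so the rollback discards the first as well and the database is unchanged.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE account (name TEXT, balance INTEGER)")
conn.execute("INSERT INTO account VALUES ('A', 100), ('B', 0)")
conn.commit()

# A transfer is two logically related modifications written as one unit:
# if either statement fails, rollback leaves the database untouched.
try:
    conn.execute("UPDATE account SET balance = balance - 50 WHERE name = 'A'")
    conn.execute("UPDATE nonexistent SET x = 1")  # deliberate failure
    conn.commit()
except sqlite3.OperationalError:
    conn.rollback()

balances = dict(conn.execute("SELECT name, balance FROM account"))
# The partial update to account 'A' was rolled back with the failed statement.
```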
Transaction log: A sequential record of all of the database changes made by each transaction, in the order they were issued.
Transparency: When the DDBMS hides all the added complexities of distribution, allowing users to think that they are working with a single centralised system.
Tuple: A function from a set of attributes to a collection of elements from the domains of the attributes.
Tuple constraint: A form of intra-relational integrity constraint which may be evaluated individually on single tuples of the relation.
Two-phase commit (2PC): The process by which a relational database ensures that distributed transactions are performed in an orderly manner.
Union (of two relations): The union of two relations r1 and r2 is the set of tuples that belong to either r1 or r2.
Unique: SQL keyword. A constraint that the given attribute(s) must take on unique values in the table. Cf. primary key.
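A sketch of the constraint in action, using Python's built-in sqlite3 module and a made-up member table:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE member (email TEXT UNIQUE)")
conn.execute("INSERT INTO member VALUES ('a@example.com')")

# A second row with the same value violates the unique constraint,
# which the DBMS reports as an integrity error.
try:
    conn.execute("INSERT INTO member VALUES ('a@example.com')")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True
```

Unlike a primary key, a unique constraint does not by itself forbid nulls; combine it with not null to get primary-key behaviour.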
Update: SQL keyword. Used with 'set' to modify data in a table.
Values: SQL keyword. Used with 'insert into' to specify the data values to be added to the table.
Varchar: SQL keyword. Short for 'character varying'.
View: Usually, a virtual relation.
Virtual relation: Also called a view. A derived relation not physically stored in the database, but named and usable in queries as if it were.
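A sketch with Python's built-in sqlite3 module: the view stores no rows of its own, only its defining query, so changes to the base table show through it immediately (the table and view names here are made up):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE product (name TEXT, price INTEGER);
    INSERT INTO product VALUES ('pen', 5), ('desk', 900);
    -- The view's defining query is stored, not its rows.
    CREATE VIEW expensive AS SELECT name FROM product WHERE price > 100;
""")
first = conn.execute("SELECT name FROM expensive").fetchall()

# Inserting into the base table is immediately visible through the view,
# because the view is re-evaluated on each query.
conn.execute("INSERT INTO product VALUES ('chair', 250)")
second = conn.execute("SELECT name FROM expensive ORDER BY name").fetchall()
```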
Where: SQL keyword. Used to construct queries. Specifies conditions to be satisfied on the attributes returned.
With: SQL keyword. Used to add a 'check option' to a 'view'.
With grant option: SQL keyword. Used to indicate that the user receiving the privilege may also pass it on to others.
Year: SQL keyword. Used to help specify interval attributes.

Bibliography
Barghouti, N.S. & Kaiser, G. 1991. Concurrency control in advanced database applications. New York: ACM Computing Surveys.

Batini, C.; Ceri, S. & Navathe, S. 1992. Conceptual database design: an entity–relationship approach. Redwood City, CA: Benjamin/Cummings.

Bernstein, P.A.; Hadzilacos, V. & Goodman, N. 1987. Concurrency control and recovery in database systems. Reading, MA: Addison-Wesley.
Castano, S.; Fugini, M.; Martella, G. & Samarati, P. 1995. Database security. Reading, MA: Addison-Wesley.

Chen, P.M. & Patterson, D.A. 1990. Maximising performance in a striped disk array. In: Proceedings of the 17th annual international symposium on computer architecture.

Chen, P.M.; Lee, E.K.; Gibson, G.A.; Katz, R.H. & Patterson, D.A. 1994. RAID: high-performance, reliable secondary storage. ACM computing surveys, 26(2).

Chen, P.P. 1976. The Entity–Relationship model – toward a unified view of data. ACM transactions on database systems, 1(1):9–36.

Connolly, T. & Begg, C. 2010. Database systems: a practical approach to design, implementation and management. 5th edition. Harlow: Addison-Wesley.

Connolly, T. & Begg, C. 2015. Database systems: a practical approach to design, implementation and management. 6th edition (Global Edition). Harlow: Pearson Education Limited. ISBN: 9781292061184.

Connolly, T.; Begg, C. & Holowczak, R. 2008. Business database systems. Harlow: Addison-Wesley.

http://blog.winhost.com

http://dev.mysql.com/doc/refman/5.5/en/index.html

http://tpc.org

http://www.abanet.org/scitech/ec/isc/dsg-tutorial.html

http://www.computerprivacy.org/who/


http://www.cve.mitre.org

http://www.mysqltutorial.org/

https://www.beastnode.com

CTI is part of Pearson, the world's leading learning company. Pearson is the corporate owner, not a registered provider nor conferrer of qualifications in South Africa. CTI Education Group (Pty) Ltd. is registered with the Department of Higher Education and Training as a private higher education institution under the Higher Education Act, 101, of 1997. Registration Certificate number: 2004/HE07/004. www.cti.ac.za.
