
DS PPT

This document discusses various data structures and their characteristics. It defines a data structure as a way of organizing data that considers the logical relationships between data elements. Data structures can be primitive, like integers and floats, or non-primitive, like lists, stacks, queues, trees and graphs built from primitive structures. Key aspects like implementation, common operations, and examples are provided for several important non-primitive data structures.

Uploaded by

sumalraj
Copyright
© © All Rights Reserved

 A data structure is a representation of the logical
relationship existing between individual elements of
data.
 In other words, a data structure is a way of organizing
all data items that considers not only the elements
stored but also their relationship to each other.
 The data structure affects the design of both the structural and
functional aspects of a program.
Program = Algorithm + Data Structure
 An algorithm is a step-by-step procedure
to solve a particular problem.
 That means an algorithm is a set of instructions written
to carry out certain tasks, and the data structure is the
way of organizing the data with their logical
relationships retained.
 To develop a program from an algorithm, we should
select an appropriate data structure for that algorithm.
 Therefore an algorithm and its associated data structures
form a program.
 Data structures are normally divided into two broad
categories:
◦ Primitive Data Structure
◦ Non-Primitive Data Structure
Data structure
    Primitive DS: Integer, Float, Character, Pointer
    Non-Primitive DS
        Linear List: Array, Linked List, Stack, Queue
        Non-Linear List: Graph, Trees


 Primitive data structures are basic structures that are directly operated upon
by machine instructions.
 In general, they have different representations on
different computers.
 Integers, floating-point numbers, character constants,
string constants, pointers, etc. fall in this category.
 Non-primitive data structures are more sophisticated data structures.
 They are derived from the primitive data structures.
 The non-primitive data structures emphasize the
structuring of a group of homogeneous (same type) or
heterogeneous (different type) data items.
 Lists, stacks, queues, trees and graphs are examples of
non-primitive data structures.
 The design of an efficient data structure must take into account the
operations to be performed on the data structure.
 The most commonly used operations on data
structures are broadly categorized into the following
types:
◦ Create
◦ Selection
◦ Updating
◦ Searching
◦ Sorting
◦ Merging
◦ Destroy or Delete
 A primitive data structure is generally a basic structure
that is usually built into the language, such as an
integer or a float.
 A non-primitive data structure is built out of primitive
data structures linked together in meaningful ways,
such as a linked list, binary search tree, AVL tree,
graph, etc.
 An array is defined as a finite set of
homogeneous elements, that is, data items of the same type.
 It means an array can contain one type of data only:
either all integers, all floating-point numbers or all
characters.
 A simple array declaration is as follows:
int arr[10];
 Here int specifies the data type of the elements
the array stores.
 “arr” is the name of the array, and the number specified
inside the square brackets is the number of elements
the array can store; this is also called the size or length
of the array.
 Following are some of the concepts to be remembered
about arrays:
◦ An individual element of an array can be
accessed by specifying the name of the array,
followed by an index or subscript inside square
brackets.
◦ The first element of the array has index
zero [0]. It means the first element and last
element will be specified as arr[0] and arr[9]
respectively.
◦ The elements of an array are always stored
in consecutive (contiguous) memory
locations.
◦ The number of elements that can be stored
in an array, that is, the size of the array or its
length, is given by the following equation:
(upper bound - lower bound) + 1
◦ For the above array it would be
(9 - 0) + 1 = 10, where 0 is the lower bound of the
array and 9 is the upper bound of the array.
◦ An array can always be read or written through a
loop. A one-dimensional array
requires one loop for reading and another for
writing the array.
◦ For example: Reading an array
for (i = 0; i <= 9; i++)
    scanf("%d", &arr[i]);
◦ For example: Writing an array
for (i = 0; i <= 9; i++)
    printf("%d", arr[i]);
◦ Reading or writing a two-dimensional
array requires two nested loops. Similarly,
an array of N dimensions
requires N nested loops.
◦ Some common operations performed on arrays
are:
 Creation of an array
 Traversing an array
 Insertion of a new element
 Deletion of a required element
 Modification of an element
 Merging of arrays
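The insertion and deletion operations listed above can be sketched in C as follows. This is a minimal illustration, not code from the slides: the function names array_insert and array_delete and the explicit capacity parameter are assumptions made for the example.

```c
#include <assert.h>
#include <stddef.h>

/* Insert value at index pos, shifting arr[pos..n-1] one slot to the right.
   Returns the new length, or n unchanged if the array is full or pos is
   out of range. (Hypothetical helper, not from the slides.) */
size_t array_insert(int arr[], size_t n, size_t capacity, size_t pos, int value)
{
    assert(arr != NULL);
    if (n >= capacity || pos > n)
        return n;
    for (size_t i = n; i > pos; i--)
        arr[i] = arr[i - 1];
    arr[pos] = value;
    return n + 1;
}

/* Delete the element at index pos, shifting the tail one slot to the left.
   Returns the new length, or n unchanged if pos is out of range. */
size_t array_delete(int arr[], size_t n, size_t pos)
{
    assert(arr != NULL);
    if (pos >= n)
        return n;
    for (size_t i = pos; i + 1 < n; i++)
        arr[i] = arr[i + 1];
    return n - 1;
}
```

Both operations move up to n elements, which is why insertion and deletion in the middle of an array cost O(n) even though indexing costs O(1).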
 A list (linear linked list) can be defined as a
collection of a variable number of data items.
 Lists are the most commonly used non-primitive data
structures.
 An element of a list must contain at least two fields, one
for storing data or information and the other for storing the
address of the next element.
 Since the second field stores an address, it must be of
pointer type.
 Technically each such element is referred to as a node;
therefore a list can be defined as a collection of nodes,
as shown below:

[Linear Linked List]

Head -> [AAA | next] -> [BBB | next] -> [CCC | NULL]

Each node has an information field and a pointer field.


 Types of linked lists:
◦ Single linked list
◦ Doubly linked list
◦ Single circular linked list
◦ Doubly circular linked list
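A single linked list node, with its information field and pointer field, might look like this in C. The names node, push_front, list_length and list_free are illustrative assumptions, and the information field is taken to be an int for simplicity.

```c
#include <assert.h>
#include <stdlib.h>

/* One node: an information field plus a pointer field. */
struct node {
    int info;
    struct node *next;
};

/* Allocate a node and link it in front of *head.
   Returns 1 on success, 0 if memory could not be obtained. */
int push_front(struct node **head, int info)
{
    assert(head != NULL);
    struct node *n = malloc(sizeof *n);
    if (n == NULL)
        return 0;
    n->info = info;
    n->next = *head;
    *head = n;
    return 1;
}

/* Traverse the list from the head and count its nodes. */
size_t list_length(const struct node *head)
{
    size_t count = 0;
    for (; head != NULL; head = head->next)
        count++;
    return count;
}

/* Release every node of the list. */
void list_free(struct node *head)
{
    while (head != NULL) {
        struct node *next = head->next;
        free(head);
        head = next;
    }
}
```

A doubly linked list would add a second pointer field to the previous node, and the circular variants link the last node back to the first instead of storing NULL.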
 A stack is also an ordered collection of elements like
an array, but it has a special feature: deletion and
insertion of elements can be done only at one end,
called the top of the stack (TOP).
 Due to this property it is also called a last in, first out
(LIFO) type of data structure.
 It can be thought of as a stack of plates placed on a
table at a party: a guest always takes a fresh plate
from the top, and new plates are placed onto the stack
at the top.
 It is a non-primitive data structure.
 When an element is inserted into a stack or removed
from the stack, its base remains fixed while the top of the
stack changes.
 Insertion of an element into a stack is called PUSH and
deletion of an element from a stack is called POP.
 The figure below shows how the operations take place
on a stack:

PUSH POP

[STACK]
 The stack can be implemented in two ways:
◦ Using arrays (Static implementation)
◦ Using pointers (Dynamic implementation)
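As a rough sketch of the static (array) implementation, PUSH and POP can be written as below. The capacity STACK_MAX and all function names are assumptions chosen for the example, and the elements are taken to be int.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define STACK_MAX 100   /* capacity chosen for illustration */

struct stack {
    int items[STACK_MAX];
    int top;            /* index of the top element, -1 when empty */
};

void stack_init(struct stack *s)
{
    assert(s != NULL);
    s->top = -1;
}

bool stack_is_empty(const struct stack *s) { return s->top == -1; }
bool stack_is_full(const struct stack *s)  { return s->top == STACK_MAX - 1; }

/* PUSH: place value on the top; returns false on overflow. */
bool stack_push(struct stack *s, int value)
{
    if (stack_is_full(s))
        return false;
    s->items[++s->top] = value;
    return true;
}

/* POP: remove the top element into *out; returns false on underflow. */
bool stack_pop(struct stack *s, int *out)
{
    if (stack_is_empty(s))
        return false;
    *out = s->items[s->top--];
    return true;
}
```

Only top changes during PUSH and POP; the base of the array stays fixed, which matches the description above. A dynamic implementation would push and pop at the head of a linked list instead.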
 Queues are a first in, first out (FIFO) type of data structure.
 In a queue, new elements are added at
one end, called the REAR end, and elements are always
removed from the other end, called the FRONT end.
 The people standing in a railway reservation line are
an example of a queue.
 Each new person comes and stands at the end of
the line, and people getting their reservation
confirmed leave the line from the front end.
 The figure below shows how the operations take
place on a queue:

front -> 10 20 30 40 50 <- rear

 The queue can be implemented in two ways:
◦ Using arrays (Static implementation)
◦ Using pointers (Dynamic implementation)
 A tree can be defined as a finite set of data items (nodes).
 A tree is a non-linear type of data structure in which data
items are arranged in a hierarchy; some trees, such as binary
search trees, also keep the items in sorted order.
 Trees represent the hierarchical relationship between
various elements.
 In trees:
◦ There is a special data item at the top of the hierarchy
called the root of the tree.
◦ The remaining data items are partitioned into a number
of mutually exclusive subsets, each of which is itself
a tree, called a subtree.
◦ The tree always grows in length towards the bottom in
data structures, unlike natural trees, which grow
upwards.
 The tree structure organizes the data into branches,
which relate the information.

        A (root)
      /   \
     B     C
    / \   / \
   D   E F   G
 A graph is a mathematical non-linear data structure
capable of representing many kinds of physical
structures.
 It has found applications in geography, chemistry and
engineering sciences.
 Definition: A graph G(V,E) is a set of vertices V and a
set of edges E.
 An edge connects a pair of vertices and may have a
weight such as length, cost or another measure
associated with it.
 Vertices of the graph are shown as points or circles and
edges are drawn as arcs or line segments.
 Example of graph:

[Figure: (a) a directed, weighted graph and (b) an undirected graph]

 Types of Graphs:
◦ Directed graph
◦ Undirected graph
◦ Simple graph
◦ Weighted graph
◦ Connected graph
◦ Non-connected graph
 The array as an abstract data type
 Structures and Unions
 The polynomial Abstract Data Type
 The Sparse Matrix Abstract Data Type
 The Representation of Multidimensional
Arrays
 Arrays
◦ Array: a set of pairs, <index, value>
◦ data structure
 For each index, there is a value associated with that
index.
◦ representation (possible)
 Implemented by using consecutive memory.
 In mathematical terms, we call this a correspondence
or a mapping.
 When considering an ADT we are more
concerned with the operations that can be
performed on an array.
◦ Aside from creating a new array, most languages
provide only two standard operations for arrays,
one that retrieves a value, and a second that
stores a value.
◦ Structure 2.1 shows a definition of the array ADT.
◦ The advantage of this ADT definition is that it
clearly points out the fact that the array is a
more general structure than “a consecutive set of
memory locations.”
 Arrays in C
◦ int list[5], *plist[5];
◦ list[5]: (five integers) list[0], list[1], list[2], list[3],
list[4]
◦ *plist[5]: (five pointers to integers)
 plist[0], plist[1], plist[2], plist[3], plist[4]
◦ implementation of 1-D array
Variable   Memory address
list[0]    base address = α
list[1]    α + sizeof(int)
list[2]    α + 2*sizeof(int)
list[3]    α + 3*sizeof(int)
list[4]    α + 4*sizeof(int)
 Arrays in C (cont’d)
◦ Compare int *list1 and int list2[5] in C.
Same: both list1 and list2 can be used as pointers to int
(in most expressions the array name list2 converts to a pointer to list2[0]).
Difference: list2 reserves five locations.
◦ Notations:
list2 - a pointer to list2[0]
(list2 + i) - a pointer to list2[i] (&list2[i])
*(list2 + i) - list2[i]
 Example: 1-dimension array addressing
◦ int one[] = {0, 1, 2, 3, 4};
◦ Goal: print out address and value
 void print1(int *ptr, int rows){
    /* print out a one-dimensional array using a pointer */
    int i;
    printf("Address Contents\n");
    for (i = 0; i < rows; i++)
        /* %p with a void * cast is the portable way to print an address */
        printf("%p%5d\n", (void *)(ptr + i), *(ptr + i));
    printf("\n");
}
◦ On the machine used for the slide (2-byte int, addresses shown in decimal) the output was:
Address Contents
1228 0
1230 1
1232 2
1234 3
1236 4
 2.2.1 Structures (records)
◦ Arrays are collections of data of the same type. In
C there is an alternate way of grouping data that
permits the data to vary in type.
 This mechanism is called the struct, short for structure.
◦ A structure is a collection of data items, where
each item is identified as to its type and name.
 Create structure data type
◦ We can create our own structure data types by
using the typedef statement, for example to define a type named human_being.
 human_being is then the name of the type
defined by the structure definition, and we may follow
this definition with declarations of variables such as:
human_being person1, person2;
◦ We can also embed a structure within a structure.
 A person born on February 11, 1994 would have the
fields of the embedded date struct set to that month, day and year.
◦ A union declaration is similar to a structure.
◦ The fields of a union must share their memory space.
◦ Only one field of the union is “active” at any given time.
 Example: Add fields for male and female.

person1.sex_info.sex = male;
person1.sex_info.u.beard = FALSE;
and
person2.sex_info.sex = female;
person2.sex_info.u.children = 4;
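The type definitions behind these statements are not part of this text. A sketch that is consistent with the names used above (human_being, sex_info, sex, u, beard, children) could look as follows; the fields name, age, salary and dob, the type names date and sex_type, and the helper functions are assumptions made for illustration.

```c
#include <assert.h>
#include <string.h>

#define FALSE 0
#define TRUE 1

typedef struct {
    int month;
    int day;
    int year;
} date;

typedef struct {
    enum { female, male } sex;   /* tag: tells which union member is active */
    union {
        int children;            /* meaningful when sex == female */
        int beard;               /* meaningful when sex == male */
    } u;
} sex_type;

typedef struct {
    char name[10];               /* assumed field */
    int age;                     /* assumed field */
    float salary;                /* assumed field */
    date dob;                    /* embedded structure */
    sex_type sex_info;
} human_being;

/* Build a record the way the statements above fill in person2. */
human_being make_person2(void)
{
    human_being p;
    memset(&p, 0, sizeof p);
    strcpy(p.name, "person2");   /* 7 characters plus '\0' fit in name[10] */
    p.dob.month = 2;
    p.dob.day = 11;
    p.dob.year = 1994;
    p.sex_info.sex = female;
    p.sex_info.u.children = 4;
    return p;
}

/* Read the union only through the member that the tag says is active. */
int children_of(const human_being *p)
{
    assert(p->sex_info.sex == female);
    return p->sex_info.u.children;
}
```

Because children and beard share the same storage, the sex tag is what records which of the two was last stored.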
 2.2.3 Internal implementation of structures
◦ The fields of a structure are stored in memory
at increasing address
locations in the order specified in the structure
definition.
◦ Holes or padding may actually occur
 within a structure to permit two consecutive components
to be properly aligned within memory.
◦ The size of an object of a struct type is the storage for all of
its components plus any padding; the size of a union is
the amount of storage necessary to represent its
largest component, including any padding that
may be required.
 2.2.4 Self-Referential Structures
◦ A self-referential structure is one in which one or more
components is a pointer to the structure's own type.
◦ Construct a list with three nodes:
typedef struct list {
    char data;
    struct list *link;
} list;
list item1, item2, item3;
item1.data = 'a';
item2.data = 'b';
item3.data = 'c';
item1.link = item2.link = item3.link = NULL;
item1.link = &item2;
item2.link = &item3;
◦ Result: a -> b -> c
◦ malloc: obtain a node (memory)
◦ free: release memory
 Ordered or Linear List Examples
◦ ordered (linear) list: (item1, item2, item3, …,
itemn)
 (Sunday, Monday, Tuesday, Wednesday, Thursday, Friday,
Saturday)
 (Ace, 2, 3, 4, 5, 6, 7, 8, 9, 10, Jack, Queen, King)
 (basement, lobby, mezzanine, first, second)
 (1941, 1942, 1943, 1944, 1945)
 (a1, a2, a3, …, an-1, an)
 Operations on Ordered List
◦ Finding the length, n , of the list.
◦ Reading the items from left to right (or right to
left).
◦ Retrieving the i’th element.
◦ Storing a new value into the i’th position.
◦ Inserting a new element at the position i , causing
elements numbered i, i+1, …, n to become
numbered i+1, i+2, …, n+1
◦ Deleting the element at position i , causing
elements numbered i+1, …, n to become
numbered i, i+1, …, n-1
 Implementation
◦ sequential mapping (1)~(4)
◦ non-sequential mapping (5)~(6)
 Polynomial examples:
◦ Two example polynomials are:
 $A(x) = 3x^{20} + 2x^5 + 4$ and $B(x) = x^4 + 10x^3 + 3x^2 + 1$
◦ Assume that we have two polynomials,
$A(x) = \sum a_i x^i$ and $B(x) = \sum b_i x^i$, where x is the variable,
$a_i$ is the coefficient, and i is the exponent, then:
 $A(x) + B(x) = \sum (a_i + b_i) x^i$
 $A(x) \cdot B(x) = \sum_i \left( a_i x^i \cdot \sum_j b_j x^j \right)$
 Similarly, we can define subtraction and division on
polynomials, as well as many other operations.
 An ADT definition
of a polynomial
 There are two ways to create the type
polynomial in C
 Representation I
◦ #define MAX_degree 101 /* MAX degree of polynomial + 1 */
typedef struct {
    int degree;
    float coef[MAX_degree];
} polynomial;
◦ Drawback: the first representation may waste space.
Polynomial Addition
◦ /* d = a + b, where a, b, and d are polynomials */
d = Zero();
while (!IsZero(a) && !IsZero(b)) do {
    switch COMPARE(Lead_Exp(a), Lead_Exp(b)) {
        case -1:
            d = Attach(d, Coef(b, Lead_Exp(b)), Lead_Exp(b));
            b = Remove(b, Lead_Exp(b));
            break;
        case 0:
            sum = Coef(a, Lead_Exp(a)) + Coef(b, Lead_Exp(b));
            if (sum) {
                d = Attach(d, sum, Lead_Exp(a));
            }
            a = Remove(a, Lead_Exp(a));
            b = Remove(b, Lead_Exp(b));
            break;
        case 1:
            d = Attach(d, Coef(a, Lead_Exp(a)), Lead_Exp(a));
            a = Remove(a, Lead_Exp(a));
    }
}
insert any remaining terms of a or b into d
*Program 2.4: Initial version of padd function (p.62)
◦ Advantage: easy implementation
◦ Disadvantage: wastes space when sparse



 Representation II
◦ #define MAX_TERMS 100 /* size of terms array */
typedef struct {
    float coef;
    int expon;
} polynomial;
polynomial terms[MAX_TERMS];
int avail = 0;
 Use one global array to store all polynomials
◦ Figure 2.2 shows how these polynomials are stored in
the array terms.
specification                      representation: poly <start, finish>
A(x) = 2x^1000 + 1                 A <0,1>
B(x) = x^4 + 10x^3 + 3x^2 + 1      B <2,5>
◦ Storage requirements: start, finish, 2*(finish-start+1)
◦ Non-sparse: twice as much as Representation I when all the items are nonzero
 We would now like to
write a C function that adds two polynomials, A and B,
represented as above to obtain D = A + B.
◦ To produce D(x), padd (Program 2.5) adds A(x) and
B(x) term by term.
◦ Analysis: O(n+m), where n (m) is the number
of nonzero terms in A (B).
◦ Problem: compaction is required
when polynomials are no longer needed
(data movement takes time).
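Program 2.5 itself is not included in this text. The following is a sketch of term-by-term addition over the shared terms array of Representation II, written under the assumption that each polynomial is stored in decreasing exponent order; the helper attach and the exact signature of padd are illustrative and may differ from the original program.

```c
#include <assert.h>

#define MAX_TERMS 100

typedef struct {
    float coef;
    int expon;
} polynomial;

static polynomial terms[MAX_TERMS];
static int avail = 0;          /* next free slot in terms */

/* Append one term to the shared array; returns 0 if the array is full. */
static int attach(float coef, int expon)
{
    if (avail >= MAX_TERMS)
        return 0;
    terms[avail].coef = coef;
    terms[avail].expon = expon;
    avail++;
    return 1;
}

/* Add A = terms[starta..finisha] and B = terms[startb..finishb].
   The sum D is appended at terms[*startd..*finishd].
   Returns 1 on success, 0 if the terms array overflowed. */
int padd(int starta, int finisha, int startb, int finishb,
         int *startd, int *finishd)
{
    int ok = 1;
    assert(startd != 0 && finishd != 0);
    *startd = avail;
    while (ok && starta <= finisha && startb <= finishb) {
        if (terms[starta].expon < terms[startb].expon) {
            ok = attach(terms[startb].coef, terms[startb].expon);
            startb++;
        } else if (terms[starta].expon > terms[startb].expon) {
            ok = attach(terms[starta].coef, terms[starta].expon);
            starta++;
        } else {
            float sum = terms[starta].coef + terms[startb].coef;
            if (sum != 0.0f)               /* drop terms that cancel */
                ok = attach(sum, terms[starta].expon);
            starta++;
            startb++;
        }
    }
    /* copy whatever remains of A or B */
    for (; ok && starta <= finisha; starta++)
        ok = attach(terms[starta].coef, terms[starta].expon);
    for (; ok && startb <= finishb; startb++)
        ok = attach(terms[startb].coef, terms[startb].expon);
    *finishd = avail - 1;
    return ok;
}
```

Each loop iteration consumes at least one term of A or B, which is where the O(n+m) bound comes from.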
 2.4.1 Introduction
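◦ In mathematics, a matrix contains m rows and n
columns of elements; we write m×n to designate a
matrix with m rows and n columns.
[Figure 2.3: two example matrices: (a) a 5×3 matrix with 15 of 15 elements nonzero and (b) a 6×6 sparse matrix with 8 of 36 elements nonzero. Which data structure suits a sparse matrix?]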
 The standard representation of a matrix is a
two dimensional array defined as
a[MAX_ROWS][MAX_COLS].
◦ We can locate quickly any element by writing a[i ][ j ]
 Sparse matrix wastes space
◦ We must consider alternate forms of representation.
◦ Our representation of sparse matrices should store
only nonzero elements.
◦ Each element is characterized by <row, col, value>.
 Structure 2.3
contains our specification of the matrix ADT.
◦ A minimal set of operations includes matrix
creation, addition, multiplication, and transpose.
 We implement the Create operation as below:
 Figure 2.4(a) shows how the sparse matrix of
Figure 2.3(b) is represented in the array a.
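◦ Represented by a two-dimensional array.
◦ Each element is characterized by <row, col, value>.
◦ a[0] holds the number of rows, the number of columns and the number of nonzero terms.
◦ The triples are stored with row and column in ascending order.
[Figure 2.4: (a) the sparse matrix stored as triples and (b) its transpose]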
 2.4.2 Transpose a Matrix
◦ For each row i
 take element <i, j, value> and store it in element <j, i, value>
of the transpose.
 difficulty: where to put <j, i, value>
(0, 0, 15) ====> (0, 0, 15)
(0, 3, 22) ====> (3, 0, 22)
(0, 5, -15) ====> (5, 0, -15)
(1, 1, 11) ====> (1, 1, 11)
Move elements down very often.
◦ For all elements in column j,
place element <i, j, value> in element <j, i, value>
 This algorithm is incorporated in transpose
(Program 2.7).

◦ The algorithm scans the array “columns” times, and the array has “elements” elements.
==> O(columns*elements)
 Discussion: compared with 2-D array
representation
◦ O(columns*elements) vs. O(columns*rows)
◦ elements --> columns * rows when non-sparse,
giving O(columns^2 * rows)
 Problem: Scan the array “columns” times.
◦ In fact, we can transpose a matrix represented as a
sequence of triples in O(columns + elements) time.
 Solution:
◦ First, determine the number of elements
in each column of the original matrix.
◦ Second, determine the starting position of each
row in the transpose matrix.
 Compared with 2-D array representation:
◦ O(columns+elements) vs. O(columns*rows)
◦ elements --> columns * rows when non-sparse, giving O(columns*rows)
 Cost: additional row_terms and starting_pos arrays
are required.
◦ Let the two arrays row_terms and starting_pos be shared.
 After the execution of the third for loop, the
values of row_terms and starting_pos are:
               [0] [1] [2] [3] [4] [5]
row_terms    =  2   1   2   2   0   1
starting_pos =  1   3   4   6   8   8
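The two-step solution can be sketched as below. This is an illustration rather than the original program: the term type, the bound MAX_COL and the function name fast_transpose are assumptions, and a[0] is taken to hold the number of rows, columns and nonzero terms as in the triple representation above.

```c
#include <assert.h>

#define MAX_COL 50   /* assumed upper bound on the number of columns */

typedef struct {
    int row;
    int col;
    int value;
} term;

/* a[0] holds {rows, cols, number of nonzero terms}; a[1..] are the triples
   in row major order. b receives the transpose in the same layout. */
void fast_transpose(const term a[], term b[])
{
    int row_terms[MAX_COL], starting_pos[MAX_COL];
    int num_cols = a[0].col, num_terms = a[0].value;

    assert(num_cols <= MAX_COL);
    b[0].row = num_cols;
    b[0].col = a[0].row;
    b[0].value = num_terms;
    if (num_terms == 0)
        return;

    for (int i = 0; i < num_cols; i++)
        row_terms[i] = 0;
    for (int i = 1; i <= num_terms; i++)       /* count terms per column */
        row_terms[a[i].col]++;
    starting_pos[0] = 1;                       /* first triple goes to b[1] */
    for (int i = 1; i < num_cols; i++)
        starting_pos[i] = starting_pos[i - 1] + row_terms[i - 1];
    for (int i = 1; i <= num_terms; i++) {     /* place each triple directly */
        int j = starting_pos[a[i].col]++;
        b[j].row = a[i].col;
        b[j].col = a[i].row;
        b[j].value = a[i].value;
    }
}
```

The loops run over columns, elements, columns and elements in turn, giving the O(columns + elements) bound. For a 6×6 matrix whose nonzero terms fall in columns 0, 3, 5, 1, 2, 3, 0, 2 (in row major order), the intermediate arrays take exactly the values shown in the table above.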
 2.4.3 Matrix multiplication
◦ Definition:
Given A and B where A is m×n and B is n×p, the
product matrix D has dimension m×p. Its <i, j>
element is
$$d_{ij} = \sum_{k=0}^{n-1} a_{ik} b_{kj}$$
for $0 \le i < m$ and $0 \le j < p$.
◦ Example:
 Sparse Matrix Multiplication
◦ Definition: [D]m*p=[A]m*n* [B]n*p
◦ Procedure: Fix a row of A and find all elements
in column j of B for j=0, 1, …, p-1.
◦ Alternative 1.
Scan all of B to find all elements in j.
◦ Alternative 2.
Compute the transpose of B.
(Put all column elements consecutively)
 Once we have located the elements of row i of A and
column j of B we just do a merge operation similar to that
used in the polynomial addition of 2.2
 General case:
dij=ai0*b0j+ai1*b1j+…+ai(n-1)*b(n-1)j
◦ Array A is grouped by i, and after transpose,
array B is also grouped by j

[Figure: row i of A holds the terms a, b, c and column j of B holds the terms d, e, f, g]
At most the following entries are generated:
ad, ae, af, ag, bd, be, bf, bg, cd, ce, cf, cg
An Example
A =  1  0  2     B^T =  3 -1  0     B =  3  0  2
    -1  4  6            0  0  0         -1  0  0
                        2  0  5          0  0  5

       row col value          row col value         row col value
a[0]    2   3    5     bt[0]   3   3    4     b[0]   3   3    4
a[1]    0   0    1     bt[1]   0   0    3     b[1]   0   0    3
a[2]    0   2    2     bt[2]   0   1   -1     b[2]   0   2    2
a[3]    1   0   -1     bt[3]   2   0    2     b[3]   1   0   -1
a[4]    1   1    4     bt[4]   2   2    5     b[4]   2   2    5
a[5]    1   2    6
 The programs 2.9 and 2.10 can obtain the product
matrix D = A × B.
 Analyzing the algorithm
◦ cols_b * termsrow_1 + totalb +
cols_b * termsrow_2 + totalb +
… +
cols_b * termsrow_p + totalb
= cols_b * (termsrow_1 + termsrow_2 + … + termsrow_p) + rows_a * totalb
= cols_b * totala + rows_a * totalb
where termsrow_k is the number of terms in row k of A and p = rows_a.

O(cols_b * totala + rows_a * totalb)


 Compared with matrix multiplication using
array
◦ for (i = 0; i < rows_a; i++)
    for (j = 0; j < cols_b; j++) {
        sum = 0;
        for (k = 0; k < cols_a; k++)
            sum += a[i][k] * b[k][j];
        d[i][j] = sum;
    }
◦ O(rows_a * cols_a * cols_b) vs.
O(cols_b * total_a + rows_a * total_b)
◦ optimal case:
total_a < rows_a * cols_a and total_b < cols_a * cols_b
◦ worst case:
total_a --> rows_a * cols_a, or
total_b --> cols_a * cols_b
 The internal representation of
multidimensional arrays requires more
complex addressing formula.
◦ If an array is declared a[upper_0][upper_1]…[upper_{n-1}],
then it is easy to see that the number of
elements in the array is:
$$\prod_{i=0}^{n-1} upper_i$$
where $\prod$ is the product of the upper_i's.


◦ Example:
 If we declare a as a[10][10][10], then we require
10*10*10 = 1000 units of storage to hold the array.
 Represent multidimensional arrays:
row major order and column major order.
◦ Row major order stores multidimensional arrays by
rows.
 A[upper0][upper1] as
upper0 rows, row_0, row_1, …, row_{upper0-1},
each row containing upper1 elements.
 Row major order: A[i][j] : α + i*upper1 + j
 Column major order: A[i][j] : α + j*upper0 + i

[Figure: with u0 = upper0 and u1 = upper1, in row major order row_0 starts at address α, row_1 at α + u1 and row_{u0-1} at α + (u0-1)*u1; in column major order col_0 starts at α, col_1 at α + u0 and col_{u1-1} at α + (u1-1)*u0]
 To represent a three-dimensional array,
A[upper0][upper1][upper2], we interpret the
array as upper0 two-dimensional arrays of
dimension upper1×upper2.
◦ To locate a[i][j][k], we first obtain α +
i*upper1*upper2 as the address of a[i][0][0]
because there are i two-dimensional arrays of
size upper1*upper2 preceding this element.
◦ α + i*upper1*upper2 + j*upper2 + k
is the address of a[i][j][k].
 Generalizing on the preceding discussion, we can
obtain the addressing formula for any element
A[i_0][i_1]…[i_{n-1}] in an n-dimensional array declared
as: A[upper_0][upper_1]…[upper_{n-1}]
◦ The address for A[i_0][i_1]…[i_{n-1}] is:
$$\alpha + \sum_{j=0}^{n-1} i_j a_j, \quad \text{where } a_j = \prod_{k=j+1}^{n-1} upper_k \text{ for } 0 \le j < n-1 \text{ and } a_{n-1} = 1$$
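The row major address computation generalizes the two- and three-dimensional cases above: each subscript is multiplied by the product of all the upper bounds that follow it. A small C sketch, with the hypothetical name row_major_offset, computes the offset in elements from the base address α using Horner's rule so that the products never have to be stored.

```c
#include <assert.h>
#include <stddef.h>

/* Offset (in elements) of A[idx[0]][idx[1]]...[idx[n-1]] from the base
   address of an array declared A[upper[0]][upper[1]]...[upper[n-1]] and
   stored in row major order. */
size_t row_major_offset(const size_t upper[], const size_t idx[], size_t n)
{
    size_t offset = 0;
    for (size_t d = 0; d < n; d++) {
        assert(idx[d] < upper[d]);       /* subscript must be in range */
        offset = offset * upper[d] + idx[d];
    }
    return offset;
}
```

For an array declared a[10][10][10], the element a[1][2][3] is at offset 1*10*10 + 2*10 + 3 = 123 elements from α; multiplying by the element size gives the byte address.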
2.6.1 Introduction
 The String: component elements are
characters.
◦ A string has the form S = s_0, …, s_{n-1}, where
s_i are characters taken from the character set of
the programming language.
◦ If n = 0, then S is an empty or null string.
◦ Operations in ADT 2.4, p. 81
 ADT String:
 In C, we represent strings as character
arrays terminated with the null character \0.
 Figure 2.8 shows how these strings would
be represented internally in memory.
 Now suppose we want to concatenate these
strings together to produce a new string:
◦ Two strings are joined together by strcat(s, t), which stores
the result in s. Although s has increased in length by five,
we have no additional space in s to store the extra five
characters. Our compiler handled this problem inelegantly:
it simply overwrote the memory to fit in the extra five
characters. Since we declared t immediately after s, this
meant that part of the word “house” disappeared.
In standard C, writing past the end of s in this way is undefined
behavior, so s must be declared large enough to hold the result.
 C string
functions
 Example 2.2[String insertion]:
◦ Assume that we have two strings, say string 1
and string 2, and that we want to insert string 2
into string 1 starting at the i th position of string
1. We begin with the declarations:
◦ In addition to creating the two strings, we also
have created a pointer for each string.
 Now suppose that the first string contains
“amobile” and the second contains “uto”.
◦ We want to insert “uto” starting at position 1 of
the first string, thereby producing the word
“automobile.”
 String insertion function:
◦ It should never be used in practice as it is
wasteful in its use of time and space.
 2.6.2 Pattern Matching:
◦ Assume that we have two strings, string and pat where pat
is a pattern to be searched for in string.
◦ If we have the following declarations:

◦ Then we use the following statements to determine if pat is
in string:
◦ If pat is not in string, this method has a computing time of
O(n*m) where n is the length of pat and m is the length of
string.
 We can improve on an exhaustive pattern
matching technique by quitting when
strlen(pat) is greater than the number of
remaining characters in the string.
 Example 2.3 [Simulation of nfind]
◦ Suppose pat = “aab” and string = “ababbaabaa”.
◦ Analysis of nfind:
The computing time for these strings is linear
in the length of the string, O(m), but the
worst case is still O(n*m).
 Ideally, we would like an algorithm that
works in
O(strlen(string)+strlen(pat)) time. This is
optimal for this problem as in the worst
case it is necessary to look at all characters
in the pattern and string at least once.
 Knuth, Morris, and Pratt have developed a
pattern matching algorithm that works in
this way and has linear complexity.
 Suppose pat = “a b c a b c a c a b”
 From the definition of the failure function, we arrive at
the following rule for pattern matching: if a partial match
is found such that $s_{i-j} \ldots s_{i-1} = p_0 p_1 \ldots p_{j-1}$ and $s_i \ne p_j$,
then matching may be resumed by comparing $s_i$ and $p_{f(j-1)+1}$
if $j \ne 0$. If $j = 0$, then we may continue by
comparing $s_{i+1}$ and $p_0$.
 This pattern matching rule translates into
function pmatch.
 Analysis of pmatch:
◦ The while loop is iterated until the end of either the
string or the pattern is reached. Since i is never
decreased, the lines that increase i cannot be executed
more than m = strlen(string) times. The resetting of j to
failure[j-1]+1 decreases the value of j, so it cannot be done
more often than j is incremented by the statement j++, as
otherwise j falls off the
pattern. Each time the statement j++ is executed, i is
also incremented. So j cannot be incremented more
than m times. Hence the complexity of function pmatch
is O(m) = O(strlen(string)).
◦ If we can compute the failure function in
O(strlen(pat)) time, then the entire pattern
matching process will have a computing time
proportional to the sum of the lengths of the
string and pattern. Fortunately, there is a fast
way to compute the failure function. This is based
upon the following restatement of the failure
function:
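A sketch of the failure function and of pmatch in C is given below. It assumes the usual definition, in which failure[j] is the largest k < j such that the first k+1 characters of the pattern equal the k+1 characters ending at position j, or -1 if there is none. The bound MAX_PATTERN and the exact signatures are assumptions and may differ from the original programs.

```c
#include <assert.h>
#include <string.h>

#define MAX_PATTERN 100   /* assumed maximum pattern length */

/* failure[j] is the largest k < j such that pat[0..k] equals
   pat[j-k..j], or -1 if no such k exists. */
void fail(const char *pat, int failure[])
{
    int n = (int)strlen(pat);
    if (n == 0)
        return;
    failure[0] = -1;
    for (int j = 1; j < n; j++) {
        int i = failure[j - 1];
        /* fall back through shorter prefixes until one can be extended */
        while (pat[j] != pat[i + 1] && i >= 0)
            i = failure[i];
        failure[j] = (pat[j] == pat[i + 1]) ? i + 1 : -1;
    }
}

/* Returns the index of the first occurrence of pat in string, or -1. */
int pmatch(const char *string, const char *pat)
{
    int failure[MAX_PATTERN];
    int i = 0, j = 0;
    int lens = (int)strlen(string), lenp = (int)strlen(pat);

    assert(lenp <= MAX_PATTERN);
    fail(pat, failure);
    while (i < lens && j < lenp) {
        if (string[i] == pat[j]) {
            i++;
            j++;
        } else if (j == 0) {
            i++;                       /* no partial match to reuse */
        } else {
            j = failure[j - 1] + 1;    /* resume after the longest border */
        }
    }
    return (j == lenp) ? i - lenp : -1;
}
```

For the pattern a b c a b c a c a b the computed failure values are -1 -1 -1 0 1 2 3 -1 0 1. Since i never decreases in either function, fail runs in O(strlen(pat)) and pmatch in O(strlen(string)).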
 Abstract Data Type as a design tool
 Is concerned only with the important concept or
model
 Is not concerned with implementation details.
 Stacks and queues are examples of ADTs.
 An array is not an ADT.
 Stack & Queue vs. Array
◦ Arrays are data storage structures, while stacks and
queues are specialized data structures used as
programmer’s tools.
 Stack - a container that allows push and pop
 Queue - a container that allows enqueue and
dequeue
 No concern for implementation details.
 In an array any item can be accessed, while in
these data structures access is restricted.
 They are more abstract than arrays.
 An array is not an ADT.
 Is a linked list an ADT?
 Is a binary tree an ADT?
 Is a hash table an ADT?
 What about a graph?
 A stack allows access to only the last item inserted.
 An item is inserted or removed from the stack
at one end, called the “top” of the stack.
 This mechanism is called Last-In-First-Out
(LIFO).

A Stack Applet example

 Placing a data item on the top is called
“pushing”, while removing an item from the
top is called “popping” it.
 push and pop are the primary stack
operations.
 Some of the applications are:
microprocessors, some older calculators, etc.
 First example: stack ADT and implementation
C:\Documents and Settings\box\My Documents\CS\CSC\220\ReaderPrograms\ReaderFiles\Chap04\Stack\stack.java
 push and pop operations are performed in
O(1) time.
 Reversed word
 What is it?
 ABC -> CBA
C:\Documents and Settings\box\My Documents\CS\CSC\220\ReaderPrograms\ReaderFiles\Chap04\Reverse\reverse.java
 BracketChecker (balancer)
 A syntax checker (compiler) that understands
a language containing any strings with
balanced brackets ‘{’, ‘[’, ‘(’ and ‘)’, ‘]’, ‘}’
◦ S -> Bl S1 Br
◦ S1 -> Bl string Br
◦ Bl -> ‘{’ | ‘[’ | ‘(’
◦ Br -> ‘)’ | ‘]’ | ‘}’
C:\Documents and Settings\box\My Documents\CS\CSC\220\ReaderPrograms\ReaderFiles\Chap04\Brackets\brackets.java
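The referenced brackets.java is not reproduced here. As a rough C sketch of the same idea, a stack of opening brackets is enough to check balance: push each opener, and on each closer pop and compare. The function name brackets_balanced and the depth limit MAX_DEPTH are assumptions made for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_DEPTH 256   /* assumed limit on nesting depth */

/* Returns true if every '{', '[' and '(' in text is closed by the matching
   bracket in the correct order. All other characters are ignored. */
bool brackets_balanced(const char *text)
{
    char stack[MAX_DEPTH];
    int top = -1;

    assert(text != NULL);
    for (; *text != '\0'; text++) {
        char ch = *text;
        if (ch == '{' || ch == '[' || ch == '(') {
            if (top == MAX_DEPTH - 1)
                return false;             /* nesting too deep for this sketch */
            stack[++top] = ch;            /* push the opener */
        } else if (ch == '}' || ch == ']' || ch == ')') {
            if (top < 0)
                return false;             /* closer with nothing open */
            char open = stack[top--];     /* pop the most recent opener */
            if ((ch == '}' && open != '{') ||
                (ch == ']' && open != '[') ||
                (ch == ')' && open != '('))
                return false;             /* mismatched pair */
        }
    }
    return top == -1;                     /* nothing may remain open */
}
```

For example, a{b[c(d)e]f} is accepted, while a{b(c]d}e is rejected at the ‘]’ because the most recent opener is ‘(’.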
 A queue is an ADT similar to a stack,
except that the first item to be inserted is the first
one to be removed.
 This mechanism is called First-In-First-Out (FIFO).
 Placing an item in a queue is called “insertion” or
“enqueue”, which is done at the end of the queue
called the “rear”.
 Removing an item from a queue is called “deletion”
or “dequeue”, which is done at the other end of the
queue called the “front”.
 Some of the applications are: printer queue,
keystroke queue, etc.
 When a new item is inserted at the rear, the
pointer to the rear moves upwards.
 Similarly, when an item is deleted from the
queue the front arrow moves downwards.
 After a few insert and delete operations the
rear might reach the end of the array, and no
more items can be inserted even though
items at the front of the queue have been
deleted and there is free space in the array.
 To solve this problem, queues implement
wrapping around. Such queues are called
circular queues.
 Both the front and the rear pointers wrap
around to the beginning of the array.
 A circular queue is also called a “ring buffer”.
 Items can be inserted and deleted from a queue
in O(1) time.
[Class diagram: QueueApp, Interface1 and Queue]
Queue
-maxSize : int
-queueArray [] : long
-front : int
-rear : int
-nItems : int
+Queue()
+insert() : void
+remove() : long
+peekFront() : long
+isEmpty() : bool
+isFull() : bool
+size() : int
 C:\Documents and Settings\box\My Documents\CS\CSC\220\ReaderPrograms\ReaderFiles\Chap04\Queue\queue.java
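The referenced queue.java is not included in this text. A minimal C sketch of a circular queue with the same fields as the class diagram (an item array, front, rear and an item count) is shown below; the C names and the small capacity QUEUE_MAX are assumptions chosen so that wrap-around is easy to observe.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define QUEUE_MAX 5   /* small capacity so wrap-around is easy to see */

struct queue {
    long items[QUEUE_MAX];
    int front;    /* index of the oldest item */
    int rear;     /* index of the newest item */
    int n_items;  /* current number of items */
};

void queue_init(struct queue *q)
{
    assert(q != NULL);
    q->front = 0;
    q->rear = -1;
    q->n_items = 0;
}

bool queue_is_empty(const struct queue *q) { return q->n_items == 0; }
bool queue_is_full(const struct queue *q)  { return q->n_items == QUEUE_MAX; }

/* Insert at the rear; the index wraps to 0 past the end of the array. */
bool queue_insert(struct queue *q, long value)
{
    if (queue_is_full(q))
        return false;
    q->rear = (q->rear + 1) % QUEUE_MAX;
    q->items[q->rear] = value;
    q->n_items++;
    return true;
}

/* Remove from the front; the index wraps the same way. */
bool queue_remove(struct queue *q, long *out)
{
    if (queue_is_empty(q))
        return false;
    *out = q->items[q->front];
    q->front = (q->front + 1) % QUEUE_MAX;
    q->n_items--;
    return true;
}
```

Keeping an explicit item count avoids the ambiguity between a full and an empty queue that arises when only front and rear are compared. Both operations touch a fixed number of fields, hence the O(1) time.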
 Normal queue (FIFO)
 Circular Queue (Normal Queue)
 Double-ended Queue (Deque)
 Priority Queue
 A deque is a double-ended queue.
 Items can be inserted and deleted at either
end.
 It is a more versatile data structure than a stack or
queue.
 E.g. policy-based applications (e.g. low priority items
go to the end, high priority items go to the front)
 In a case where you want to sort the queue
once in a while, what sorting algorithm would
you use?
 A priority queue is a more specialized data structure.
 It is similar to a queue, having a front and a rear.
 Items are removed from the front.
 Items are ordered by key value so that the
item with the lowest key (or highest) is always
at the front.
 Items are inserted in the proper position to
maintain the order.
 Let’s discuss complexity
[Class diagram: PriorityQApp, Interface1 and PriorityQ]
PriorityQ
-maxSize : int
-queueArray [] : long
-nItems : int
+Queue()
+insert() : void
+remove() : long
+peekMin() : long
+isEmpty() : bool
+isFull() : bool
 Used in multitasking operating systems.
 They are generally represented using the “heap”
data structure.
 With the ordered-array implementation above, insertion runs in O(n) time
and deletion in O(1) time; with a heap, both run in O(log n) time.
 C:\Documents and Settings\box\My Documents\CS\CSC\220\ReaderPrograms\ReaderFiles\Chap04\PriorityQ\priorityQ.java
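The O(n) insertion and O(1) deletion quoted above correspond to a priority queue kept in a sorted array. A C sketch under that assumption follows; the referenced priorityQ.java is not reproduced here, and the names and the capacity PQ_MAX are illustrative.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define PQ_MAX 100   /* assumed capacity */

struct priority_q {
    long items[PQ_MAX];   /* kept sorted in descending order */
    int n_items;          /* the minimum sits at items[n_items - 1] */
};

void pq_init(struct priority_q *pq)
{
    assert(pq != NULL);
    pq->n_items = 0;
}

/* O(n): shift smaller items up one slot to open the proper position. */
bool pq_insert(struct priority_q *pq, long value)
{
    int j;
    if (pq->n_items == PQ_MAX)
        return false;
    for (j = pq->n_items - 1; j >= 0 && pq->items[j] < value; j--)
        pq->items[j + 1] = pq->items[j];
    pq->items[j + 1] = value;
    pq->n_items++;
    return true;
}

/* O(1): the smallest key is always at the end of the array. */
bool pq_remove(struct priority_q *pq, long *out)
{
    if (pq->n_items == 0)
        return false;
    *out = pq->items[--pq->n_items];
    return true;
}
```

Inserting 30, 50, 10, 40, 20 and then removing five times yields 10, 20, 30, 40, 50. A heap trades the O(1) removal for O(log n) in exchange for O(log n) insertion.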
 2+3 • 23+
 2+4*5 • 245*+
 ((2 + 4) * 7) + 3* (9 – 5)) • 2 4 + 7 * 3 9 5 - * +

 Infix vs postfix
 Why do we want to do this
transformation?
 Read ch from input until empty
◦ If ch is arg , output = output + arg
◦ If ch is “(“, push ‘(‘;
◦ If ch is op and higher than top push ch
◦ If ch is “)” or end of input,
 output = output + pop() until empty or top is “(“
◦ Read next input
 C:\Documents and Settings\box\My
Documents\CS\CSC\220\ReaderPrograms\Re
aderFiles\Chap04\Postfix\postfix.java
 5 + 2 * 3 -> 5 2 3 * +
 Algorithm
◦ While input is not empty
◦ If ch is number , push (ch)
◦ Else
 Pop (a)
 Pop(b)
 Eval (ch, a, b)
 C:\Documents and Settings\box\My
Documents\CS\CSC\220\ReaderPrograms\Re
aderFiles\Chap04\Postfix\postfix.java
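A sketch of postfix evaluation in C is given below. It is not the referenced postfix.java: it assumes single-digit operands, the four operators + - * /, and integer arithmetic, and the name eval_postfix and the limit EVAL_STACK_MAX are made up for the example.

```c
#include <assert.h>
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>

#define EVAL_STACK_MAX 64   /* assumed limit on pending operands */

/* Evaluate a postfix expression made of single-digit operands and the
   operators + - * /. Spaces are ignored. Returns false on malformed
   input, stack overflow or division by zero. */
bool eval_postfix(const char *expr, long *result)
{
    long stack[EVAL_STACK_MAX];
    int top = -1;

    assert(expr != NULL && result != NULL);
    for (; *expr != '\0'; expr++) {
        char ch = *expr;
        if (ch == ' ')
            continue;
        if (isdigit((unsigned char)ch)) {
            if (top == EVAL_STACK_MAX - 1)
                return false;
            stack[++top] = ch - '0';     /* push the operand */
        } else {
            if (top < 1)
                return false;            /* an operator needs two operands */
            long b = stack[top--];       /* right operand was pushed last */
            long a = stack[top--];
            switch (ch) {
            case '+': a += b; break;
            case '-': a -= b; break;
            case '*': a *= b; break;
            case '/':
                if (b == 0)
                    return false;
                a /= b;
                break;
            default:
                return false;            /* unknown symbol */
            }
            stack[++top] = a;            /* push the result back */
        }
    }
    if (top != 0)
        return false;                    /* leftover or missing operands */
    *result = stack[0];
    return true;
}
```

Note the pop order: the right operand comes off the stack first, which matters for - and /. Evaluating 5 2 3 * + gives 11.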
 Recursion is:
◦ A problem-solving approach, that can ...
◦ Generate simple solutions to ...
◦ Certain kinds of problems that ...
◦ Would be difficult to solve in other ways
 Recursion splits a problem:
◦ Into one or more simpler versions of itself

Chapter 7: Recursion
23
Strategy for searching a sorted array:
1. if the array is empty
2. return -1 as the search result (not
present)
3. else if the middle element == target
4. return subscript of the middle
element
5. else if target < middle element
6. recursively search elements before
middle
7. else
8. recursively search elements after the middle
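The search strategy above can be sketched as a recursive function. This is an illustrative version, assuming a sorted `std::vector<int>` and a half-open index range `[first, last)`; the function names are chosen for the example.

```cpp
#include <vector>

// Searches sorted[first..last) for target; returns its index or -1.
int binarySearch(const std::vector<int>& sorted, int target,
                 int first, int last) {
    if (first >= last)
        return -1;                               // empty range: not present
    int middle = first + (last - first) / 2;
    if (sorted[middle] == target)
        return middle;
    if (target < sorted[middle])
        return binarySearch(sorted, target, first, middle);    // before middle
    return binarySearch(sorted, target, middle + 1, last);     // after middle
}

// Convenience wrapper that searches the whole array.
int binarySearch(const std::vector<int>& sorted, int target) {
    return binarySearch(sorted, target, 0, static_cast<int>(sorted.size()));
}
```

Each call halves the range, so the recursion depth is O(log n).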
1. if problem is “small enough”
2. solve it directly
3. else
4. break into one or more smaller
subproblems
5. solve each subproblem recursively
6. combine results into solution to whole
problem

 At least one “small” case that you can solve
directly
 A way of breaking a larger problem down into:
◦ One or more smaller subproblems
◦ Each of the same kind as the original
 A way of combining subproblem results into an
overall solution to the larger problem

 Identify the base case(s) (for direct solution)
 Devise a problem splitting strategy
◦ Subproblems must be smaller
◦ Subproblems must work towards a base case
 Devise a solution combining strategy

Recursive algorithm for finding length of a string:
1. if string is empty (no characters)
2. return 0 (base case)
3. else (recursive case)
4. compute length of string without first character
5. return 1 + that length

Note: Not the best technique for this problem; it illustrates the approach.
Recursive algorithm for finding length of a string:
public static int length (String str) {
    if (str == null || str.equals(""))
        return 0;
    else
        return length(str.substring(1)) + 1;
}
Recursive algorithm for printing a string:
public static void printChars (String str) {
    if (str == null || str.equals("")) {
        return;
    } else {
        System.out.println(str.charAt(0));
        printChars(str.substring(1));
    }
}
Recursive algorithm for printing a string?
public static void printChars2 (String str) {
    if (str == null || str.equals("")) {
        return;
    } else {
        printChars2(str.substring(1));
        System.out.println(str.charAt(0));
    }
}
What does this do?
public static int mystery (int n) {
if (n == 0)
return 0;
else
return n + mystery(n-1);
}

Recall Proof by Induction:
1. Prove the theorem for the base case(s): n=0
2. Show that:
 If the theorem is assumed true for n,
 Then it must be true for n+1

Result: Theorem true for all n ≥ 0.

Recursive proof is similar to induction:
1. Show base case recognized and solved correctly
2. Show that
 If all smaller problems are solved correctly,
 Then original problem is also solved
correctly

3. Show that each recursive case makes progress towards the base case, so the recursion terminates properly
Trace of length("ace"):
length("ace") = 1 + length("ce") returns 3 (overall result)
length("ce") = 1 + length("e") returns 2
length("e") = 1 + length("") returns 1
length("") returns 0 (base case)
 Mathematicians often use recursive definitions
 These lead very naturally to recursive
algorithms
 Examples include:
◦ Factorial
◦ Powers
◦ Greatest common divisor

 0! = 1
 n! = n x (n-1)!

 If a recursive function never reaches its base case,


a stack overflow error occurs

Chapter 7: Recursion
37
public static int factorial (int n) {
if (n == 0) // or: throw exc. if < 0
return 1;
else
return n * factorial(n-1);
}

 x^0 = 1
 x^n = x × x^(n-1)

public static double power (double x, int n) {
    if (n <= 0) // or: throw exception if n < 0
        return 1;
    else
        return x * power(x, n-1);
}
Definition of gcd(m, n), for integers m > n > 0:
 gcd(m, n) = n, if n divides m evenly
 gcd(m, n) = gcd(n, m % n), otherwise

public static int gcd (int m, int n) {
    if (m < n)
        return gcd(n, m);
    else if (m % n == 0) // could check n>0
        return n;
    else
        return gcd(n, m % n);
}
Definition of fib_i, for integer i > 0:
 fib_1 = 1
 fib_2 = 1
 fib_n = fib_(n-1) + fib_(n-2), for n > 2
public static int fib (int n) {
if (n <= 2)
return 1;
else
return fib(n-1) + fib(n-2);
}

This is straightforward, but an inefficient recursion ...

# calls apparently O(2^n): big!
public static int fibStart (int n) {
    return fibo(1, 0, n);
}

private static int fibo (int curr, int prev, int n) {
    if (n <= 1)
        return curr;
    else
        return fibo(curr+prev, curr, n-1);
}
Performance is O(n)

 Towers of Hanoi
 Counting grid squares in a blob
 Backtracking, as in maze search

Goal: Move entire tower to another peg
Rules:
1. You can move only the top disk from a peg.
2. You can only put a smaller disk on a larger disk
(or on an empty peg)
move(n, src, dst, tmp) =
if n == 1: move disk 1 from src to dst
otherwise:
move(n-1, src, tmp, dst)
move disk n from src to dst
move(n-1, tmp, dst, src)

public class TowersOfHanoi {
    public static String showMoves(int n,
            char src, char dst, char tmp) {
        if (n == 1)
            return "Move disk 1 from " + src +
                   " to " + dst + "\n";
        else return
            showMoves(n-1, src, tmp, dst) +
            "Move disk " + n + " from " + src +
            " to " + dst + "\n" +
            showMoves(n-1, tmp, dst, src);
    }
}
How big will the string be for a tower of size n?
We’ll just count lines; call this L(n).
 For n = 1, one line: L(1) = 1
 For n > 1, one line plus twice L for next smaller size:
L(n+1) = 2 × L(n) + 1

Solving this gives L(n) = 2^n - 1 = O(2^n)

So, don’t try this for very large n: you will do a lot of string concatenation and garbage collection, and then run out of heap space and terminate.
 Linked lists
◦ Abstract data type (ADT)
 Basic operations of linked lists
◦ Insert, find, delete, print, etc.
 Variations of linked lists
◦ Circular linked lists
◦ Doubly linked lists
[Figure: Head → A → B → C → NULL]
 A linked list is a series of connected nodes
 Each node contains at least
◦ A piece of data (any type)
◦ Pointer to the next node in the list
 Head: pointer to the first node
 The last node points to NULL

[Figure: a node, with a data field and a pointer field]
 We use two classes: Node and List
 Declare Node class for the nodes
◦ data: double-type data in this example
◦ next: a pointer to the next node in the list

class Node {
public:
double data; // data
Node* next; // pointer to next
};
 Declare List, which contains
◦ head: a pointer to the first node in the list.
Since the list is empty initially, head is set to NULL
◦ Operations on List
class List {
public:
List(void) { head = NULL; } // constructor
~List(void); // destructor

bool IsEmpty() { return head == NULL; }


Node* InsertNode(int index, double x);
int FindNode(double x);
int DeleteNode(double x);
void DisplayList(void);
private:
Node* head;
};
 Operations of List
◦ IsEmpty: determine whether or not the list is empty
◦ InsertNode: insert a new node at a particular
position
◦ FindNode: find a node with a given value
◦ DeleteNode: delete a node with a given value
◦ DisplayList: print all the nodes in the list
 Node* InsertNode(int index, double x)
◦ Insert a node with data equal to x after the index’th
elements. (i.e., when index = 0, insert the node as the first element;
when index = 1, insert the node after the first element, and so on)
◦ If the insertion is successful, return the inserted node.
Otherwise, return NULL.
(If index is < 0 or > length of the list, the insertion will fail.)
 Steps
1. Locate the index’th element
2. Allocate memory for the new node
3. Point the new node to its successor
4. Point the new node’s predecessor to the new node
 Possible cases of InsertNode
1. Insert into an empty list
2. Insert in front
3. Insert at back
4. Insert in middle
 But, in fact, only need to handle two cases
◦ Insert as the first node (Case 1 and Case 2)
◦ Insert in the middle or at the end of the list (Case 3
and Case 4)
Node* List::InsertNode(int index, double x) {
    if (index < 0) return NULL;

    // Try to locate the index'th node. If it doesn't exist, return NULL.
    int currIndex = 1;
    Node* currNode = head;
    while (currNode && index > currIndex) {
        currNode = currNode->next;
        currIndex++;
    }
    if (index > 0 && currNode == NULL) return NULL;

    // Create a new node
    Node* newNode = new Node;
    newNode->data = x;
    if (index == 0) {
        // Insert as first element: the new node becomes the head
        newNode->next = head;
        head = newNode;
    }
    else {
        // Insert after currNode
        newNode->next = currNode->next;
        currNode->next = newNode;
    }
    return newNode;
}
 int FindNode(double x)
◦ Search for a node with the value equal to x in the list.
◦ If such a node is found, return its position. Otherwise,
return 0.

int List::FindNode(double x) {
Node* currNode = head;
int currIndex = 1;
while (currNode && currNode->data != x) {
currNode = currNode->next;
currIndex++;
}
if (currNode) return currIndex;
return 0;
}
 int DeleteNode(double x)
◦ Delete a node with the value equal to x from the list.
◦ If such a node is found, return its position. Otherwise,
return 0.
 Steps
◦ Find the desirable node (similar to FindNode)
◦ Release the memory occupied by the found node
◦ Set the pointer of the predecessor of the found node to
the successor of the found node
 Like InsertNode, there are two special cases
◦ Delete first node
◦ Delete the node in middle or at the end of the list
int List::DeleteNode(double x) {
    // Try to find the node with its value equal to x
    Node* prevNode = NULL;
    Node* currNode = head;
    int currIndex = 1;
    while (currNode && currNode->data != x) {
        prevNode = currNode;
        currNode = currNode->next;
        currIndex++;
    }
    if (currNode) {
        if (prevNode) {
            // Node in the middle or at the end: link prevNode past currNode
            prevNode->next = currNode->next;
            delete currNode;
        }
        else {
            // First node: head moves to the second node
            head = currNode->next;
            delete currNode;
        }
        return currIndex;
    }
    return 0;
}
 void DisplayList(void)
◦ Print the data of all the elements
◦ Print the number of the nodes in the list

void List::DisplayList()
{
int num = 0;
Node* currNode = head;
while (currNode != NULL){
cout << currNode->data << endl;
currNode = currNode->next;
num++;
}
cout << "Number of nodes in the list: " << num << endl;
}
 ~List(void)
◦ Use the destructor to release all the memory used by the
list.
◦ Step through the list and delete each node one by one.
List::~List(void) {
Node* currNode = head, *nextNode = NULL;
while (currNode != NULL)
{
nextNode = currNode->next;
// destroy the current node
delete currNode;
currNode = nextNode;
}
}
int main(void)
{
    List list;
    list.InsertNode(0, 7.0);  // successful
    list.InsertNode(1, 5.0);  // successful
    list.InsertNode(-1, 5.0); // unsuccessful
    list.InsertNode(0, 6.0);  // successful
    list.InsertNode(8, 4.0);  // unsuccessful
    // print all the elements
    list.DisplayList();
    if(list.FindNode(5.0) > 0) cout << "5.0 found" << endl;
    else cout << "5.0 not found" << endl;
    if(list.FindNode(4.5) > 0) cout << "4.5 found" << endl;
    else cout << "4.5 not found" << endl;
    list.DeleteNode(7.0);
    list.DisplayList();
    return 0;
}

Result:
6
7
5
Number of nodes in the list: 3
5.0 found
4.5 not found
6
5
Number of nodes in the list: 2
 Circular linked lists
◦ The last node points to the first node of the list

A B C

Head
◦ How do we know when we have finished
traversing the list? (Tip: check if the pointer of
the current node is equal to the head.)
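The tip about stopping at the head can be sketched like this. It is an illustrative example, not part of the List class above: `CNode` and `traverseCircular` are hypothetical names, and the sketch uses `nullptr` (C++11) where the slides use NULL.

```cpp
#include <vector>

// Node of a circular singly linked list (hypothetical type for this sketch).
struct CNode {
    double data;
    CNode* next;
};

// Visits every node exactly once and returns the values in list order.
// Traversal ends when the next pointer leads back to the head.
std::vector<double> traverseCircular(const CNode* head) {
    std::vector<double> values;
    if (head == nullptr)
        return values;                 // empty list
    const CNode* curr = head;
    do {
        values.push_back(curr->data);
        curr = curr->next;
    } while (curr != head);            // back at the head: finished
    return values;
}
```

A `do ... while` loop is used so that the head itself is visited before the stop condition is checked.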
 Doubly linked lists
◦ Each node points not only to its successor but also to its predecessor
◦ There are two NULL pointers: the predecessor pointer of the first node and the successor pointer of the last node
◦ Advantage: given a node, it is easy to visit its
predecessor. Convenient to traverse lists
backwards

[Figure: NULL ← A ⇄ B ⇄ C → NULL, with Head pointing to A]
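A doubly linked node and a backward traversal might look like the sketch below. `DNode`, `insertAfter`, and `traverseBackward` are hypothetical names introduced for illustration; they are not part of the List class shown earlier.

```cpp
#include <vector>

// Node of a doubly linked list (hypothetical type for this sketch).
struct DNode {
    double data;
    DNode* prev;   // predecessor, nullptr for the first node
    DNode* next;   // successor, nullptr for the last node
};

// Links newNode into the list immediately after pos.
void insertAfter(DNode* pos, DNode* newNode) {
    newNode->prev = pos;
    newNode->next = pos->next;
    if (pos->next != nullptr)
        pos->next->prev = newNode;   // old successor now points back to newNode
    pos->next = newNode;
}

// Walks from the last node to the first using the prev pointers.
std::vector<double> traverseBackward(const DNode* tail) {
    std::vector<double> values;
    for (const DNode* curr = tail; curr != nullptr; curr = curr->prev)
        values.push_back(curr->data);
    return values;
}
```

Compared with the singly linked InsertNode, two extra pointer updates are needed per insertion, which is the cost of the backward links.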
 Linked lists are more complex to code and
manage than arrays, but they have some distinct
advantages.
◦ Dynamic: a linked list can easily grow and shrink in size.
 We don’t need to know how many nodes will be in the list.
They are created in memory as needed.
 In contrast, the size of a C++ array is fixed at compilation
time.
◦ Easy and fast insertions and deletions
 To insert or delete an element in an array, we need to shift the other elements to make room for the new element or to close the gap left by the deleted element.
 With a linked list, no need to move other nodes. Only need
to reset some pointers.
head 48 17 142 //

 Follow the previous steps and we get

Step 1 Step 2

Step 3

head 93
 Insertion at the top of the list
 Insertion at the end of the list
 Insertion in the middle of the list
Steps:
 Create a Node
 Set the node data Values
 Connect the pointers
head 48 17 142 //

 Follow the previous steps and we get

Step 1 Step 2

Step 3
 Insertion at the top of the list
 Insertion at the end of the list
 Insertion in the middle of the list
Steps:
 Create a Node
 Set the node data Values
 Break pointer connection
 Re-connect the pointers
Step 1 Step 2

Step 3

Step 4
 Introduction
 Insertion Description
 Deletion Description
 Basic Node Implementation
 Conclusion
 Deleting from the top of the list
 Deleting from the end of the list
 Deleting from the middle of the list
Steps
 Break the pointer connection
 Re-connect the nodes
 Delete the node
[Figure: head → 6 → 4 → 17 → 42 becomes head → 4 → 17 → 42]
 Deleting from the top of the list
 Deleting from the end of the list
 Deleting from the middle of the list
Steps
 Break the pointer connection
 Set previous node pointer to NULL
 Delete the node
[Figure: head → 6 → 4 → 17 → 42 becomes head → 6 → 4 → 17]
 Deleting from the top of the list
 Deleting from the end of the list
 Deleting from the middle of the list
Steps
 Set previous Node pointer to next node
 Break Node pointer connection
 Delete the node
[Figure: head → 4 → 17 → 42 becomes head → 4 → 42]
The following code is written in C++:

struct Node
{
    int data;   // any type of data; could be another struct
    Node *next; // the important part: a pointer to the next node
};
 In a linked representation of a binary tree, the number of null links (null pointers) is actually greater than the number of non-null pointers.
 Consider the following binary tree:
 In the above binary tree, there are 7 null pointers and 5 actual pointers.
 In all there are 12 pointers.
 In general, a binary tree with n nodes has (n+1) null pointers out of 2n total pointers.
 The objective here is to make effective use of these null pointers.
 A. J. Perlis and C. Thornton jointly proposed an idea to make effective use of these null pointers.
 According to this idea, we replace all the null pointers with appropriate pointer values called threads.
 A binary tree with such pointers is called a threaded tree.
 In the memory representation of a threaded binary tree, it is necessary to distinguish between a normal pointer and a thread.
 Therefore we have an alternate node representation for a threaded binary tree, which contains five fields as shown below:
 Also one may choose a one-way threading or a
two-way threading.
 Here, our threading will correspond to the in-order traversal of T.
 Accordingly, in the one-way threading of T, a thread will appear in the right field of a node and will point to the next node in the in-order traversal of T.
 See the example below of one-way in-order threading.
In-order traversal of the tree below: D,B,F,E,A,G,C,L,J,H,K
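One-way in-order threading can be sketched as follows. The node layout here is an assumption made for the example: it carries a single `rightIsThread` flag, whereas the five-field node mentioned in the slides would also carry a flag for the left pointer. `ThreadedNode`, `leftmost`, and `threadedInorder` are hypothetical names.

```cpp
#include <vector>

// Node of a binary tree with one-way (right) in-order threads.
struct ThreadedNode {
    char info;
    ThreadedNode* left;
    ThreadedNode* right;   // real child, or thread to the in-order successor
    bool rightIsThread;    // distinguishes a thread from a normal pointer
};

// Follows left links down to the first node in in-order sequence.
const ThreadedNode* leftmost(const ThreadedNode* node) {
    while (node != nullptr && node->left != nullptr)
        node = node->left;
    return node;
}

// In-order traversal without recursion and without a stack.
std::vector<char> threadedInorder(const ThreadedNode* root) {
    std::vector<char> visited;
    const ThreadedNode* curr = leftmost(root);
    while (curr != nullptr) {
        visited.push_back(curr->info);
        if (curr->rightIsThread)
            curr = curr->right;             // thread: jump to the successor
        else
            curr = leftmost(curr->right);   // real child: descend its left spine
    }
    return visited;
}
```

The last node in in-order sequence keeps a null right pointer, which is what ends the loop.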
 In the two-way threading of T, a thread will also appear in the left field of a node and will point to the preceding node in the in-order traversal of T.
 Furthermore, the left pointer of the first node and the right pointer of the last node (in the in-order traversal of T) will contain the null value when T does not have a header node.
 The figure below shows two-way in-order threading.
 Here, right pointer = next node of in-order traversal and left pointer = previous node of in-order traversal
 In-order traversal of the tree below: D,B,F,E,A,G,C,L,J,H,K
 When T has a header node, the left pointer of the first node and the right pointer of the last node (in the in-order traversal of T) point to the header node instead of containing the null value. This is called two-way threading with a header node.
 The figure below explains two-way threading with a header node.
 Below is an example of the linked representation of a threaded binary tree.
 In-order traversal of the tree below: G,F,B,A,D,C,E
 Advantages of threaded binary tree:
◦ Traversal is faster than in the unthreaded version, because a threaded binary tree allows a non-recursive implementation that needs no stack management.
◦ We can efficiently determine the predecessor and successor of any node. In an unthreaded binary tree this task is more time consuming and difficult, because a stack is required to provide upward-pointing information; in a threaded binary tree the threads provide it without the overhead of a stack.
◦ Any node is accessible from any other node. Threads usually point upward whereas links point downward, so in a threaded tree one can move in either direction and the nodes are in effect circularly linked. This is not possible in the unthreaded counterpart, where we can move only downward starting from the root.
◦ Insertions into and deletions from a threaded tree are time consuming, but they are easy to implement.
 Disadvantages of threaded binary tree:
◦ Insertion into and deletion from a threaded tree are more time consuming than in a non-threaded binary tree.
◦ Each node requires additional bits to identify threaded links.
 Property1: each node can have up to two
successor nodes (children)
◦ The predecessor node of a node is called its parent
◦ The "beginning" node is called the root (no parent)
◦ A node without children is called a leaf

 20

 21
 22
 23
A Tree Has a Root Node
[Figure: tree with ROOT NODE Owner Jake; Jake's children are Manager Brad and Chef Carol; Brad's children are Waitress Joyce and Waiter Chris; Carol's children are Cook Max and Helper Len]

Leaf nodes have no children
[Figure: the same tree; the LEAF NODES are Joyce, Chris, Max, and Len]
 Property2: a unique path exists from the
root to every other node
 Ancestor of a node: any node on the path from
the root to that node
 Descendant of a node: any node on a path from
the node to the last node in the path
 Level (depth) of a node: number of edges in the
path from the root to that node
 Height of a tree: number of levels (warning:
some books define it as #levels - 1)
A Tree Has Levels
[Figure: the Owner Jake tree; LEVEL 0 is Owner Jake]

Level One
[Figure: the Owner Jake tree; LEVEL 1 is Manager Brad and Chef Carol]

Level Two
[Figure: the Owner Jake tree; LEVEL 2 is Waitress Joyce, Waiter Chris, Cook Max, and Helper Len]

A Subtree
[Figure: LEFT SUBTREE OF ROOT NODE: Manager Brad with Waitress Joyce and Waiter Chris]

Another Subtree
[Figure: RIGHT SUBTREE OF ROOT NODE: Chef Carol with Cook Max and Helper Len]
The max #nodes at level $l$ is $2^l$

For a full tree of height $h$ (levels $l = 0, 1, \dots, h-1$):
$N = 2^0 + 2^1 + \dots + 2^{h-1} = 2^h - 1$

using the geometric series:
$x^0 + x^1 + \dots + x^{n-1} = \sum_{i=0}^{n-1} x^i = \frac{x^n - 1}{x - 1}$

$2^h - 1 = N \Rightarrow 2^h = N + 1 \Rightarrow h = \log_2(N + 1) = O(\log N)$

 The max height of a tree with N nodes is N (same as a linked list)
 The min height of a tree with N nodes is log(N+1)
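The two height bounds can be checked with a small sketch. It counts height as the number of levels, as defined earlier; `IntNode`, `height`, and `minHeight` are hypothetical names for this example, and `minHeight` rounds up because N is not always of the form 2^h - 1.

```cpp
#include <algorithm>

// Minimal binary tree node (hypothetical type for this sketch).
struct IntNode {
    int info;
    IntNode* left;
    IntNode* right;
};

// Height as the number of levels: an empty tree has height 0.
int height(const IntNode* tree) {
    if (tree == nullptr)
        return 0;
    return 1 + std::max(height(tree->left), height(tree->right));
}

// Smallest h with 2^h - 1 >= n, i.e. the minimum height for n nodes.
int minHeight(int n) {
    int h = 0;
    while ((1LL << h) - 1 < n)
        ++h;
    return h;
}
```

A chain of 3 nodes has height 3 (the maximum for N = 3), while a full tree of 7 nodes also has height 3 (the minimum for N = 7).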
1) Start at the root
2) Search the tree level by level, until you find the element you are searching for (O(N) time in worst case)

Is this better than searching a linked list?

No ---> O(N)
 Binary Search Tree Property: The value stored
at a node is greater than the value stored at
its left child and less than the value stored at
its right child
 Thus, the value stored at the root of a subtree
is greater than any value in its left subtree
and less than any value in its right subtree!!
1) Start at the root
2) Compare the value of the item you are searching for with the value stored at the root
3) If the values are equal, then item found; otherwise, if it is a leaf node, then not found
4) If it is less than the value stored at the root, then search the left subtree
5) If it is greater than the value stored at the root, then search the right subtree
6) Repeat steps 2-6 for the root of the subtree chosen in the previous step 4 or 5

Is this better than searching a linked list?

Yes !! ---> O(logN) when the tree is balanced
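The search steps above can also be written as a loop. This is an illustrative sketch with a hypothetical `BstNode` type holding `int` keys; the recursive, templated Retrieve function in the slides follows the same comparisons.

```cpp
// Minimal binary search tree node (hypothetical type for this sketch).
struct BstNode {
    int info;
    BstNode* left;
    BstNode* right;
};

// Returns the node holding item, or nullptr if item is not in the tree.
const BstNode* search(const BstNode* root, int item) {
    const BstNode* curr = root;
    while (curr != nullptr) {
        if (item == curr->info)
            return curr;                       // found
        // Smaller keys live in the left subtree, larger keys in the right.
        curr = (item < curr->info) ? curr->left : curr->right;
    }
    return nullptr;                            // fell off the tree: not found
}
```

Each comparison discards one whole subtree, so the number of steps is bounded by the height of the tree.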


template<class ItemType>
struct TreeNode {
ItemType info;
TreeNode* left;
TreeNode* right; };
#include <fstream>
using namespace std;

template<class ItemType>
struct TreeNode;

enum OrderType {PRE_ORDER, IN_ORDER, POST_ORDER};

template<class ItemType>
class TreeType {
public:
    TreeType();
    ~TreeType();
    TreeType(const TreeType<ItemType>&);
    void operator=(const TreeType<ItemType>&);
    void MakeEmpty();
    bool IsEmpty() const;
    bool IsFull() const;
    int NumberOfNodes() const;
    void RetrieveItem(ItemType&, bool& found);
    void InsertItem(ItemType);
    void DeleteItem(ItemType);
    void ResetTree(OrderType);
    void GetNextItem(ItemType&, OrderType, bool&);
    void PrintTree(ofstream&) const;
private:
    TreeNode<ItemType>* root;
};
 Recursive implementation
#nodes in a tree =
#nodes in left subtree + #nodes in right
subtree + 1
 What is the size factor?
Number of nodes in the tree we are examining
 What is the base case?
The tree is empty
 What is the general case?
CountNodes(Left(tree)) + CountNodes(Right(tree))
+1
template<class ItemType>
int TreeType<ItemType>::NumberOfNodes() const
{
return CountNodes(root);
}

template<class ItemType>
int CountNodes(TreeNode<ItemType>* tree)
{
if (tree == NULL)
return 0;
else
return CountNodes(tree->left) + CountNodes(tree->right) + 1;
}
Let’s consider the first few steps:
 What is the size of the problem?
Number of nodes in the tree we are examining
 What is the base case(s)?
1) When the key is found
2) The tree is empty (key was not found)
 What is the general case?
Search in the left or right subtrees
template <class ItemType>
void TreeType<ItemType>:: RetrieveItem(ItemType& item,bool& found)
{
Retrieve(root, item, found);
}

template<class ItemType>
void Retrieve(TreeNode<ItemType>* tree,ItemType& item,bool& found)
{
if (tree == NULL) // base case 2
found = false;
else if(item < tree->info)
Retrieve(tree->left, item, found);
else if(item > tree->info)
Retrieve(tree->right, item, found);
else { // base case 1
item = tree->info;
found = true;
}
}
Function InsertItem
 Use the binary search tree property to insert the new item at the correct place
• Implementing insertion using recursion
[Figure: Insert 11]
 What is the size of the problem?
Number of nodes in the tree we are examining
 What is the base case(s)?
The tree is empty
 What is the general case?
Choose the left or right subtree
template<class ItemType>
void TreeType<ItemType>::InsertItem(ItemType item)
{
Insert(root, item);
}
template<class ItemType>
void Insert(TreeNode<ItemType>*& tree, ItemType item)
{
if(tree == NULL) { // base case
tree = new TreeNode<ItemType>;
tree->right = NULL;
tree->left = NULL;
tree->info = item;
}
else if(item < tree->info)
Insert(tree->left, item);
else
Insert(tree->right, item);
}
[Figure: Insert 11]
Does the order of inserting elements into a tree matter?
 Yes, certain orders produce very unbalanced trees!!
 Unbalanced trees are not desirable because search time increases!!
 There are advanced tree structures (e.g., "red-black trees") which guarantee balanced trees
 First, find the item; then, delete it
 Important: binary search tree property
must be preserved!!
 We need to consider three different cases:
(1) Deleting a leaf
(2) Deleting a node with only one child
(3) Deleting a node with two children
 Find predecessor (it is the rightmost node
in the left subtree)
 Replace the data of the node to be deleted
with predecessor's data
 Delete predecessor node
 What is the size of the problem?
Number of nodes in the tree we are examining
 What is the base case(s)?
Key to be deleted was found
 What is the general case?
Choose the left or right subtree
template<class ItemType>
void TreeType<ItemType>::DeleteItem(ItemType item)
{
    Delete(root, item);
}

template<class ItemType>
void Delete(TreeNode<ItemType>*& tree, ItemType item)
{
    if(tree == NULL) // item is not in the tree: nothing to delete
        return;
    if(item < tree->info)
        Delete(tree->left, item);
    else if(item > tree->info)
        Delete(tree->right, item);
    else
        DeleteNode(tree);
}
template <class ItemType>
void DeleteNode(TreeNode<ItemType>*& tree)
{
    ItemType data;
    TreeNode<ItemType>* tempPtr;

    tempPtr = tree;
    if(tree->left == NULL) { // 0 or 1 child: at most a right child
        tree = tree->right;
        delete tempPtr;
    }
    else if(tree->right == NULL) { // 0 or 1 child: only a left child
        tree = tree->left;
        delete tempPtr;
    }
    else { // 2 children: replace with predecessor, then delete predecessor
        GetPredecessor(tree->left, data);
        tree->info = data;
        Delete(tree->left, data);
    }
}
template<class ItemType>
void GetPredecessor(TreeNode<ItemType>* tree, ItemType& data)
{
while(tree->right != NULL)
tree = tree->right;
data = tree->info;
}
There are mainly three ways to traverse a
tree:
1) Inorder Traversal
2) Postorder Traversal
3) Preorder Traversal
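Inorder Traversal: A E H J M T Y
[Figure: tree with root ‘J’; left child ‘E’ with children ‘A’ and ‘H’; right child ‘T’ with children ‘M’ and ‘Y’. Visit left subtree first, visit the root second, visit right subtree last]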
 Visit the nodes in the left subtree, then
visit the root of the tree, then visit the
nodes in the right subtree
Inorder(tree)
If tree is not NULL
Inorder(Left(tree))
Visit Info(tree)
Inorder(Right(tree))

(Warning: "visit" means that the algorithm does something with the values in the node, e.g., print the value)
Postorder Traversal: A H E M Y T J
[Figure: tree with root ‘J’; left child ‘E’ with children ‘A’ and ‘H’; right child ‘T’ with children ‘M’ and ‘Y’. Visit left subtree first, visit right subtree second, visit the root last]
 Visit the nodes in the left subtree first,
then visit the nodes in the right subtree,
then visit the root of the tree
Postorder(tree)
If tree is not NULL
Postorder(Left(tree))
Postorder(Right(tree))
Visit Info(tree)
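Preorder Traversal: J E A H T M Y
[Figure: tree with root ‘J’; left child ‘E’ with children ‘A’ and ‘H’; right child ‘T’ with children ‘M’ and ‘Y’. Visit the root first, visit left subtree second, visit right subtree last]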
 Visit the root of the tree first, then visit the
nodes in the left subtree, then visit the
nodes in the right subtree
Preorder(tree)
If tree is not NULL
Visit Info(tree)
Preorder(Left(tree))
Preorder(Right(tree))
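The three traversal outlines can be sketched side by side. The example assumes a hypothetical `CharNode` type and "visits" a node by appending its character to a string, so the orders are easy to compare on the J, E, T, A, H, M, Y tree from the figures.

```cpp
#include <string>

// Minimal binary tree node holding one character (hypothetical type).
struct CharNode {
    char info;
    CharNode* left;
    CharNode* right;
};

// Left subtree, then root, then right subtree.
void inorder(const CharNode* tree, std::string& out) {
    if (tree != nullptr) {
        inorder(tree->left, out);
        out += tree->info;
        inorder(tree->right, out);
    }
}

// Left subtree, then right subtree, then root.
void postorder(const CharNode* tree, std::string& out) {
    if (tree != nullptr) {
        postorder(tree->left, out);
        postorder(tree->right, out);
        out += tree->info;
    }
}

// Root, then left subtree, then right subtree.
void preorder(const CharNode* tree, std::string& out) {
    if (tree != nullptr) {
        out += tree->info;
        preorder(tree->left, out);
        preorder(tree->right, out);
    }
}
```

The only difference between the three functions is where the visit sits relative to the two recursive calls.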
 We use "inorder" to print out the node values
 Why?? (keys are printed out in ascending
order!!)
 Hint: use binary search trees for sorting !!

Output: A D J M Q R T
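The sorting hint can be sketched as a "tree sort": insert every item into a binary search tree, then read the keys back with an inorder traversal. The names `SortNode`, `insert`, `collect`, and `treeSort` are hypothetical, and `std::unique_ptr` is used here only so the sketch frees its nodes automatically; the slides manage nodes with new and delete.

```cpp
#include <memory>
#include <vector>

// BST node that owns its children (hypothetical type for this sketch).
struct SortNode {
    int info;
    std::unique_ptr<SortNode> left;
    std::unique_ptr<SortNode> right;
    explicit SortNode(int value) : info(value) {}
};

// Standard BST insertion; duplicates go to the right subtree.
void insert(std::unique_ptr<SortNode>& tree, int item) {
    if (!tree)
        tree.reset(new SortNode(item));
    else if (item < tree->info)
        insert(tree->left, item);
    else
        insert(tree->right, item);
}

// Inorder traversal appends the keys in ascending order.
void collect(const SortNode* tree, std::vector<int>& out) {
    if (tree != nullptr) {
        collect(tree->left.get(), out);
        out.push_back(tree->info);
        collect(tree->right.get(), out);
    }
}

std::vector<int> treeSort(const std::vector<int>& items) {
    std::unique_ptr<SortNode> root;
    for (int item : items)
        insert(root, item);
    std::vector<int> sorted;
    collect(root.get(), sorted);
    return sorted;
}
```

As with searching, performance depends on balance: already-sorted input produces a degenerate tree and O(N^2) total insertion time.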
template<class ItemType>
void TreeType<ItemType>::PrintTree(ofstream& outFile) const
{
    Print(root, outFile);
}

template<class ItemType>
void Print(TreeNode<ItemType>* tree, ofstream& outFile)
{
    if(tree != NULL) {
        Print(tree->left, outFile);
        outFile << tree->info;
        Print(tree->right, outFile);
    }
}

(see textbook for overloading << and >>)
template<class ItemType>
TreeType<ItemType>::TreeType()
{
root = NULL;
}
How should we delete the nodes of a tree?
 Delete the tree in a "bottom-up" fashion
 Postorder traversal is appropriate for this!!
template<class ItemType>
TreeType<ItemType>::~TreeType()
{
    Destroy(root);
}

template<class ItemType>
void Destroy(TreeNode<ItemType>*& tree)
{
    if(tree != NULL) {
        Destroy(tree->left);
        Destroy(tree->right);
        delete tree;
    }
}
How should we
create a copy of
a tree?
template<class ItemType>
TreeType<ItemType>::TreeType(const TreeType<ItemType>&
originalTree)
{
CopyTree(root, originalTree.root);
}

template<class ItemType>
void CopyTree(TreeNode<ItemType>*& copy,
              TreeNode<ItemType>* originalTree)
{
    if(originalTree == NULL)
        copy = NULL;
    else { // copy the nodes in preorder
        copy = new TreeNode<ItemType>;
        copy->info = originalTree->info;
        CopyTree(copy->left, originalTree->left);
        CopyTree(copy->right, originalTree->right);
    }
}
 The user is allowed to specify the tree
traversal order
 For efficiency, ResetTree stores in a queue
the results of the specified tree traversal
 Then, GetNextItem, dequeues the node
values from the queue
enum OrderType {PRE_ORDER, IN_ORDER,
POST_ORDER};

template<class ItemType>
class TreeType {
public:
// same as before
private:
TreeNode<ItemType>* root;
// new private data
QueType<ItemType> preQue;
QueType<ItemType> inQue;
QueType<ItemType> postQue;
};
template<class ItemType>
void PreOrder(TreeNode<ItemType>*,
QueType<ItemType>&);

template<class ItemType>
void InOrder(TreeNode<ItemType>*,
QueType<ItemType>&);

template<class ItemType>
void PostOrder(TreeNode<ItemType>*,
QueType<ItemType>&);
template<class ItemType>
void PreOrder(TreeNode<ItemType>* tree,
QueType<ItemType>& preQue)
{
if(tree != NULL) {
preQue.Enqueue(tree->info);
PreOrder(tree->left, preQue);
PreOrder(tree->right, preQue);
}
}
template<class ItemType>
void InOrder(TreeNode<ItemType>* tree,
QueType<ItemType>& inQue)
{
if(tree != NULL) {
InOrder(tree->left, inQue);
inQue.Enqueue(tree->info);
InOrder(tree->right, inQue);
}
}
template<class ItemType>
void PostOrder(TreeNode<ItemType>* tree,
QueType<ItemType>& postQue)
{
if(tree != NULL) {
PostOrder(tree->left, postQue);
PostOrder(tree->right, postQue);
postQue.Enqueue(tree->info);
}
}
template<class ItemType>
void TreeType<ItemType>::ResetTree(OrderType order)
{
switch(order) {
case PRE_ORDER: PreOrder(root, preQue);
break;
case IN_ORDER: InOrder(root, inQue);
break;
case POST_ORDER: PostOrder(root, postQue);
break;
}
}
template<class ItemType>
void TreeType<ItemType>::GetNextItem(ItemType& item,
OrderType order, bool& finished)
{
finished = false;
switch(order) {
case PRE_ORDER: preQue.Dequeue(item);
if(preQue.IsEmpty())
finished = true;
break;
case IN_ORDER: inQue.Dequeue(item);
if(inQue.IsEmpty())
finished = true;
break;
case POST_ORDER: postQue.Dequeue(item);
if(postQue.IsEmpty())
finished = true;
break;
}
}
 See textbook
Big-O Comparison
Operation      Binary Search Tree   Array-based List   Linked List
Constructor    O(1)                 O(1)               O(1)
Destructor     O(N)                 O(1)               O(N)
IsFull         O(1)                 O(1)               O(1)
IsEmpty        O(1)                 O(1)               O(1)
RetrieveItem   O(logN)              O(logN)            O(N)
InsertItem     O(logN)              O(N)               O(N)
DeleteItem     O(logN)              O(N)               O(N)
 1-3, 8-18, 21, 22, 29-32
Definition
 A graph G consists of two sets
– a finite, nonempty set of vertices V(G)
– a finite, possibly empty set of edges E(G)
– G(V,E) represents a graph
 An undirected graph is one in which the pair of
vertices in an edge is unordered, (v0, v1) = (v1,v0)
 A directed graph is one in which each edge is a
directed pair of vertices, <v0, v1> != <v1,v0>;
in <v0, v1>, v0 is the tail and v1 is the head
Examples for Graph
[Figure: G1 (complete graph, vertices 0-3), G2 (incomplete graph, vertices 0-6), G3 (directed graph, vertices 0-2)]
V(G1)={0,1,2,3} E(G1)={(0,1),(0,2),(0,3),(1,2),(1,3),(2,3)}
V(G2)={0,1,2,3,4,5,6} E(G2)={(0,1),(0,2),(1,3),(1,4),(2,5),(2,6)}
V(G3)={0,1,2} E(G3)={<0,1>,<1,0>,<1,2>}
complete undirected graph: n(n-1)/2 edges
complete directed graph: n(n-1) edges
Complete Graph
 A complete graph is a graph that has the
maximum number of edges
– for undirected graph with n vertices, the maximum
number of edges is n(n-1)/2
– for directed graph with n vertices, the maximum
number of edges is n(n-1)
– example: G1 is a complete graph

Adjacent and Incident
 If (v0, v1) is an edge in an undirected graph,
– v0 and v1 are adjacent
– The edge (v0, v1) is incident on vertices v0 and v1
 If <v0, v1> is an edge in a directed graph
– v0 is adjacent to v1, and v1 is adjacent from v0
– The edge <v0, v1> is incident on v0 and v1

*Figure 6.3: Example of a graph with feedback loops and a multigraph (p.260)
(a) graph with a self edge
(b) multigraph: multiple occurrences of the same edge
 A subgraph of G is a graph G’ such that V(G’)
is a subset of V(G) and E(G’) is a subset of E(G)
 A path from vertex vp to vertex vq in a graph G,
is a sequence of vertices, vp, vi1, vi2, ..., vin, vq,
such that (vp, vi1), (vi1, vi2), ..., (vin, vq) are edges
in an undirected graph
 The length of a path is the number of edges on
it

[Figure: (a) some of the subgraphs of G1, panels (i)-(iv); (b) some of the subgraphs of G3, panels (i)-(iv), annotated "single" and "separate"]
 A simple path is a path in which all vertices,
except possibly the first and the last, are distinct
 A cycle is a simple path in which the first and
the last vertices are the same
 In an undirected graph G, two vertices, v0 and v1,
are connected if there is a path in G from v0 to v1
 An undirected graph is connected if, for every
pair of distinct vertices vi, vj, there is a path
from vi to vj
[Figure: G1 and G2 are both connected; G2 is a tree (acyclic graph)]
 A connected component of an undirected graph
is a maximal connected subgraph.
 A tree is a graph that is connected and acyclic.
 A directed graph is strongly connected if, for
every pair of distinct vertices vi, vj, there is a
directed path from vi to vj and also from vj to vi.
 A strongly connected component is a maximal
subgraph that is strongly connected.
*Figure 6.5: A graph with two connected components (p.262)
G4 (not connected) has two connected components (maximal connected subgraphs):
H1 with vertices 0, 1, 2, 3 and H2 with vertices 4, 5, 6, 7
*Figure 6.6: Strongly connected components of G3 (p.262)
G3 is not strongly connected; its strongly connected components
(maximal strongly connected subgraphs) are {0, 1} and {2}
Degree
 The degree of a vertex is the number of edges
incident to that vertex
 For directed graph,
– the in-degree of a vertex v is the number of edges
that have v as the head
– the out-degree of a vertex v is the number of edges
that have v as the tail
– if di is the degree of a vertex i in a graph G with n
vertices and e edges, the number of edges is
$e = \left( \sum_{i=0}^{n-1} d_i \right) / 2$
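[Figure: vertex degrees in G1, G2 and G3]
undirected graph, degree:
G1: every vertex has degree 3
G2: vertex 0 has degree 2, vertices 1 and 2 have degree 3, vertices 3-6 have degree 1
directed graph G3, in-degree and out-degree:
vertex 0 in: 1, out: 1
vertex 1 in: 1, out: 2
vertex 2 in: 1, out: 0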
ADT for Graph
structure Graph is
objects: a nonempty set of vertices and a set of undirected edges, where each
edge is a pair of vertices
functions: for all graph ∈ Graph, v, v1 and v2 ∈ Vertices
Graph Create()::=return an empty graph
Graph InsertVertex(graph, v)::= return a graph with v inserted. v has no
incident edge.
Graph InsertEdge(graph, v1,v2)::= return a graph with new edge
between v1 and v2
Graph DeleteVertex(graph, v)::= return a graph in which v and all edges
incident to it are removed
Graph DeleteEdge(graph, v1, v2)::=return a graph in which the edge (v1, v2)
is removed
Boolean IsEmpty(graph)::= if (graph==empty graph) return TRUE
else return FALSE
List Adjacent(graph,v)::= return a list of all vertices that are adjacent to v

Graph Representations
 Adjacency Matrix
 Adjacency Lists

Adjacency Matrix
 Let G=(V,E) be a graph with n vertices.
 The adjacency matrix of G is a two-dimensional
n by n array, say adj_mat
 If the edge (vi, vj) is in E(G), adj_mat[i][j]=1
 If there is no such edge in E(G), adj_mat[i][j]=0
 The adjacency matrix for an undirected graph is
symmetric; the adjacency matrix for a digraph
need not be symmetric
Examples for Adjacency Matrix
G1 (undirected, symmetric):
0 1 1 1
1 0 1 1
1 1 0 1
1 1 1 0
G3 (directed):
0 1 0
1 0 1
0 0 0
G4 (undirected, symmetric):
0 1 1 0 0 0 0 0
1 0 0 1 0 0 0 0
1 0 0 1 0 0 0 0
0 1 1 0 0 0 0 0
0 0 0 0 0 1 0 0
0 0 0 0 1 0 1 0
0 0 0 0 0 1 0 1
0 0 0 0 0 0 1 0
Storage: undirected: n^2/2 (symmetric, so half the matrix suffices); directed: n^2
Merits of Adjacency Matrix
 From the adjacency matrix, to determine the
connection of vertices is easy
 The degree of a vertex i is $\sum_{j=0}^{n-1} adj\_mat[i][j]$
 For a digraph, the row sum is the out_degree,
while the column sum is the in_degree
$ind(v_i) = \sum_{j=0}^{n-1} A[j,i]$    $outd(v_i) = \sum_{j=0}^{n-1} A[i,j]$
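The row-sum and column-sum rules above can be sketched in C as follows. This is only an illustration: the function names out_degree and in_degree are made up for this example, and the matrix is assumed to be a square int array with MAX_VERTICES columns, as in the later slides.

```c
#define MAX_VERTICES 50

/* Sum of row i: the out-degree in a digraph,
   or simply the degree in an undirected graph. */
int out_degree(int adj_mat[][MAX_VERTICES], int n, int i)
{
    int j, sum = 0;
    for (j = 0; j < n; j++)
        sum += adj_mat[i][j];
    return sum;
}

/* Sum of column i: the in-degree in a digraph. */
int in_degree(int adj_mat[][MAX_VERTICES], int n, int i)
{
    int j, sum = 0;
    for (j = 0; j < n; j++)
        sum += adj_mat[j][i];
    return sum;
}
```

For G3 this gives out-degree 2 and in-degree 1 for vertex 1, matching the degree figure earlier in the deck.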
Data Structures for Adjacency Lists
Each row in adjacency matrix is represented as an adjacency list.

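#define MAX_VERTICES 50
typedef struct node *node_pointer;
struct node {
    int vertex;          /* index of the adjacent vertex */
    struct node *link;   /* next node in this adjacency list */
};
node_pointer graph[MAX_VERTICES];  /* head pointer of each vertex's list */
int n=0; /* vertices currently in use */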
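As a rough sketch of how these declarations might be used, the block below inserts an undirected edge by pushing a node onto the front of each endpoint's list. The helpers push_front and add_edge are hypothetical names, not part of the slides, and the declarations are repeated so the block stands alone.

```c
#include <stdlib.h>

#define MAX_VERTICES 50

struct node {
    int vertex;
    struct node *link;
};
typedef struct node *node_pointer;

node_pointer graph[MAX_VERTICES];  /* head pointers, all NULL at start */

/* Hypothetical helper: push v onto the front of u's list.
   Returns 0 on success, -1 if memory runs out. */
static int push_front(int u, int v)
{
    node_pointer p = malloc(sizeof *p);
    if (p == NULL)
        return -1;
    p->vertex = v;
    p->link = graph[u];
    graph[u] = p;
    return 0;
}

/* Hypothetical helper: insert the undirected edge (u, v).
   Each edge creates two list nodes, one in each endpoint's list. */
int add_edge(int u, int v)
{
    if (u < 0 || v < 0 || u >= MAX_VERTICES || v >= MAX_VERTICES)
        return -1;
    if (push_front(u, v) != 0)
        return -1;
    return push_front(v, u);
}
```

Inserting at the front keeps each insertion O(1); the order of nodes within a list does not matter.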

[Figure: adjacency lists for G1, G3 and G4]
G1:  0: 1, 2, 3    1: 0, 2, 3    2: 0, 1, 3    3: 0, 1, 2
G3:  0: 1          1: 0, 2       2: (empty)
G4:  0: 1, 2       1: 0, 3       2: 0, 3       3: 1, 2
     4: 5          5: 4, 6       6: 5, 7       7: 6
An undirected graph with n vertices and e edges ==> n
head nodes and 2e list nodes
degree of a vertex in an undirected graph
–# of nodes in adjacency list
# of edges in a graph
–determined in O(n+e)
out-degree of a vertex in a directed graph
–# of nodes in its adjacency list
in-degree of a vertex in a directed graph
–traverse the whole data structure
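The difference between the two directed cases can be sketched as below: the out-degree only walks one list, while the in-degree has to scan every list. The function names are assumptions made for this example, and the node type is repeated so the block stands alone.

```c
#include <stddef.h>

struct node {
    int vertex;
    struct node *link;
};

/* Out-degree of u: number of nodes in u's own adjacency list. */
int out_degree_lists(struct node *graph[], int u)
{
    int count = 0;
    struct node *w;
    for (w = graph[u]; w != NULL; w = w->link)
        count++;
    return count;
}

/* In-degree of v: scan all n lists and count the nodes that
   name v, which costs O(n + e). */
int in_degree_lists(struct node *graph[], int n, int v)
{
    int u, count = 0;
    struct node *w;
    for (u = 0; u < n; u++)
        for (w = graph[u]; w != NULL; w = w->link)
            if (w->vertex == v)
                count++;
    return count;
}
```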

[Figure: sequential representation of G4 in a single array node[]]
node[0] … node[n-1]: starting point for vertices
node[n]: n+2e+1
node[n+1] … node[n+2e]: head node of edge
[0] 9    [1] 11   [2] 13   [3] 15   [4] 17   [5] 18   [6] 20   [7] 22
[8] 23
[9] 1    [10] 2   [11] 0   [12] 3   [13] 0   [14] 3   [15] 1   [16] 2
[17] 5   [18] 4   [19] 6   [20] 5   [21] 7   [22] 6
[Figure: inverse adjacency lists for G3]
0 → 1 → NULL
1 → 0 → NULL
2 → 1 → NULL
Determine in-degree of a vertex in a fast way.
[Figure: node structure with fields: tail, head, column link for head, row link for tail]
[Figure: orthogonal list representation of G3, whose adjacency matrix is]
0 1 0
1 0 1
0 0 0
Order is of no significance.
headnodes    vertex  link
0 → 3 → 1 → 2 → NULL
1 → 2 → 0 → 3 → NULL
2 → 3 → 0 → 1 → NULL
3 → 2 → 1 → 0 → NULL
Some Graph Operations
 Traversal
Given G=(V,E) and vertex v, find all w ∈ V
such that w is connected to v.
– Depth First Search (DFS)
preorder tree traversal
– Breadth First Search (BFS)
level order tree traversal
 Connected Components
 Spanning Trees
*Figure 6.19:Graph G and its adjacency lists (p.274)
depth first search: v0, v1, v3, v7, v4, v5, v2, v6

breadth first search: v0, v1, v2, v3, v4, v5, v6, v7
Depth First Search
#define FALSE 0
#define TRUE 1
short int visited[MAX_VERTICES];
/* Depth first search of the graph, starting at vertex v.
   adjacency list: O(e); adjacency matrix: O(n^2) */
void dfs(int v)
{
    node_pointer w;
    visited[v]= TRUE;
    printf("%5d", v);
    for (w=graph[v]; w; w=w->link)
        if (!visited[w->vertex])
            dfs(w->vertex);
}
Breadth First Search
typedef struct queue *queue_pointer;
struct queue {
    int vertex;
    queue_pointer link;
};
void addq(queue_pointer *, queue_pointer *, int);
int deleteq(queue_pointer *);
Breadth First Search (Continued)
/* Breadth first search of the graph, starting at vertex v.
   adjacency list: O(e); adjacency matrix: O(n^2) */
void bfs(int v)
{
    node_pointer w;
    queue_pointer front, rear;
    front = rear = NULL;
    printf("%5d", v);
    visited[v] = TRUE;
    addq(&front, &rear, v);
    while (front) {
        v= deleteq(&front);
        for (w=graph[v]; w; w=w->link)
            if (!visited[w->vertex]) {
                printf("%5d", w->vertex);
                addq(&front, &rear, w->vertex);
                visited[w->vertex] = TRUE;
            }
    }
}
Connected Components
/* Print each connected component on its own line.
   adjacency list: O(n+e); adjacency matrix: O(n^2) */
void connected(void)
{
    int i;
    for (i=0; i<n; i++) {
        if (!visited[i]) {
            dfs(i);
            printf("\n");
        }
    }
}
Topics
 Sequential Search on an Unordered File
 Sequential Search on an Ordered File
 Binary Search
 Bubble Sort
 Insertion Sort
 There are some very common problems that
we use computers to solve:
◦ Searching through a lot of records for a specific
record or set of records
◦ Placing records in order, which we call sorting
 There are numerous algorithms to perform
searches and sorts. We will briefly explore
a few common ones.
 A question you should always ask when
selecting a search algorithm is “How fast does
the search have to be?” The reason is that, in
general, the faster the algorithm is, the more
complex it is.
 Bottom line: you don’t always need, and shouldn’t
always use, the fastest algorithm.
 Let’s explore the following search algorithms,
keeping speed in mind.
◦ Sequential (linear) search
◦ Binary search
 Basic algorithm:
Get the search criterion (key)
Get the first record from the file
While ( (record != key) and (still more records) )
Get the next record
End_while

 When do we know that there wasn’t a record in
the file that matched the key?
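A minimal C sketch of the basic algorithm above, assuming the "file" is an int array a of n records and the key is an int. The name seq_search is made up for this example.

```c
/* Sequential search on an unordered list.
   Returns the index of key in a[0..n-1], or -1 if it is absent. */
int seq_search(const int a[], int n, int key)
{
    int i = 0;
    while (i < n && a[i] != key)   /* record != key and still more records */
        i++;                       /* get the next record */
    return (i < n) ? i : -1;
}
```

In this form, we know there was no match only when the loop has run off the end of the list, that is, after every record has been examined.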
 Basic algorithm:
Get the search criterion (key)
Get the first record from the file
While ( (record < key) and (still more records) )
Get the next record
End_while
If ( record = key )
Then success
Else there is no match in the file
End_else
 When do we know that there wasn’t a record
in the file that matched the key?
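The ordered version can be sketched the same way, again assuming an int array in ascending order and a made-up function name. The only changes are the loop test (< instead of !=) and the final equality check.

```c
/* Sequential search on a list sorted in ascending order.
   Returns the index of key in a[0..n-1], or -1 if it is absent. */
int seq_search_ordered(const int a[], int n, int key)
{
    int i = 0;
    while (i < n && a[i] < key)    /* record < key and still more records */
        i++;                       /* get the next record */
    return (i < n && a[i] == key) ? i : -1;
}
```

Here the search can stop as soon as it reaches a record larger than the key, so a missing key is often detected without reading the whole list.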
 Let’s do a comparison.
 If the order was ascending alphabetical on
customer’s last names, how would the search
for John Adams on the ordered list compare
with the search on the unordered list?
◦ Unordered list
 if John Adams was in the list?
 if John Adams was not in the list?
◦ Ordered list
 if John Adams was in the list?
 if John Adams was not in the list?
 How about George Washington?
◦ Unordered
 if George Washington was in the list?
 If George Washington was not in the list?
◦ Ordered
 if George Washington was in the list?
 If George Washington was not in the list?
 How about James Madison?
 Observation: the search is faster on an ordered
list only when the item being searched for is not
in the list.
 Also, keep in mind that the list has to first be
placed in order for the ordered search.
 Conclusion: the efficiency of these algorithms
is roughly the same.
 So, if we need a faster search, we need a
completely different algorithm.
 How else could we search an ordered file?
 If we have an ordered list and we know how
many things are in the list (i.e., number of
records in a file), we can use a different
strategy.
 The binary search gets its name because the
algorithm continually divides the list into two
parts.
Always look at the center
value. Each time you get
to discard half of the
remaining list.
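The idea of always looking at the center value can be sketched in C as below, assuming an int array sorted in ascending order. The name binary_search is chosen for this example.

```c
/* Binary search on a[0..n-1], sorted in ascending order.
   Returns the index of key, or -1 if it is absent. */
int binary_search(const int a[], int n, int key)
{
    int low = 0, high = n - 1;
    while (low <= high) {
        int mid = low + (high - low) / 2;   /* center of the remaining list */
        if (a[mid] == key)
            return mid;
        if (a[mid] < key)
            low = mid + 1;                  /* discard the lower half */
        else
            high = mid - 1;                 /* discard the upper half */
    }
    return -1;
}
```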

Is this fast ?
 Worst case: 11 items in the list took 4 tries
 How about the worst case for a list with 32
items ?
◦ 1st try - list has 16 items
◦ 2nd try - list has 8 items
◦ 3rd try - list has 4 items
◦ 4th try - list has 2 items
◦ 5th try - list has 1 item
List has 250 items
1st try - 125 items
2nd try - 63 items
3rd try - 32 items
4th try - 16 items
5th try - 8 items
6th try - 4 items
7th try - 2 items
8th try - 1 item

List has 512 items
1st try - 256 items
2nd try - 128 items
3rd try - 64 items
4th try - 32 items
5th try - 16 items
6th try - 8 items
7th try - 4 items
8th try - 2 items
9th try - 1 item
 List of 11 took 4 tries
 List of 32 took 5 tries
 List of 250 took 8 tries
 List of 512 took 9 tries

 32 = 25 and 512 = 29
 8 < 11 < 16 23 < 11 < 24
 128 < 250 < 256 27 < 250 < 28
 How long (worst case) will it take to find an
item in a list 30,000 items long?
2^10 = 1024    2^13 = 8192
2^11 = 2048    2^14 = 16384
2^12 = 4096    2^15 = 32768

 So, it will take only 15 tries!
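The counts above can be reproduced with a small helper that follows the same counting model as the slides: halve the list, rounding up, until one item remains. The function name halvings is an assumption made for this sketch.

```c
/* Number of halvings needed to shrink a list of n items to one item,
   keeping the larger half each time (the worst case). */
int halvings(long n)
{
    int tries = 0;
    while (n > 1) {
        n = (n + 1) / 2;   /* half of the list, rounded up */
        tries++;
    }
    return tries;
}
```

Under this model, halvings(11) is 4, halvings(32) is 5, halvings(250) is 8, halvings(512) is 9 and halvings(30000) is 15, in line with the figures listed above.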


 We say that the binary search algorithm runs in
log_2 n time. (Also written as lg n)
 Lg n means the log to the base 2 of some value
of n.
 8 = 2^3, so lg 8 = 3; 16 = 2^4, so lg 16 = 4
 No search that works by comparing keys in an
ordered list runs faster than lg n time in the worst case.
 So, the binary search is a very fast search
algorithm.
 But, the list has to be sorted before we can
search it with binary search.
 To be really efficient, we also need a fast sort
algorithm.
Bubble Sort Heap Sort
Selection Sort Merge Sort
Insertion Sort Quick Sort
 There are many known sorting algorithms.
Bubble sort is among the slowest, running in n^2 time.
Quick sort is among the fastest in practice, running
in n lg n time on average.
 As with searching, the faster the sorting
algorithm, the more complex it tends to be.
 We will examine two sorting algorithms:
◦ Bubble sort
◦ Insertion sort
void bubbleSort (int a[ ] , int size)
{
    int i, j, temp;
    for ( i = 0; i < size; i++ ) /* controls passes through the list */
    {
        for ( j = 0; j < size - 1; j++ ) /* performs adjacent comparisons */
        {
            if ( a[ j ] > a[ j+1 ] ) /* determines if a swap should occur */
            {
                temp = a[ j ]; /* swap is performed */
                a[ j ] = a[ j + 1 ];
                a[ j+1 ] = temp;
            }
        }
    }
}
 Insertion sort is slower than quick sort, but
not as slow as bubble sort, and it is easy to
understand.
 Insertion sort works the same way as
arranging your hand when playing cards.
◦ Out of the pile of unsorted cards that were dealt to
you, you pick up a card and place it in your hand in
the correct position relative to the cards you’re
already holding.
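The card analogy can be sketched in C as follows, assuming an int array and using the same temp variable as the walkthrough. The name insertion_sort is chosen here, and the inner loop scans from right to left, which is one common way to find the insertion point (the slides compare from the 1st card forward instead).

```c
/* Insertion sort on a[0..size-1], ascending order. */
void insertion_sort(int a[], int size)
{
    int i, j, temp;
    for (i = 1; i < size; i++) {
        temp = a[i];                        /* pick up the next card */
        for (j = i; j > 0 && a[j - 1] > temp; j--)
            a[j] = a[j - 1];                /* slide larger cards right */
        a[j] = temp;                        /* drop the card into the open slot */
    }
}
```

With the hand 7, 5, 6, King, 8 (taking King as 13), the array ends up as 5, 6, 7, 8, 13.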
[Figure: insertion sort of the hand 7, 5, 6, K, 8; unsorted cards are shaded]
7
5 7
5 6 7
5 6 7 K
5 6 7 8 K

Inserting 5 (hand: 7 5 6 K 8)
1. Look at 2nd item - 5. Compare 5 to 7. 5 is smaller, so move 5
to temp, leaving an empty slot in position 2.
2. Move 7 into the empty slot, leaving position 1 open.
3. Move 5 into the open position.

Inserting 6 (hand: 5 7 6 K 8)
1. Look at next item - 6. Compare to 1st - 5. 6 is larger, so leave 5.
Compare to next - 7. 6 is smaller, so move 6 to temp, leaving an
empty slot.
2. Move 7 into the empty slot, leaving position 2 open.
3. Move 6 to the open 2nd position.

Inserting King (hand: 5 6 7 K 8)
Look at next item - King. Compare to 1st - 5. King is larger, so
leave 5 where it is. Compare to next - 6. King is larger, so leave 6
where it is. Compare to next - 7. King is larger, so leave 7 where it is.

Inserting 8 (hand: 5 6 7 K 8)
[Figure: the same three steps applied to 8, giving 5 6 7 8 K]
 In CS, a hash table, or a hash map, is a data
structure that associates keys (names) with
values (attributes).

◦ Look-Up Table
◦ Dictionary
◦ Cache
◦ Extended Array
A small phone book as a hash table.
(Figure is from Wikipedia)
 Collection of pairs.
◦ (key, value)
◦ Each pair has a unique key.
 Operations.
◦ Get(theKey)
◦ Delete(theKey)
◦ Insert(theKey, theValue)
 Hash table :
◦ Collection of pairs,
◦ Lookup function (Hash function)
 Hash tables are often used to implement
associative arrays,
◦ Worst-case time for Get, Insert, and Delete is
O(size).
◦ Expected time is O(1).
 Search tree methods: key comparisons
◦ Time complexity: O(size) or O(log n)
 Hashing methods: hash functions
◦ Expected time: O(1)
 Types
◦ Static hashing (section 8.2)
◦ Dynamic hashing (section 8.3)
 Key-value pairs are stored in a fixed size
table called a hash table.
◦ A hash table is partitioned into many buckets.
◦ Each bucket has many slots.
◦ Each slot holds one record.
◦ A hash function f(x) transforms the identifier (key)
into an address in the hash table
[Figure: hash table drawn as a grid of b buckets (rows 0 to b-1), each with s slots (columns 0 to s-1)]
#define MAX_CHAR 10
#define TABLE_SIZE 13
typedef struct {
char key[MAX_CHAR];
/* other fields */
} element;
element hash_table[TABLE_SIZE];
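The slides do not show the hash function itself. One simple possibility for string keys, given purely as an assumption, is to add up the character codes and take the remainder modulo the table size:

```c
#define TABLE_SIZE 13

/* Hypothetical hash function for string keys: sum of the character
   codes modulo TABLE_SIZE. Returns an address in 0..TABLE_SIZE-1. */
int hash(const char *key)
{
    unsigned int sum = 0;
    while (*key)
        sum += (unsigned char)*key++;
    return (int)(sum % TABLE_SIZE);
}
```

A sum of characters is easy to follow but maps anagrams such as "ab" and "ba" to the same address, so collisions still have to be resolved by the table.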
 Open addressing ensures that all elements
are stored directly into the hash table, thus
it attempts to resolve collisions using
various methods.

 Linear Probing resolves collisions by placing
the data into the next open slot in the table.
 divisor = b (number of buckets) = 17.
 Home bucket = key % 17.

• Insert pairs whose keys are 6, 12, 34, 29, 28, 11,
23, 7, 0, 33, 30, 45
bucket  0   1   2   3   4   5   6   7   8   9   10  11  12  13  14  15  16
key     34  0   45  -   -   -   6   23  7   -   -   28  12  29  11  30  33

 Delete(0)
• Bucket 1 is vacated.
• Search cluster for pair (if any) to fill vacated bucket:
45 (home bucket 11) moves from bucket 2 to bucket 1.
bucket  0   1   2   3   4   5   6   7   8   9   10  11  12  13  14  15  16
key     34  45  -   -   -   -   6   23  7   -   -   28  12  29  11  30  33

 Delete(34), starting again from the full table
• Bucket 0 is vacated.
• Search cluster for pair (if any) to fill vacated bucket:
0 (home bucket 0) moves from bucket 1 to bucket 0, then
45 (home bucket 11) moves from bucket 2 to bucket 1.
bucket  0   1   2   3   4   5   6   7   8   9   10  11  12  13  14  15  16
key     0   45  -   -   -   -   6   23  7   -   -   28  12  29  11  30  33

 Delete(29), starting again from the full table
• Bucket 13 is vacated.
• Search cluster for pair (if any) to fill vacated bucket:
11 (home bucket 11) moves from bucket 14 to bucket 13,
30 (home bucket 13) moves from bucket 15 to bucket 14, and
45 (home bucket 11) moves from bucket 2 to bucket 15.
bucket  0   1   2   3   4   5   6   7   8   9   10  11  12  13  14  15  16
key     34  0   -   -   -   -   6   23  7   -   -   28  12  11  30  45  33
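#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int hash(const char *key);  /* hash function, defined elsewhere */

/* Insert item using linear probing; an empty key string marks an empty slot. */
void linear_insert(element item, element ht[])
{
    int i, hash_value;
    i = hash_value = hash(item.key);
    while (strlen(ht[i].key)) {
        if (!strcmp(ht[i].key, item.key)) {
            fprintf(stderr, "Duplicate entry\n"); exit(1);
        }
        i = (i+1)%TABLE_SIZE;
        if (i == hash_value) {
            fprintf(stderr, "The table is full\n"); exit(1);
        }
    }
    ht[i] = item;
}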
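The integer example with divisor 17 can be replayed with a simplified sketch like the one below. It is not the string-keyed linear_insert above: keys are assumed to be non-negative ints, EMPTY is an assumed marker for an unused bucket, and the names lp_init, lp_insert and lp_search are made up for this example.

```c
#define B 17          /* number of buckets, also the divisor */
#define EMPTY (-1)    /* assumed marker for an unused bucket */

/* Mark every bucket as unused. */
void lp_init(int table[])
{
    int i;
    for (i = 0; i < B; i++)
        table[i] = EMPTY;
}

/* Insert key starting at its home bucket key % B and probing forward.
   Returns the bucket used, or -1 if the key is a duplicate or the
   table is full. */
int lp_insert(int table[], int key)
{
    int home = key % B, i = home;
    do {
        if (table[i] == EMPTY) {
            table[i] = key;
            return i;
        }
        if (table[i] == key)
            return -1;
        i = (i + 1) % B;
    } while (i != home);
    return -1;
}

/* Search along the same probe sequence; an empty bucket ends the search.
   Returns the bucket holding key, or -1 if it is absent. */
int lp_search(const int table[], int key)
{
    int home = key % B, i = home;
    do {
        if (table[i] == EMPTY)
            return -1;
        if (table[i] == key)
            return i;
        i = (i + 1) % B;
    } while (i != home);
    return -1;
}
```

Inserting 6, 12, 34, 29, 28, 11, 23, 7, 0, 33, 30, 45 in that order places 45 in bucket 2, after it probes buckets 11 through 16 and then wraps around past buckets 0 and 1.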
 Identifiers tend to cluster together
 Adjacent clusters tend to coalesce
 Both effects increase the search time
Identifiers   Binary representation
a0            100 000
a1            100 001
b0            101 000
b1            101 001
c0            110 000
c1            110 001
c2            110 010
c3            110 011
Example: M (# of pages)=4, P (page capacity)=2
Allocation: lower order two bits

Figure 8.8: Some identifiers requiring 3 bits per character (p.414)
Figure 8.9: A trie to hold identifiers (p.415)
Read it in reverse order.
c5: 110 101
c1: 110 001
 We need to consider some issues!
◦ Skewed Tree,
◦ Access time increased.
 Fagin et al. proposed extendible hashing to
solve the above problems.
◦ Ronald Fagin, Jürg Nievergelt, Nicholas
Pippenger, and H. Raymond Strong, Extendible
Hashing - A Fast Access Method for Dynamic
Files, ACM Transactions on Database Systems,
4(3):315-344, 1979.
 A directory is a table of pointers to pages.
 The directory has k bits to index 2^k entries.
 We could use a hash function to get the
address of the directory entry, and then find the
contents in the page it points to.
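As a small illustration of indexing 2^k directory entries with k bits, the sketch below takes the low-order k bits of an already hashed key, in the spirit of the "lower order two bits" allocation shown earlier. The name dir_index is an assumption, and k is assumed to be smaller than the number of bits in an unsigned int.

```c
/* Index into a directory of 2^k entries: keep the low-order k bits
   of the hashed key. Assumes k is less than the width of unsigned int. */
unsigned int dir_index(unsigned int hashed_key, unsigned int k)
{
    return hashed_key & ((1u << k) - 1u);
}
```

For example, c2 is 110 010 in binary (50 in decimal), so with k = 2 it falls in directory entry 2 (binary 10).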
The directory of
the three tries of
Figure 8.9
The directory will grow very large if the hash
function clusters keys. Therefore, we need to
adopt a uniform hash function to translate the
bit sequence of each key into a random bit
sequence.
Moreover, we need a family of uniform
hash functions, since the directory will
grow.
