A-Level 14 Presentation - Compression, Encryption and Hashing
A-level
Compression,
encryption and hashing
teachcomputerscience.com
Lesson Objectives
Students will learn about:
▪ Why compressing files is important.
▪ How text, image, audio, and video files are compressed.
▪ Effects of compressing a file.
▪ What the various file formats are.
▪ Compression algorithms: Run-length encoding and Huffman
coding.
▪ Encryption: Symmetric and Asymmetric.
▪ Hashing, digital certificates, and digital signatures.
1.
Content
Introduction
▪ File handling is one of the primary functions of a computer system.
▪ Based on the type of data that needs to be stored, several types of file
formats are available.
▪ Each file format occupies a certain amount of storage space.
▪ A good-quality image file occupies around 1 MB. A video file stores
25 frames per second, so it occupies a large amount of storage space.
Compression methods are therefore used to reduce the size of files.
▪ Compression is also helpful in reducing the download time of image,
audio, and video files from the Internet.
Lossless compression
▪ When the file is compressed, the quality of the image remains the
same.
▪ The image can be reconstructed into its original form.
▪ Lossless compression is used when all of the information is important
and none of it can be lost.
Lossless compression
▪ Let us consider a text file with the following sentence:
▪ “See a pin and pick it up, all the day you'll have good luck; see a pin
and let it lie, bad luck you'll have all day”.
▪ This text file can be compressed by making a table of the words it
contains:
Index  Word      Index  Word
1      see       10     day
2      a         11     you'll
3      pin       12     have
4      and       13     good
5      pick      14     luck
6      it        15     let
7      up        16     lie
8      all       17     bad
9      the
▪ The sentence can be coded in the form of numbers in the table and
stored in the computer:
▪ 1 2 3 4 5 6 7 8 9 10 11 12 13 14 1 2 3 4 15 6 16 17 14 11 12 8 10
▪ This saves memory by using codes for words that are repeated.
▪ With the code and the index table, the complete sentence can be
recreated.
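The word-table idea above can be sketched in a few lines of Python. This is only an illustration, not the method used by any particular file format: the function names `build_index` and `decode` are made up for this example, and the sketch drops punctuation and capital letters, which a real lossless scheme would also have to store.

```python
import re

def build_index(text):
    """Give each distinct word an index (1, 2, 3, ...) in order of first use
    and return the index table together with the coded sentence."""
    words = re.findall(r"[a-z']+", text.lower())  # assumption: ignore punctuation and case
    table = {}
    codes = []
    for word in words:
        if word not in table:
            table[word] = len(table) + 1
        codes.append(table[word])
    return table, codes

def decode(table, codes):
    """Rebuild the sequence of words from the index table and the codes."""
    lookup = {index: word for word, index in table.items()}
    return " ".join(lookup[code] for code in codes)

sentence = ("See a pin and pick it up, all the day you'll have good luck; "
            "see a pin and let it lie, bad luck you'll have all day")
table, codes = build_index(sentence)
print(codes)
print(decode(table, codes))
```

The 27 words of the sentence are stored as 27 small numbers plus a table of 17 words, so the saving grows as more words are repeated.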
Lossy compression
▪ When a file is compressed, the unnecessary bits of information are
removed permanently, so the original file cannot be rebuilt exactly.
▪ The removed information is the part least likely to be noticed by
humans.
▪ This type of compression is used for photographs where the
information to be compressed cannot be predicted.
Videos
▪ Digital videos are created by playing a series of images at high speed.
▪ A typical HD video has a frame rate of 60 fps.
▪ Advanced video standards support up to 300 fps. The sampling rate
of a video is given in frames per second (fps), which can also be
expressed in hertz (Hz).
▪ Video files also have a bit rate, which determines the quality of the
audio and image.
Huffman coding
▪ A compression technique used to reduce the number of bits that
represents each letter.
▪ A binary tree is used to encode letters.
▪ A binary tree is a data structure made of nodes and is constructed
based on hierarchy. A parent node in a binary tree has up to two
child nodes.
Huffman coding
▪ In ASCII coding, each letter is represented using 7 bits.
▪ In Huffman coding, each letter is represented with a different
number of bits.
▪ The most frequently appearing letters are represented with fewer
bits.
▪ The number of bits required to store information is reduced.
Huffman coding
▪ Consider the sentence: Betty ate butter.
▪ The frequency of characters in this sentence is shown in the table.
▪ There are 17 characters in total (including the two spaces and the
full stop).
▪ Therefore, the total number of bits used to represent their ASCII
codes is: 17 × 7 = 119 bits.
Letter     A  B  E  R  T  U  Y  Space  .
Frequency  1  2  3  1  5  1  1  2      1
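Huffman coding
▪ Consider the sentence: Betty ate butter.
▪ Each letter is now assigned a binary value:
[Figure: Huffman tree for the sentence. Each branch is labelled 0 or 1.
T and E sit two levels below the root, B and Space (Sp.) three levels
below, and A, R, U and Y four levels below.]
Letter     A  B  E  R  T  U  Y  Space
Frequency  1  2  3  1  5  1  1  2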
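Huffman coding
▪ The total number of bits is found by multiplying the frequency of each
letter by the length of its code and adding the results. In the tree, T
and E have 2-bit codes, B and Space have 3-bit codes, and A, R, U and Y
have 4-bit codes:
(5 × 2) + (3 × 2) + (2 × 3) + (2 × 3) + (1 × 4) + (1 × 4) + (1 × 4) +
(1 × 4) = 44 bits.
▪ The 16 letters and spaces in the table need 16 × 7 = 112 bits in
ASCII, so Huffman coding saves 112 - 44 = 68 bits.
Letter     A  B  E  R  T  U  Y  Space
Frequency  1  2  3  1  5  1  1  2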
Step 1: [Figure: partial tree containing the leaves A, R, U and Y.]
Step 2: [Figure: partial tree with B and Space (Sp.) added above A, R, U
and Y.]
Step 3: [Figure: partial tree with E added above B, Space, A, R, U and Y.]
Step 5: [Figure: completed tree, with T added and all branches joined at
the root.]
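The tree-building steps can be checked with a short Python sketch. It is only an illustration: the function name `huffman_code_lengths` is invented here, and the frequencies are copied from the table above (with Space = 2). When two nodes have the same weight, a different tie-break gives a differently shaped tree, but the total number of bits stays the same.

```python
import heapq

def huffman_code_lengths(frequencies):
    """Build a Huffman tree bottom-up and return {symbol: code length in bits}."""
    # Each heap entry is (weight, tie_breaker, {symbol: depth so far}).
    heap = [(weight, i, {symbol: 0})
            for i, (symbol, weight) in enumerate(sorted(frequencies.items()))]
    heapq.heapify(heap)
    next_id = len(heap)
    while len(heap) > 1:
        # Join the two lightest nodes; every leaf below them moves one level down.
        w1, _, left = heapq.heappop(heap)
        w2, _, right = heapq.heappop(heap)
        merged = {symbol: depth + 1 for symbol, depth in {**left, **right}.items()}
        heapq.heappush(heap, (w1 + w2, next_id, merged))
        next_id += 1
    return heap[0][2]

# Frequencies from the table for "Betty ate butter" (full stop left out).
freq = {"A": 1, "B": 2, "E": 3, "R": 1, "T": 5, "U": 1, "Y": 1, " ": 2}
lengths = huffman_code_lengths(freq)
total_bits = sum(freq[s] * lengths[s] for s in freq)
ascii_bits = 7 * sum(freq.values())
print(lengths)
print(total_bits, ascii_bits - total_bits)
```

For these frequencies the total is 44 bits, against 112 bits for the same 16 characters in 7-bit ASCII.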
Encryption
▪ Encryption is the process of changing data into another form or code
so that only people with access to a secret key can read it. For
others, the message will not be in a readable form.
▪ This technique is used in wireless networks to ensure security.
▪ It is also used in HTTPS, the secure version of HTTP. The inputs from
the user are encrypted to offer a secure online experience during
banking, shopping, etc.
Caesar cypher
▪ A basic encryption algorithm is the Caesar cypher, where the letters of
the alphabet are shifted by a known amount.
▪ Example: a Caesar code in which the letters are shifted by 5 places.
▪ Using this code, “INITIATE PLAN A” will be coded as “NSNYNFYJ UQFS F”.
▪ The secret information used to encrypt or decrypt the message is called a key.
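A Caesar shift is easy to try out in Python. The sketch below is a minimal illustration; the function name `caesar` is chosen for this example, and it assumes that only the letters A to Z are shifted while spaces and other characters are left alone.

```python
def caesar(text, shift):
    """Shift each letter A-Z / a-z by `shift` places, wrapping round at Z."""
    result = []
    for ch in text:
        if ch.isascii() and ch.isalpha():
            base = ord("A") if ch.isupper() else ord("a")
            result.append(chr((ord(ch) - base + shift) % 26 + base))
        else:
            result.append(ch)  # spaces and punctuation are not encrypted
    return "".join(result)

cipher_text = caesar("INITIATE PLAN A", 5)
print(cipher_text)              # NSNYNFYJ UQFS F
print(caesar(cipher_text, -5))  # INITIATE PLAN A
```

Here the key is the shift value 5. Decryption uses the same key in the opposite direction.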
▪ The Vernam cypher works with the ASCII codes of characters. Each ASCII
code is taken in binary form.
▪ The one-time key is also taken in binary form.
▪ An XOR operation is performed between the ASCII codes and the one-time
key.
▪ The key is completely random, and its length is equal to or greater
than the length of the original message.
▪ To decrypt the cypher text, an XOR operation is performed between the
cypher text and the same one-time key.
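The XOR steps above can be sketched as follows. This is a minimal illustration, not a secure implementation: the function name `xor_bytes` is made up for this example, and Python's `secrets` module is assumed to be an acceptable source of a random one-time key.

```python
import secrets

def xor_bytes(data, key):
    """XOR each byte of `data` with the byte of `key` in the same position."""
    if len(key) < len(data):
        raise ValueError("the one-time key must be at least as long as the message")
    return bytes(d ^ k for d, k in zip(data, key))

message = "INITIATE PLAN A".encode("ascii")   # ASCII codes of the characters
key = secrets.token_bytes(len(message))       # random one-time key
cipher_text = xor_bytes(message, key)         # encrypt
plain_text = xor_bytes(cipher_text, key)      # decrypt with the same key
print(plain_text.decode("ascii"))
```

The same function encrypts and decrypts because applying XOR twice with the same key restores the original bits.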
Encryption: Symmetric
Step 1: Method used to generate and distribute key
Sender and receiver: both choose the same encryption algorithm.
Ex: 13XMOD5
Encryption: Asymmetric
Keys
▪ Public keys: Public keys are available to all users and are used to
encrypt a message.
▪ Private keys: Private keys, which are different from public keys, are
only available to the intended recipient. These keys are used to decrypt
the message.
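[Figure: Alice (sender) encrypts the message “Welcome Bob!” with Bob’s
public key. The encrypted message (for example “Ergfh34y5u1”) is
transmitted. Bob (receiver) decrypts it with his private key to recover
“Welcome Bob!”. A key-making algorithm uses a large number to produce
Bob’s public key.]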
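The public-key and private-key roles can be tried out with a toy version of RSA. The slides do not name an algorithm, so RSA is an assumption here, and the key is built from two known Mersenne primes purely for illustration. All names (`P`, `Q`, `encrypt`, `decrypt`) are made up for this sketch, which has no padding and is far too weak for real use.

```python
# Toy key generation: two known primes play the part of the "large number".
P = 2**61 - 1
Q = 2**89 - 1
N = P * Q                               # part of both keys
E = 65537                               # public exponent
D = pow(E, -1, (P - 1) * (Q - 1))       # private exponent (needs Python 3.8+)

bob_public_key = (E, N)
bob_private_key = (D, N)

def encrypt(message, public_key):
    """Turn the message bytes into a number and encrypt it with the public key."""
    e, n = public_key
    m = int.from_bytes(message, "big")
    if m >= n:
        raise ValueError("message too long for this toy key")
    return pow(m, e, n)

def decrypt(cipher, private_key, length):
    """Decrypt with the private key and turn the number back into bytes."""
    d, n = private_key
    return pow(cipher, d, n).to_bytes(length, "big")

message = b"Welcome Bob!"
cipher = encrypt(message, bob_public_key)          # Alice uses Bob's public key
print(decrypt(cipher, bob_private_key, len(message)))  # only Bob can do this
```

Anyone can encrypt with Bob's public key, but recovering the message needs the private exponent, which only Bob holds.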
Hashing
▪ A hashing function maps input of arbitrary length to a fixed-length
(usually shorter) output.
▪ A hashing algorithm converts a text message into a string of hexadecimal
characters.
▪ Hashing functions are one-way functions. A hashed message cannot be
converted back to the original message.
▪ This is widely used to protect stored passwords and PINs from hackers. To
verify a password, the password entered by a user is passed through the
hash function. The result is compared with the stored hash to grant access.
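The password check described above can be sketched with Python's `hashlib`. This is a simplified illustration: the function names and the example password are made up, SHA-256 is chosen only because it is always available in `hashlib`, and real systems also add a random salt and use a deliberately slow hash.

```python
import hashlib
import hmac

def hash_password(password):
    """Return the hash of a password as a string of hexadecimal characters."""
    return hashlib.sha256(password.encode("utf-8")).hexdigest()

def check_password(entered, stored_hash):
    """Hash the entered password and compare it with the stored hash."""
    return hmac.compare_digest(hash_password(entered), stored_hash)

stored_hash = hash_password("correct horse")   # only the hash is stored
print(check_password("correct horse", stored_hash))
print(check_password("wrong guess", stored_hash))
```

The system never needs to store or recover the password itself, only to repeat the hash and compare.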
Hashing
▪ Example: A 128-bit string is generated for any message that is hashed
using the MD4 algorithm. MD4 is a cryptographic hash function.
▪ The text “Hello World!!! Welcome.” is converted to
“EBA941A5FD543A15919B803743868151”.
▪ The same message without a full stop produces a different value in the
MD4 algorithm, “92fc661dd222843f25f6c02517299a79”.
▪ A small change to the message changes the hash completely, so it is
difficult to work back from a hash to the original message.
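The effect of removing the full stop can be reproduced in Python. MD4 is not available in every Python installation, so this sketch swaps in SHA-256 from `hashlib`. The digests are therefore 256 bits long and will not match the MD4 values quoted above.

```python
import hashlib

with_stop = hashlib.sha256(b"Hello World!!! Welcome.").hexdigest()
without_stop = hashlib.sha256(b"Hello World!!! Welcome").hexdigest()

print(with_stop)
print(without_stop)
print(len(with_stop), len(without_stop))  # both digests have the same length
```

Both outputs have the same fixed length, yet they share no obvious pattern even though the inputs differ by one character.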
Digital signature
▪ Online documents are authenticated using digital signatures.
▪ To create a digital signature, we start with the hash total.
▪ The hash total is a value calculated by applying a hash function to
the message.
▪ The sender encrypts the hash total using their private key to create a
digital signature. This signature is combined with the original message.
▪ Now, this combined message is encrypted using the receiver’s public
key.
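[Figure: A hash total is calculated from the original message and
encrypted with the sender’s private key to give the digital signature.
The original message and the signature are then encrypted with the
receiver’s public key to give the encrypted message with signature.]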
Digital signature
▪ The receiver decrypts the message using their private key to obtain the
original message along with an encrypted digital signature.
▪ To decrypt the digital signature, the sender’s public key is used.
▪ The sender’s hash total is obtained by decrypting the digital signature.
▪ A hash total is also calculated by the receiver using the original
message.
▪ The two hash totals are compared. If they are the same, the message is
a genuine one without any modifications.
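[Figure: The receiver decrypts the encrypted message with signature using
the receiver’s private key, giving the original message and the
signature. A hash total is calculated from the original message. The
signature is decrypted with the sender’s public key to give the sender’s
hash total. If the two hash totals are equal, the message is genuine.]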
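The signing and checking steps can be sketched with the same kind of toy RSA key. RSA and SHA-256 are assumptions made for this illustration, since the slides do not name the algorithms, and the names `hash_total`, `sign` and `is_genuine` are invented here. The extra step of encrypting the whole message with the receiver's public key is left out, and the key is far too small for real use.

```python
import hashlib

# Toy sender key pair built from two known primes (illustration only).
P, Q = 2**61 - 1, 2**89 - 1
N = P * Q
E = 65537                               # sender's public exponent
D = pow(E, -1, (P - 1) * (Q - 1))       # sender's private exponent (Python 3.8+)

def hash_total(message):
    """Hash the message and reduce it to a number smaller than N."""
    return int.from_bytes(hashlib.sha256(message).digest(), "big") % N

def sign(message):
    """Sender: encrypt the hash total with the private key."""
    return pow(hash_total(message), D, N)

def is_genuine(message, signature):
    """Receiver: decrypt the signature with the sender's public key and
    compare it with a freshly calculated hash total."""
    return pow(signature, E, N) == hash_total(message)

message = b"Transfer 10 pounds to Bob"
signature = sign(message)
print(is_genuine(message, signature))                        # unchanged message
print(is_genuine(b"Transfer 99 pounds to Bob", signature))   # modified message
```

Changing even one character of the message changes its hash total, so the comparison fails and the receiver knows the message is not genuine.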
Digital signature
▪ If the original message has been modified, the hash total calculated
by the receiver will be different.
▪ To improve security, the original date and time can also be included in
the original message.
▪ A digital certificate ensures that the sender’s public key belongs to a
trusted source.
▪ Without a certificate, someone could sign a message with their own
bogus private key, pass off the matching public key as the sender’s, and
claim to be that sender.
Digital Certificates
▪ Certificate authorities such as Verisign issue certificates to websites
and senders, confirming that a public key is correct.
▪ The certificate contains the name of the sender, their public key, and
an expiry date, along with a digital signature of the certificate
authority.
2.
Activity
Activity-1
Duration: 15 minutes
1. Use Huffman coding to create a Huffman tree for the sentence: GOOD
MORNING GORDON. Also, state the character coding for each character.
Letter
Frequency
Binary value
2. Using Huffman coding, how many bits have you saved?
Activity-2
Duration: 15 minutes
3.
End of topic questions
9. What is meant by the terms public key and private key?
10. Use the encryption algorithm 5XMOD11 to distribute keys between a
sender and receiver.
11. Using a flowchart, explain how a receiver finds out whether the
received encrypted message with a signature is genuine or not.