AI_UNIT_3
SYLLABUS:
Introduction to machine learning – Linear Regression Models: Least
squares, single & multiple variables, Bayesian linear regression,
gradient descent, Linear Classification Models: Discriminant function –
Probabilistic discriminative model - Logistic regression, Probabilistic
generative model – Naive Bayes, Maximum margin classifier – Support
vector machine, Decision Tree, Random forests
PART A
1. Define Machine Learning.
• Arthur Samuel, an early American leader in the field of computer gaming
and artificial intelligence, coined the term “Machine Learning” in 1959
while at IBM.
• He defined machine learning as “the field of study that gives computers
the ability to learn without being explicitly programmed”.
• Machine learning is programming computers to optimize a performance
criterion using example data or past experience. The model may be
predictive to make predictions in the future, or descriptive to gain
knowledge from data.
CS 3491 – ARTIFICIAL INTELLIGENCE AND MACHINE LEARNING APEC
9. Define Classification.
• The Classification algorithm is a Supervised Learning technique that is
used to identify the category of new observations on the basis of training
data.
• In Classification, a program learns from the given dataset or observations
and then classifies new observations into a number of classes or groups,
such as Yes or No, 0 or 1, Spam or Not Spam, cat or dog, etc. Classes can
also be called targets/labels or categories.
• A simple linear regression line has an equation of the form
Y = mX + c
Where,
m: Slope
c: y-intercept
• The regression prediction equation is
Y′ = bX + a
where
Y′ represents the predicted value;
X represents the known value;
b and a represent numbers calculated from the original correlation
analysis
PART B
1.1.3 Examples
1.1.3.1 Handwriting recognition learning problem
• Task T : Recognizing and classifying handwritten words
within images
• Performance P : Percent of words correctly classified
• Training experience E : A dataset of handwritten words with
given classifications
1.1.3.2 A robot driving learning problem
• Task T : Driving on highways using vision sensors
• Performance P : Average distance traveled before an error
• Training experience E : A sequence of images and steering
commands recorded while observing a human driver
1.3.1 Classification:
• The Classification algorithm is a Supervised Learning technique
that is used to identify the category of new observations on the
basis of training data.
• In Classification, a program learns from the given dataset or
observations and then classifies new observations into a number
of classes or groups, such as Yes or No, 0 or 1, Spam or Not Spam,
cat or dog, etc. Classes can also be called targets/labels or categories.
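As an illustrative sketch (the data, class names, and nearest-mean rule below are assumed for illustration, not part of the syllabus), a classifier can assign a new observation to the class whose training mean is closest:

```python
# Minimal nearest-mean classifier on toy 1-D data (illustrative sketch;
# the counts and labels here are made up, not from the notes).

def nearest_mean_classify(x, class_means):
    """Assign x to the class whose mean is closest."""
    return min(class_means, key=lambda label: abs(x - class_means[label]))

# Training data: counts of "spam trigger" words per message
spam = [8, 9, 11, 10]   # labeled Spam
ham = [0, 1, 2, 1]      # labeled Not Spam

class_means = {
    "Spam": sum(spam) / len(spam),        # 9.5
    "Not Spam": sum(ham) / len(ham),      # 1.0
}

print(nearest_mean_classify(9, class_means))   # prints Spam
print(nearest_mean_classify(1, class_means))   # prints Not Spam
```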
1.3.2 Regression:
• Regression is a supervised learning technique which helps in
finding the correlation between variables and enables us to
predict the continuous output variable based on one or more
predictor variables.
1.3.3 Clustering:
• Clustering or cluster analysis is a machine learning technique,
which groups the unlabeled dataset.
• It can be defined as "A way of grouping the data points into
different clusters, consisting of similar data points. The objects
with the possible similarities remain in a group that has less or
no similarities with another group."
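The grouping idea above can be sketched with a tiny 1-D k-means loop (the points and starting centers are assumed toy values, not from the notes):

```python
# Tiny 1-D k-means sketch: alternate between assigning each point to its
# nearest center and moving each center to the mean of its cluster.

def kmeans_1d(points, centers, iters=10):
    for _ in range(iters):
        # Assignment step: attach each point to its nearest center
        clusters = [[] for _ in centers]
        for p in points:
            idx = min(range(len(centers)), key=lambda i: abs(p - centers[i]))
            clusters[idx].append(p)
        # Update step: move each center to the mean of its cluster
        centers = [sum(c) / len(c) if c else centers[i]
                   for i, c in enumerate(clusters)]
    return centers, clusters

points = [1.0, 1.2, 0.8, 9.0, 9.5, 10.1]
centers, clusters = kmeans_1d(points, centers=[0.0, 5.0])
print(centers)   # roughly [1.0, 9.53]: two well-separated groups
```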
• A simple linear regression line has an equation of the form
Y = mX + c
Where,
m: Slope
c: y-intercept
• The regression prediction equation is
Y′ = bX + a
where
Y′ represents the predicted value;
X represents the known value;
b and a represent numbers calculated from the original correlation
analysis
Step 6: Calculate the slope and the y-intercept using the formulas
# Calculating 'm' and 'c'
num = 0
denom = 0
for i in range(n):
    num += (X[i] - mean_x) * (Y[i] - mean_y)
    denom += (X[i] - mean_x) ** 2
m = num / denom
c = mean_y - (m * mean_x)

# Printing coefficients
print("Coefficients")
print(m, c)
Output:
Mathematical Approach:
Residual/Error = Actual value − Predicted value
Sum of Residuals/Errors = Σ(Actual − Predicted)
Sum of Squared Residuals/Errors = Σ(Actual − Predicted)²
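The residual arithmetic above can be checked with a short sketch (the actual and predicted values are made-up toy numbers):

```python
# Residuals and sum of squared errors for toy actual/predicted values
# (the numbers are assumed for illustration).
actual = [3.0, 5.0, 7.0, 9.0]
predicted = [2.8, 5.4, 6.9, 9.1]

residuals = [a - p for a, p in zip(actual, predicted)]
sse = sum(r ** 2 for r in residuals)
print(round(sse, 4))   # prints 0.22
```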
where
o Posterior: the probability of an event H occurring given that another
event E has already occurred, i.e., P(H | E).
o Prior: the probability of event H before event E is observed, i.e., P(H).
o Likelihood: the probability of observing event E given that H holds,
i.e., P(E | H).
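A short worked example of Bayes' theorem (the probabilities below are assumed toy numbers, not from the notes):

```python
# Bayes' theorem on made-up spam-filter numbers:
# H = "message is spam", E = "message contains the word 'offer'".
p_h = 0.3           # prior P(H)
p_e_given_h = 0.8   # likelihood P(E | H)
p_e_given_not_h = 0.1

# Evidence P(E) by the law of total probability
p_e = p_e_given_h * p_h + p_e_given_not_h * (1 - p_h)

# Posterior P(H | E) = P(E | H) * P(H) / P(E)
p_h_given_e = p_e_given_h * p_h / p_e
print(round(p_h_given_e, 4))   # prints 0.7742
```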
Program
from sklearn.datasets import load_boston
from sklearn.model_selection import train_test_split
from sklearn.metrics import r2_score
from sklearn.linear_model import BayesianRidge
Output
Test Set r2 score : 0.7943355984883815
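The body of the program above was not reproduced here. As a from-scratch sketch of the same idea (the data, alpha, and beta below are assumed toy values, not the Boston housing set), the posterior mean of Bayesian linear regression can be computed in closed form from the prior precision alpha and noise precision beta:

```python
# Posterior mean of Bayesian linear regression, computed in closed form
# for one input feature plus a bias term: w = (alpha*I + beta*Phi^T Phi)^-1
# * beta * Phi^T y.  Toy data; alpha and beta are assumed hyperparameters.
alpha, beta = 1e-3, 10.0            # prior precision, noise precision
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 8.0, 9.9]      # roughly y = 2x

# Design matrix Phi with a bias column: rows [1, x]
phi = [[1.0, x] for x in xs]

# A = alpha*I + beta * Phi^T Phi  (2x2), b = beta * Phi^T y
a00 = alpha + beta * sum(r[0] * r[0] for r in phi)
a01 = beta * sum(r[0] * r[1] for r in phi)
a11 = alpha + beta * sum(r[1] * r[1] for r in phi)
b0 = beta * sum(r[0] * y for r, y in zip(phi, ys))
b1 = beta * sum(r[1] * y for r, y in zip(phi, ys))

# Solve the 2x2 system A w = b for the posterior mean w = (c, m)
det = a00 * a11 - a01 * a01
c = (a11 * b0 - a01 * b1) / det   # intercept
m = (a00 * b1 - a01 * b0) / det   # slope
print(round(c, 3), round(m, 3))   # prints 0.11 1.97
```

With a weak prior (small alpha), the posterior mean is close to the ordinary least-squares fit; a larger alpha would shrink the coefficients toward zero.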
• The goal is to minimize the cost as much as possible in order to find the
best fit line.
Learning Rate
• An update is made for each pair of input and output values. The learning
rate is a scalar factor that controls the size of each update, and the
coefficients are adjusted in the direction that minimizes the error.
• The process is repeated until a minimum sum of squared errors is achieved
or no further improvement is possible.
• Repeat this process until our Cost function is very small (ideally 0).
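The update loop described above can be sketched as follows (the toy data, learning rate, and iteration count are assumed for illustration):

```python
# Gradient descent for simple linear regression y = m*x + c,
# minimizing the mean squared error cost.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [3.0, 5.0, 7.0, 9.0, 11.0]   # exactly y = 2x + 1
m, c = 0.0, 0.0
lr = 0.02                          # learning rate
n = len(xs)

for _ in range(5000):
    # Gradients of the cost with respect to m and c
    grad_m = (-2.0 / n) * sum(x * (y - (m * x + c)) for x, y in zip(xs, ys))
    grad_c = (-2.0 / n) * sum(y - (m * x + c) for x, y in zip(xs, ys))
    # Step in the direction that reduces the cost
    m -= lr * grad_m
    c -= lr * grad_c

print(round(m, 3), round(c, 3))   # prints 2.0 1.0 (converged to y = 2x + 1)
```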
4. Explain in detail about Linear Discriminant Functions and its types. Also
elaborate about logistic regression in detail.
g(x) = w^T x + w0        (3.1)
where w is the weight vector and w0 the bias or threshold weight.
The corresponding decision rule is:
Decide w1 if g(x) > 0 and w2 if g(x) < 0        (3.2)
or
Decide w1 if w^T x > −w0 and w2 otherwise        (3.3)
• In figure 3.7, the hyperplane H divides the feature space into two
half-spaces:
o Decision region R1 for w1
o Decision region R2 for w2.
• The discriminant function g(x) gives an algebraic measure of the
distance from x to the hyperplane.
• One way to express x is
x = xp + r (w / ||w||)        (3.4)
where xp is the normal projection of x onto H, and r is the desired
algebraic distance, which is positive if x is on the positive side and
negative if x is on the negative side. Then, because g(xp) = 0,
g(x) = w^T x + w0 = r ||w||        (3.5)
or
r = g(x) / ||w||        (3.6)
o If w0 = 0, then g(x) has the homogeneous form w^T x, and the
hyperplane passes through the origin
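The signed distance from a point to the hyperplane can be checked with a short sketch (w, w0, and the sample points are assumed values):

```python
# Signed distance from a point x to the hyperplane g(x) = w^T x + w0 = 0.
import math

w = [3.0, 4.0]   # weight vector, ||w|| = 5
w0 = -5.0        # bias / threshold weight

def g(x):
    # Linear discriminant function g(x) = w^T x + w0
    return sum(wi * xi for wi, xi in zip(w, x)) + w0

def signed_distance(x):
    # r = g(x) / ||w||: positive on w1's side, negative on w2's side
    return g(x) / math.sqrt(sum(wi * wi for wi in w))

print(signed_distance([3.0, 4.0]))   # (9 + 16 - 5) / 5 = 4.0
print(signed_distance([0.0, 0.0]))   # -5 / 5 = -1.0
```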
p(C1 | x) = σ(w^T x + w0), where σ(a) = 1 / (1 + exp(−a))        (3.9)
• For the multi-class case, the posterior probability of class Ck is given by a
softmax transformation of a linear function of x:
p(Ck | x) = exp(w_k^T x + w_k0) / Σ_j exp(w_j^T x + w_j0)
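The softmax transformation can be sketched directly (the linear scores below are assumed toy values):

```python
# Softmax transformation of linear scores a_k = w_k^T x + w_k0,
# turning them into class posterior probabilities.
import math

def softmax(scores):
    # Subtract the max score first for numerical stability
    mx = max(scores)
    exps = [math.exp(s - mx) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

probs = softmax([2.0, 1.0, 0.1])
print([round(p, 4) for p in probs])
print(sum(probs))   # probabilities sum to 1
```

Note that the largest score gets the largest probability, and the outputs are positive and sum to one, as class posteriors must.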
Figure 3.10 – Example for Support Vectors