The Naive Bayes classifier is a machine learning algorithm used to classify data into categories. It uses Bayes' Theorem to calculate the probability of each class given the input features. It is called "naive" because it assumes that all features are conditionally independent of each other given the class.
Bayes’ Theorem Formula
The Naive Bayes algorithm is based on Bayes' theorem, which gives the conditional probability of an event A given that another event B has occurred.
P(A \mid B) = \frac{P(B \mid A) \cdot P(A)}{P(B)}
Where:
- P(A|B) = Conditional probability of A given B.
- P(B|A) = Conditional probability of B given A.
- P(A) = Probability of event A.
- P(B) = Probability of event B.
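For many predictors B_1, B_2, ..., B_n that are assumed conditionally independent given A, the posterior probability is proportional to the prior P(A) multiplied by the per-predictor likelihoods:
P(A \mid B_1, B_2, \ldots, B_n) \propto P(A) \cdot P(B_1 \mid A) \cdot P(B_2 \mid A) \cdots P(B_n \mid A)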
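The product rule for several predictors can be checked numerically. The sketch below uses made-up numbers for two hypothetical classes (c1, c2) and two predictors; none of these values come from a real dataset. It multiplies the prior by each likelihood and then normalises so the posteriors sum to 1.

```r
# Hypothetical toy numbers: two classes and two predictors B_1, B_2
prior  <- c(c1 = 0.4, c2 = 0.6)   # P(A) for each class
lik_b1 <- c(c1 = 0.8, c2 = 0.1)   # P(B_1 | A)
lik_b2 <- c(c1 = 0.5, c2 = 0.3)   # P(B_2 | A)

# Unnormalised score: P(A) * P(B_1 | A) * P(B_2 | A)
score <- prior * lik_b1 * lik_b2

# Dividing by the total plays the role of the denominator P(B_1, B_2)
posterior <- score / sum(score)
print(round(posterior, 3))

# The predicted class is the one with the largest posterior
print(names(which.max(posterior)))
```

Because the denominator is the same for every class, it can be skipped when only the most probable class is needed.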
Example Using Bayes’ Theorem
Consider tossing two fair coins. The sample space is {HH, HT, TH, TT}, where H = Head and T = Tail.
We are asked to find the probability that the second coin is a Head given that the first coin is a Tail.
- The event A: Second coin is a Head
- The event B: First coin is a Tail
- P(A | B) is the conditional probability we want to find.
- P(B | A) is the probability of the first coin being a Tail, given that the second coin is a Head (which is 1/2, because the outcome of one coin does not affect the other).
- P(A) is the probability of the second coin being a Head (which is 1/2, because two of the four equally likely outcomes, HH and TH, have a Head on the second coin).
- P(B) is the probability of the first coin being a Tail which is also 1/2.
Now applying Bayes’ Theorem:
P(A \mid B) = \frac{(1/2) \cdot (1/2)}{1/2}= \frac{1/4}{1/2}= \frac{1}{2}= 0.5
Therefore, the probability that the second coin is a Head, given that the first coin is a Tail, is 0.5.
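The same result can be reproduced by enumerating the sample space. This is a minimal sketch that assumes both coins are fair, so each of the four outcomes has probability 1/4; the variable names are illustrative.

```r
# Enumerate the sample space of two fair coin tosses
space <- expand.grid(first = c("H", "T"), second = c("H", "T"),
                     stringsAsFactors = FALSE)
space$p <- 1 / nrow(space)   # each of the 4 outcomes has probability 1/4

A <- space$second == "H"     # event A: second coin is a Head
B <- space$first == "T"      # event B: first coin is a Tail

p_A <- sum(space$p[A])
p_B <- sum(space$p[B])
p_B_given_A <- sum(space$p[A & B]) / p_A

# Bayes' theorem: P(A | B) = P(B | A) * P(A) / P(B)
p_A_given_B <- p_B_given_A * p_A / p_B
print(p_A_given_B)
```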
Implementation of Naive Bayes Classifier
We follow these steps to build and evaluate a Naive Bayes model in R using the Iris dataset.
1. Installing and Loading Required Packages
We install the necessary packages and load them.
- e1071: Contains Naive Bayes classifier (naiveBayes()) and other useful machine learning functions.
- caTools: Provides utilities for data splitting (for training and test sets).
- caret: Simplifies machine learning tasks like training models, evaluating them and creating confusion matrices.
- library(): This function loads the installed packages into the R environment, allowing their functions to be used.
install.packages("e1071")
install.packages("caTools")
install.packages("caret")
library(e1071)
library(caTools)
library(caret)
2. Loading the Dataset
We begin by loading the dataset and checking its structure.
- data(): Loads a built-in dataset into R. Here it loads the iris dataset, which contains sepal and petal length and width measurements for 150 flowers from three Iris species.
- head(): Displays the first few rows (default is 6) of the dataset for a quick overview.
data(iris)
head(iris)
3. Splitting the Dataset
We split the data into training and testing sets using a 70:30 ratio.
- set.seed(): Ensures reproducibility by setting the seed for the random number generator.
- sample.split(): From the caTools package, it splits the data. It takes the vector of class labels (here iris$Species), and the SplitRatio argument defines the proportion of rows used for training (here 70%). It returns a logical vector with one entry per row, where TRUE marks rows in the training set.
- subset(): Creates subsets of the iris dataset, used to generate the train_cl and test_cl datasets based on the split.
set.seed(123)
split <- sample.split(iris$Species, SplitRatio = 0.7)
train_cl <- subset(iris, split == TRUE)
test_cl <- subset(iris, split == FALSE)
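A quick sanity check on the split is worth running, because sample.split() expects a vector of labels rather than a whole data frame. The sketch below assumes caTools is installed and simply confirms that the logical vector has one entry per row and that every species appears in both subsets.

```r
library(caTools)

data(iris)
set.seed(123)

# Pass the label column so the result has one TRUE/FALSE per row
split <- sample.split(iris$Species, SplitRatio = 0.7)
train_cl <- subset(iris, split == TRUE)
test_cl <- subset(iris, split == FALSE)

# The split vector should be as long as the dataset
print(length(split) == nrow(iris))

# Class counts in each subset
print(table(train_cl$Species))
print(table(test_cl$Species))
```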
4. Scaling the Features
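We standardize the numerical features. Note that the model in the next step is trained on the unscaled train_cl data, so the scaled copies are shown for illustration. Gaussian Naive Bayes estimates a separate mean and standard deviation for each feature, so standardizing should not change its predictions.
- scale(): Standardizes the numeric columns (1 to 4, corresponding to the features of the iris dataset) so that each feature has a mean of 0 and a standard deviation of 1. The test set is scaled with the training set's means and standard deviations so that both sets share the same transformation.
train_scale <- scale(train_cl[, 1:4])
test_scale <- scale(test_cl[, 1:4], center = attr(train_scale, "scaled:center"),
                    scale = attr(train_scale, "scaled:scale"))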
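The effect of scale() can be verified directly. The following sketch is self-contained and uses a plain random 70:30 split from base R (an assumption made only to avoid extra packages); it reuses the training means and standard deviations for the test set and checks the resulting column statistics.

```r
data(iris)
set.seed(123)

# Simple random 70:30 split, used here only to keep the sketch self-contained
idx <- sample(nrow(iris), size = round(0.7 * nrow(iris)))
train_x <- iris[idx, 1:4]
test_x <- iris[-idx, 1:4]

train_scale <- scale(train_x)

# Reuse the training means and standard deviations for the test set
test_scale <- scale(test_x,
                    center = attr(train_scale, "scaled:center"),
                    scale = attr(train_scale, "scaled:scale"))

print(round(colMeans(train_scale), 10))       # training means after scaling
print(round(apply(train_scale, 2, sd), 10))   # training standard deviations
print(round(colMeans(test_scale), 3))         # test means, generally not exactly 0
```

Scaling the test set with its own statistics would apply a different transformation to each set, which is why the training parameters are reused.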
5. Training the Naive Bayes Model
We train the Naive Bayes classifier using the training set.
- naiveBayes(): From the e1071 package, this function trains the Naive Bayes classifier. The Species ~ . formula indicates that we are predicting Species based on all other variables in the dataset (the dot represents all other columns).
- The trained model is stored in the classifier_cl variable.
classifier_cl <- naiveBayes(Species ~ ., data = train_cl)
classifier_cl
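For numeric features, naiveBayes() models each feature within each class as a normal distribution described by a mean and a standard deviation. The sketch below reproduces that idea by hand in base R for a single feature. It uses the full iris dataset rather than train_cl to stay self-contained, and the measurement x = 4.5 is a hypothetical flower chosen for illustration.

```r
data(iris)

# Class priors: the share of each species among the rows
priors <- prop.table(table(iris$Species))

# Per-class mean and standard deviation of one feature
mu <- tapply(iris$Petal.Length, iris$Species, mean)
sigma <- tapply(iris$Petal.Length, iris$Species, sd)

# Score a hypothetical flower with Petal.Length = 4.5
x <- 4.5
score <- as.numeric(priors) * dnorm(x, mean = mu, sd = sigma)

# Normalise so the class posteriors sum to 1
posterior <- setNames(as.numeric(score / sum(score)), names(mu))
print(round(posterior, 3))
```

With several features, the per-feature densities are multiplied together for each class before normalising.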
6. Making Predictions
We use the trained model to predict species on the test data.
- predict(): Uses the trained classifier to predict the target variable (Species) for the test data (test_cl). The predicted values are stored in the y_pred variable.
y_pred <- predict(classifier_cl, newdata = test_cl)
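Besides class labels, predict() for a naiveBayes model can return the posterior probability of each class through type = "raw". The sketch below assumes e1071 is installed and fits the model on the full iris dataset purely to stay self-contained; the three rows scored are picked arbitrarily, one from each species block.

```r
library(e1071)

data(iris)

# Fit on the full dataset only to keep this sketch self-contained
model <- naiveBayes(Species ~ ., data = iris)
new_rows <- iris[c(1, 51, 101), ]

# type = "raw" returns one posterior probability per class and row
probs <- predict(model, newdata = new_rows, type = "raw")
print(round(probs, 3))

# The default (type = "class") returns the most probable class
labels <- predict(model, newdata = new_rows)
print(labels)
```

The raw probabilities are useful when a decision threshold or a ranking is needed rather than a hard label.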
7. Evaluating the Model
We create a confusion matrix and evaluate the model performance.
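- table(): Creates a confusion matrix by cross-tabulating the predicted class labels (y_pred, in rows) against the true class labels (test_cl$Species, in columns), which is the layout confusionMatrix() expects.
- confusionMatrix(): From the caret package, this function calculates accuracy, kappa and per-class statistics such as sensitivity and specificity from the confusion matrix. Passing mode = "prec_recall" reports precision, recall and F1-score instead.
cm <- table(y_pred, test_cl$Species)
confusionMatrix(cm)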
The output shows that the Naive Bayes model achieved about 95% accuracy, with strong performance across all classes; the few misclassifications occurred between versicolor and virginica. The exact figures depend on the random split.
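The headline metrics can also be computed by hand from a confusion matrix, which helps when reading the confusionMatrix() report. The labels below are hypothetical and exist only to illustrate the arithmetic; they are not results from the iris model.

```r
# Hypothetical true and predicted labels, for illustration only
actual <- factor(c("a", "a", "a", "b", "b", "b", "c", "c", "c", "c"))
predicted <- factor(c("a", "a", "b", "b", "b", "b", "c", "c", "b", "c"),
                    levels = levels(actual))

# Rows are predictions, columns are the true classes
cm <- table(Predicted = predicted, Actual = actual)
print(cm)

# Accuracy: correctly classified rows divided by all rows
accuracy <- sum(diag(cm)) / sum(cm)

# Recall per class: correct predictions / rows that truly belong to the class
recall <- diag(cm) / colSums(cm)

# Precision per class: correct predictions / rows predicted as the class
precision <- diag(cm) / rowSums(cm)

print(accuracy)
print(round(recall, 3))
print(round(precision, 3))
```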