In machine learning, evaluating model performance is critical. Three widely used metrics—Precision, Recall, and F1-Score—help assess the quality of classification models. Here's what each metric represents:
- Recall: Measures the proportion of actual positive cases correctly identified. Also known as sensitivity or true positive rate (TPR).
  Formula: $$\text{Recall} = \frac{\text{True Positives (TP)}}{\text{True Positives (TP)} + \text{False Negatives (FN)}}$$
- Precision: Represents the proportion of positive predictions that are correct.
  Formula: $$\text{Precision} = \frac{\text{True Positives (TP)}}{\text{True Positives (TP)} + \text{False Positives (FP)}}$$
- F1-Score: The harmonic mean of Precision and Recall, offering a balance between the two.
  Formula: $$\text{F1-Score} = 2 \times \frac{\text{Precision} \times \text{Recall}}{\text{Precision} + \text{Recall}}$$
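To make the three formulas concrete, here is a minimal base R sketch that counts TP, FP, and FN by hand and plugs them in. It reuses the toy label vectors from the examples in this article and assumes that 1 marks the positive class; the variable names (tp, fp, fn, precision_manual, and so on) are illustrative only.

```r
# Toy labels reused from the examples in this article (assumed: 1 = positive, 0 = negative)
predicted <- c(1, 1, 1, 0, 0)
actual    <- c(1, 0, 1, 1, 1)

# Count the confusion-matrix cells for the positive class
tp <- sum(predicted == 1 & actual == 1)  # true positives
fp <- sum(predicted == 1 & actual == 0)  # false positives
fn <- sum(predicted == 0 & actual == 1)  # false negatives

# Apply the three formulas directly
precision_manual <- tp / (tp + fp)
recall_manual    <- tp / (tp + fn)
f1_manual        <- 2 * precision_manual * recall_manual /
  (precision_manual + recall_manual)

cat("TP:", tp, "FP:", fp, "FN:", fn, "\n")
cat("Precision:", precision_manual, "\n")  # 2 / 3
cat("Recall:", recall_manual, "\n")        # 2 / 4
cat("F1-Score:", f1_manual, "\n")          # 4 / 7
```

With TP = 2, FP = 1, and FN = 2, this gives a Precision of about 0.667, a Recall of 0.5, and an F1-Score of about 0.571. A hand computation like this is a handy cross-check for package output.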
Calculating Precision, Recall and F1 Score in R
R provides robust tools to compute these metrics easily. Below are step-by-step instructions using the caret and Metrics packages.
Step 1: Install and Load Required Packages
To compute Recall and related metrics, first install the necessary R packages.
install.packages("caret")
install.packages("Metrics")
Load the packages:
library(caret)
library(Metrics)
Step 2: Compute Recall Using the caret Package
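The caret::confusionMatrix() function computes a confusion matrix along with Recall (reported as Sensitivity) and other metrics. By default, caret treats the first factor level as the positive class, which is 0 in the example below, so the Sensitivity it prints is the Recall for class 0. Here's an example: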
# Create predicted and actual class labels
predicted <- c(1, 1, 1, 0, 0)
actual <- c(1, 0, 1, 1, 1)
# Generate confusion matrix
xtab <- table(predicted, actual)
cm <- caret::confusionMatrix(xtab)
print(cm)
Output
Confusion Matrix and Statistics
         actual
predicted 0 1
        0 0 2
        1 1 2
Accuracy : 0.4
95% CI : (0.0527, 0.8534)
No Information Rate : 0.8
P-Value [Acc > NIR] : 0.9933
Kappa : -0.3636
Mcnemar's Test P-Value : 1.0000
Sensitivity : 0.0000
Specificity : 0.5000
Pos Pred Value : 0.0000
Neg Pred Value : 0.6667
Prevalence : 0.2000
Detection Rate : 0.0000
Detection Prevalence : 0.4000
Balanced Accuracy : 0.2500
'Positive' Class : 0
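Because the output above reports 'Positive' Class : 0, its Sensitivity and Pos Pred Value describe class 0 rather than class 1. One way to get Precision, Recall, and F1 for class 1 is sketched below, using the positive and mode arguments of confusionMatrix(). Treating 1 as the positive class is an assumption about the intended use, and the names pred_f, act_f, and cm1 are illustrative.

```r
library(caret)

predicted <- c(1, 1, 1, 0, 0)
actual    <- c(1, 0, 1, 1, 1)

# Use factors with explicit levels so both classes always appear,
# even if one of them is missing from the predictions
pred_f <- factor(predicted, levels = c(0, 1))
act_f  <- factor(actual, levels = c(0, 1))

# positive = "1" makes class 1 the positive class;
# mode = "prec_recall" reports Precision, Recall and F1 in the printed summary
cm1 <- confusionMatrix(data = pred_f, reference = act_f,
                       positive = "1", mode = "prec_recall")

# For a two-class problem, byClass is a named numeric vector
print(cm1$byClass[c("Precision", "Recall", "F1")])
```

For these labels the three values should come out at roughly 0.667, 0.5, and 0.571, which agrees with applying the formulas by hand to TP = 2, FP = 1, FN = 2.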
Step 3: Compute Metrics Using the Metrics Package
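The Metrics package offers recall(), precision(), and fbeta_score() for quick calculation. With its default beta = 1, fbeta_score() returns the F1-Score. Note that Metrics::f1() is intended for information retrieval: it compares the sets of unique values in the two vectors, so on 0/1 class labels it does not return the classification F1-Score.
# Install Metrics only if it is not already installed
if (!requireNamespace("Metrics", quietly = TRUE)) install.packages("Metrics")
library(Metrics)
# Predicted and actual labels (1 = positive, 0 = negative)
predicted <- c(1, 1, 1, 0, 0)
actual <- c(1, 0, 1, 1, 1)
# Compute Recall (argument order: actual, predicted)
# The Metrics:: prefix avoids clashes with caret's recall() and precision()
recall_score <- Metrics::recall(actual, predicted)
# Compute Precision (argument order: actual, predicted)
precision_score <- Metrics::precision(actual, predicted)
# Compute F1-Score as F-beta with beta = 1 (argument order: actual, predicted)
f1_score <- Metrics::fbeta_score(actual, predicted, beta = 1)
# Print results
cat("Recall:", recall_score, "\n")
cat("Precision:", precision_score, "\n")
cat("F1-Score:", f1_score, "\n")
Output
Recall: 0.5
Precision: 0.6666667
F1-Score: 0.5714286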
Which metric to prioritize depends on the application:
- Recall matters most in applications like disease detection or fraud detection, where missing a positive case is costly.
- Precision matters most in spam filters or recommendation systems, where false positives should be minimized.
- F1-Score balances Precision and Recall in a single number, which makes it more informative than accuracy on imbalanced datasets.
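The point about imbalanced data can be illustrated with a small base R sketch. The labels below are made up for illustration (5 positives among 100 cases, and a hypothetical model that finds only one of them); they are not results from any real model.

```r
# Hypothetical imbalanced data: 5 positives, 95 negatives (made-up numbers)
actual    <- c(rep(1, 5), rep(0, 95))
# A hypothetical model that flags only the first of the five positives
predicted <- c(1, rep(0, 4), rep(0, 95))

tp <- sum(predicted == 1 & actual == 1)  # true positives
fp <- sum(predicted == 1 & actual == 0)  # false positives
fn <- sum(predicted == 0 & actual == 1)  # false negatives

accuracy      <- mean(predicted == actual)
precision_val <- tp / (tp + fp)
recall_val    <- tp / (tp + fn)
f1_val        <- 2 * precision_val * recall_val / (precision_val + recall_val)

cat("Accuracy:", accuracy, "\n")
cat("Precision:", precision_val, "\n")
cat("Recall:", recall_val, "\n")
cat("F1-Score:", f1_val, "\n")
```

Here accuracy is 0.96 even though the model misses four of the five positives. Recall (0.2) and F1-Score (about 0.33) expose the problem, while Precision (1) alone would not.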
R makes it simple to compute Precision, Recall, and F1-Score with packages such as caret and Metrics. These metrics are indispensable for evaluating and improving classification models, particularly binary classifiers.