Social media users frequently encounter abuse, harassment, and insults from other users on most online communication platforms, such as Facebook, Instagram and YouTube. As a result, many users stop expressing their ideas and opinions.
What is the solution?
The solution to this problem is to create an effective model that can identify the kinds of toxicity present in comments, such as threats, obscenity, insults and racism, thereby promoting a peaceful environment for online dialogue.
In this article, we will learn more about toxic comment multi-label classification and create a model that classifies comments into various toxicity labels.
What is Toxic comment classification?
Toxic comment classification is the task of detecting comments or text that contain offensive or hurtful language, such as insults, slurs or threats.
Supervised classification problems can be divided into three groups based on how many categories there are and how many of them can apply to a single sample:
1. Binary classification:
It is a type of supervised machine-learning problem that classifies data into two mutually exclusive groups or categories. The two categories can be classified as true and false, 0 and 1, positive and negative, etc.
In toxic comment classification, the model is trained to predict whether a comment is toxic (class 1) or non-toxic (class 0).
Example:
"I hate you!" Predicted class: Toxic (class 1)
"I like you!" Predicted class: Non-toxic (class 0)
2. Multiclass classification:
It is a type of supervised machine-learning problem that classifies data into three or more groups/categories.
A multiclass classifier for Toxic comment classification is trained to detect various degrees of toxicity in comments, such as mild toxicity, severe toxicity, and non-toxic comments, as opposed to just differentiating between toxic and non-toxic comments (binary classification).
Example:
"I want to kill you!" Predicted class: Severe toxicity
"You are so ugly and unconfident" Predicted class: Mild toxicity
"You are a good person" Predicted class: Non-toxic
3. Multilabel classification:
It is a supervised machine-learning approach in which a single instance can be associated with multiple labels simultaneously. The model can assign zero, one, or more labels to each data sample based on its characteristics.
In the context of toxic comment classification, a comment or text can be labelled with multiple toxicity categories if it contains various forms of harmful language.
Example:
"You're an idiot person, and I hope someone hits you!"
Multiple labels: Offensive language (class 1), Threats (class 1), Hatred (class 1), Non-toxic (class 0)
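To make the difference between the three settings concrete, here is a minimal sketch of how the target for a single comment could be represented in each case. The class names and the helper `active_labels` are assumptions made up for this illustration; they are not part of the dataset or of any library.

```python
# Illustrative label sets (assumed for this sketch only)
MULTICLASS_LEVELS = ["non_toxic", "mild_toxicity", "severe_toxicity"]
MULTILABEL_NAMES = ["offensive_language", "threat", "hatred"]

# Binary: a single 0/1 value per comment
binary_target = 1  # toxic

# Multiclass: exactly one class index out of several mutually exclusive classes
multiclass_target = MULTICLASS_LEVELS.index("severe_toxicity")

# Multilabel: one independent 0/1 flag per label; any number of flags may be 1
multilabel_target = [1, 1, 1]


def active_labels(flags, names):
    """Return the names of the labels whose flag is set to 1."""
    return [name for name, flag in zip(names, flags) if flag == 1]


print(binary_target)
print(MULTICLASS_LEVELS[multiclass_target])
print(active_labels(multilabel_target, MULTILABEL_NAMES))
# A clean comment simply has no active labels
print(active_labels([0, 0, 0], MULTILABEL_NAMES))
```

The multilabel target is the only one of the three in which several categories can be switched on for the same comment at once.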
Toxic Comment Classification using BERT
Let's get started!
About the dataset:
We have a large number of Wikipedia comments which have been labelled by human raters for toxic behaviour. The dataset variables are:
- toxic
- severe_toxic
- obscene
- threat
- insult
- identity_hate
Access the dataset: Toxic Comments dataset
Now, the coding part begins!
Prerequisite
We use PyTorch together with Transformers for a more flexible and intuitive interface for building and training deep learning models
!pip install torch
Transformers for using BERT (Bidirectional Encoder Representations from Transformers)
!pip install transformers
Importing necessary libraries
import numpy as np
import pandas as pd
#data visualisation libraries
import matplotlib.pyplot as plt
import seaborn as sns
from pylab import rcParams
import torch
from torch.utils.data import DataLoader, TensorDataset
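# AdamW comes from torch.optim; the copy shipped with transformers is deprecated
from torch.optim import AdamW
from transformers import BertTokenizer, BertForSequenceClassification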
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score, precision_score, recall_score
#to avoid warnings
import warnings
warnings.filterwarnings('ignore')
Load the datasets
data = pd.read_csv("toxicity.csv")
print(data.head())
Output:
id comment_text toxic \
0 0000997932d777bf Explanation\nWhy the edits made under my usern... 0
1 000103f0d9cfb60f D'aww! He matches this background colour I'm s... 0
2 000113f07ec002fd Hey man, I'm really not trying to edit war. It... 0
3 0001b41b1c6bb37e "\nMore\nI can't make any real suggestions on ... 0
4 0001d958c54c6e35 You, sir, are my hero. Any chance you remember... 0
severe_toxic obscene threat insult identity_hate
0 0 0 0 0 0
1 0 0 0 0 0
2 0 0 0 0 0
3 0 0 0 0 0
4 0 0 0 0 0
Data Visualization to Understand Class Distribution
# Visualizing the class distribution of the 'label' column
column_labels = data.columns.tolist()[2:]
label_counts = data[column_labels].sum().sort_values()
# Create the figure for the plot
plt.figure(figsize=(7, 5))
# Create a horizontal bar plot using Seaborn
ax = sns.barplot(x=label_counts.values,
y=label_counts.index, palette='viridis')
# Add labels and title to the plot
plt.xlabel('Number of Occurrences')
plt.ylabel('Labels')
plt.title('Distribution of Label Occurrences')
# Show the plot
plt.show()
Output:
[Figure: horizontal bar plot, 'Distribution of Label Occurrences']
Checking exact values for each class
data[column_labels].sum().sort_values()
Output:
threat 478
identity_hate 1405
severe_toxic 1595
insult 7877
obscene 8449
toxic 15294
dtype: int64
Toxic and Non-Toxic Data
Let's check whether the data is balanced by creating subsets of toxic and clean comments, and then building a new data frame to visualize how the dataset is distributed.
# Create subsets based on toxic and clean comments
train_toxic = data[data[column_labels].sum(axis=1) > 0]
train_clean = data[data[column_labels].sum(axis=1) == 0]
# Number of toxic and clean comments
num_toxic = len(train_toxic)
num_clean = len(train_clean)
# Create a DataFrame for visualization
plot_data = pd.DataFrame(
{'Category': ['Toxic', 'Clean'], 'Count': [num_toxic, num_clean]})
# Create the figure for the plot
plt.figure(figsize=(7, 5))
# Horizontal bar plot
ax = sns.barplot(x='Count', y='Category', data=plot_data, palette='viridis')
# Add labels and title to the plot
plt.xlabel('Number of Comments')
plt.ylabel('Category')
plt.title('Distribution of Toxic and Clean Comments')
# Tick styling is left at its defaults
ax.tick_params()
# Show the plot
plt.show()
Output:
[Figure: horizontal bar plot, 'Distribution of Toxic and Clean Comments']
We can observe that our dataset is severely imbalanced.
Let's look at the exact numbers of toxic and clean comments so that we can balance the data accordingly.
print(train_toxic.shape)
print(train_clean.shape)
Output:
(16225, 8)
(143346, 8)
There is a huge difference between the number of toxic and clean comments in the dataset.
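The size of that gap can be worked out directly from the numbers printed so far. The following is a small arithmetic sketch; the counts are copied from the outputs above, and the variable names are our own.

```python
# Counts copied from the outputs shown earlier in this article
label_counts = {
    "threat": 478,
    "identity_hate": 1405,
    "severe_toxic": 1595,
    "insult": 7877,
    "obscene": 8449,
    "toxic": 15294,
}
num_toxic = 16225   # comments with at least one label
num_clean = 143346  # comments with no label at all

total = num_toxic + num_clean
toxic_share = num_toxic / total
clean_per_toxic = num_clean / num_toxic
# The label counts sum to more than num_toxic because one comment can carry several labels
labels_per_toxic_comment = sum(label_counts.values()) / num_toxic

print(f"Total comments: {total}")
print(f"Share of toxic comments: {toxic_share:.1%}")
print(f"Clean comments per toxic comment: {clean_per_toxic:.1f}")
print(f"Average labels per toxic comment: {labels_per_toxic_comment:.2f}")
```

Roughly one comment in ten carries at least one toxicity label, and a toxic comment carries a little over two labels on average, which is why the task is treated as multilabel rather than multiclass.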
Handling class imbalance
To handle the imbalanced data, we create a new training set that keeps all 16,225 toxic comments and adds 16,225 randomly sampled clean comments to match.
The new balanced data frame
# Randomly sample 16,225 clean comments to match the number of toxic comments
train_clean_sampled = train_clean.sample(n=16225, random_state=42)
# Combine the toxic and sampled clean comments
dataframe = pd.concat([train_toxic, train_clean_sampled], axis=0)
# Shuffle the data to avoid any order bias during training
dataframe = dataframe.sample(frac=1, random_state=42)
Let's verify with actual figures
print(train_toxic.shape)
print(train_clean_sampled.shape)
print(dataframe.shape)
Output:
(16225, 8)
(16225, 8)
(32450, 8)
Now the dataset is balanced, with exactly equal numbers of toxic and clean comments, and we can proceed to splitting the data and then tokenizing and encoding the comments using BertTokenizer.
Split Data into Training, Validation, and Testing Sets
In this step, we split the data into training, validation, and testing sets. The data is divided into training and testing sets first, and then the testing set is further split into validation and testing sets.
# Split data into training and held-out sets (70% training, 30% held out)
train_texts, test_texts, train_labels, test_labels = train_test_split(
    dataframe['comment_text'], dataframe.iloc[:, 2:], test_size=0.3, random_state=42)
Now, we split the held-out data in half to obtain the testing and validation sets
# testing and validation sets
test_texts, val_texts, test_labels, val_labels = train_test_split(
test_texts, test_labels, test_size=0.5, random_state=42)
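The two-stage split can be easier to follow with a toy example. Below is a minimal pure-Python sketch of the same idea; it is not scikit-learn's implementation, `two_stage_split` is a hypothetical helper, and the fractions and the 1,000-item list are illustrative values only.

```python
import math
import random


def two_stage_split(items, holdout_frac, val_frac_of_holdout, seed=42):
    """Split items into train/val/test by holding out a share, then dividing it."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    # Stage 1: separate the held-out portion from the training portion
    n_holdout = math.ceil(len(shuffled) * holdout_frac)
    holdout, train = shuffled[:n_holdout], shuffled[n_holdout:]
    # Stage 2: divide the held-out portion into validation and test sets
    n_val = math.ceil(len(holdout) * val_frac_of_holdout)
    val, test = holdout[:n_val], holdout[n_val:]
    return train, val, test


train, val, test = two_stage_split(range(1000), holdout_frac=0.25, val_frac_of_holdout=0.5)
print(len(train), len(val), len(test))
```

Because the second split works on the held-out portion only, the three sets never overlap, and the validation and test sets end up the same size when the second fraction is 0.5.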
Now, we will tokenize and encode the comments and labels for the training, testing, and validation sets.
Tokenization and Encoding
We define a 'tokenize_and_encode' function to perform this task
# Tokenize and Encode Function
def tokenize_and_encode(tokenizer, comments, labels, max_length=128):
    # Initialize empty lists to store tokenized inputs and attention masks
    input_ids = []
    attention_masks = []
    # Iterate through each comment in the 'comments' list
    for comment in comments:
        # Tokenize and encode the comment using the BERT tokenizer
        encoded_dict = tokenizer.encode_plus(
            comment,
            # Add special tokens like [CLS] and [SEP]
            add_special_tokens=True,
            # Target length after truncation and padding
            max_length=max_length,
            # Truncate comments that are longer than 'max_length'
            truncation=True,
            # Pad shorter comments to 'max_length' with [PAD] tokens (id 0)
            padding='max_length',
            # Return attention mask to mask padded tokens
            return_attention_mask=True,
            # Return PyTorch tensors
            return_tensors='pt'
        )
        # Append the tokenized input and attention mask to their respective lists
        input_ids.append(encoded_dict['input_ids'])
        attention_masks.append(encoded_dict['attention_mask'])
    # Concatenate the tokenized inputs and attention masks into tensors
    input_ids = torch.cat(input_ids, dim=0)
    attention_masks = torch.cat(attention_masks, dim=0)
    # Convert the labels to a PyTorch tensor with the data type float32
    labels = torch.tensor(labels, dtype=torch.float32)
    # Return the tokenized inputs, attention masks, and labels as PyTorch tensors
    return input_ids, attention_masks, labels
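To see what padding, truncation and the attention mask mean in practice, here is a minimal pure-Python sketch that works on ready-made token ids. It only imitates the idea and is not how the tokenizer is implemented internally; `pad_or_truncate` is a hypothetical helper, and the ids 101, 102 and 0 are the [CLS], [SEP] and [PAD] ids that appear in the outputs of this article.

```python
CLS_ID, SEP_ID, PAD_ID = 101, 102, 0  # special token ids as seen in this article's outputs


def pad_or_truncate(token_ids, max_length):
    """Wrap token ids in [CLS]/[SEP], then truncate or pad them to max_length."""
    # Reserve two positions for the special tokens and cut off anything beyond that
    body = token_ids[: max_length - 2]
    ids = [CLS_ID] + body + [SEP_ID]
    # Real tokens get a 1 in the attention mask
    attention_mask = [1] * len(ids)
    # Fill the remaining positions with padding, marked by 0 in the mask
    n_pad = max_length - len(ids)
    ids = ids + [PAD_ID] * n_pad
    attention_mask = attention_mask + [0] * n_pad
    return ids, attention_mask


short_ids, short_mask = pad_or_truncate([1045, 2031, 5493], max_length=8)
long_ids, long_mask = pad_or_truncate(list(range(1000, 1010)), max_length=8)
print(short_ids, short_mask)
print(long_ids, long_mask)
```

Every sequence comes out with the same length, which is what allows the encoded comments to be stacked into a single tensor, while the attention mask tells the model which positions hold real tokens.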
Initialize Tokenizer and Model
Now, we will initialize the BERT tokenizer with the 'bert-base-uncased' model
# Tokenizer Initialization
tokenizer = BertTokenizer.from_pretrained('bert-base-uncased',
do_lower_case=True)
Initialize BERT classification Model
After this step, we will initialize the BERT model for sequence classification
# Model Initialization
model = BertForSequenceClassification.from_pretrained('bert-base-uncased',
num_labels=6)
As an additional step for faster processing, we move the model to the GPU if one is available, and otherwise keep it on the CPU.
# Move model to GPU if available
device = torch.device(
'cuda') if torch.cuda.is_available() else torch.device('cpu')
model = model.to(device)
Apply Tokenization and Encoding
Tokenize and encode the comments and labels of the training, test and validation sets
# Tokenize and Encode the comments and labels for the training set
input_ids, attention_masks, labels = tokenize_and_encode(
tokenizer,
train_texts,
train_labels.values
)
# Tokenize and Encode the comments and labels for the test set
test_input_ids, test_attention_masks, test_labels = tokenize_and_encode(
tokenizer,
test_texts,
test_labels.values
)
# Tokenize and Encode the comments and labels for the validation set
val_input_ids, val_attention_masks, val_labels = tokenize_and_encode(
tokenizer,
val_texts,
val_labels.values
)
print('Training Comments :',train_texts.shape)
print('Input Ids :',input_ids.shape)
print('Attention Mask :',attention_masks.shape)
print('Labels :',labels.shape)
Output:
Training Comments : (22715,)
Input Ids : torch.Size([22715, 128])
Attention Mask : torch.Size([22715, 128])
Labels : torch.Size([22715, 6])
Let's check an encoded text with the corresponding text and labels
k = 53
print('Training Comments -->>',train_texts.values[k])
print('\nInput Ids -->>\n',input_ids[k])
print('\nDecoded Ids -->>\n',tokenizer.decode(input_ids[k]))
print('\nAttention Mask -->>\n',attention_masks[k])
print('\nLabels -->>',labels[k])
Output:
Training Comments -->> I have edited the text and wrote with neutral information. Please suggest what went wrong.
Input Ids -->>
tensor([ 101, 1045, 2031, 5493, 1996, 3793, 1998, 2626, 2007, 8699, 2592, 1012,
3531, 6592, 2054, 2253, 3308, 1012, 102, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0])
Decoded Ids -->>
[CLS] i have edited the text and wrote with neutral information. please suggest what went wrong. [SEP]
[PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD]
[PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD]
[PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD]
[PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD]
[PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD]
[PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD]
[PAD] [PAD]
Attention Mask -->>
tensor([1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0])
Labels -->> tensor([0., 0., 0., 0., 0., 0.])
Creating PyTorch Data Loaders
Now, we will create data loaders to efficiently load the data during training, testing, and validation. The data loaders batch the input data and handle shuffling for the training data.
# Creating DataLoader for the balanced dataset
batch_size = 32
train_dataset = TensorDataset(input_ids, attention_masks, labels)
train_loader = DataLoader(train_dataset, batch_size=batch_size, shuffle=True)
# testing set
test_dataset = TensorDataset(test_input_ids, test_attention_masks, test_labels)
test_loader = DataLoader(test_dataset, batch_size=batch_size, shuffle=False)
# validation set
val_dataset = TensorDataset(val_input_ids, val_attention_masks, val_labels)
val_loader = DataLoader(val_dataset, batch_size=batch_size, shuffle=False)
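What a data loader does with `batch_size` and `shuffle` can be sketched in a few lines of plain Python. This is only an illustration of the idea, not PyTorch's implementation; `iter_batches` is a hypothetical helper, and the figure 22,715 is the number of training comments reported above.

```python
import math
import random


def iter_batches(samples, batch_size, shuffle=False, seed=None):
    """Yield consecutive batches of samples, optionally in shuffled order."""
    indices = list(range(len(samples)))
    if shuffle:
        # Shuffle the order in which samples are visited, not the samples themselves
        random.Random(seed).shuffle(indices)
    for start in range(0, len(indices), batch_size):
        yield [samples[i] for i in indices[start:start + batch_size]]


samples = list(range(10))
ordered = list(iter_batches(samples, batch_size=4))
shuffled = list(iter_batches(samples, batch_size=4, shuffle=True, seed=0))
print(ordered)
print(shuffled)

# Number of batches per epoch for 22,715 training comments and a batch size of 32
batches_per_epoch = math.ceil(22715 / 32)
print(batches_per_epoch)
```

The last batch is smaller than the others whenever the dataset size is not a multiple of the batch size. Shuffling is applied to the training data only, so that validation and test results are computed in a fixed order.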
Let's check the train_loader data
print('Batch Size :',train_loader.batch_size)
Batch =next(iter(train_loader))
print('Each Input ids shape :',Batch[0].shape)
print('Input ids :\n',Batch[0][0])
print('Corresponding Decoded text:\n',tokenizer.decode(Batch[0][0]))
print('Corresponding Attention Mask :\n',Batch[1][0])
print('Corresponding Label:',Batch[2][0])
Output:
Batch Size : 32
Each Input ids shape : torch.Size([32, 128])
Input ids :
tensor([ 101, 2175, 3280, 1999, 1037, 2543, 1012, 1045, 2123, 2102,
2228, 3087, 2106, 2062, 4053, 2000, 16948, 2059, 2017, 1999,
1996, 2197, 2048, 2086, 1012, 9119, 1010, 3246, 2017, 2123,
2102, 2272, 2067, 2007, 1037, 28407, 13997, 1006, 2029, 2017,
2471, 5121, 2097, 999, 999, 999, 1007, 6109, 1012, 6564,
1012, 2382, 1012, 19955, 102, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0])
Corresponding Decoded text:
[CLS] go die in a fire. i dont think anyone did more damage to wikipedia then you in the last two years. goodbye,
hope you dont come back with a sock puppet ( which you almost certainly will!!! ) 93. 86. 30. 194 [SEP]
[PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD]
[PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD]
[PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD]
[PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD]
[PAD] [PAD] [PAD] [PAD] [PAD]
Corresponding Attention Mask :
tensor([1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
1, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0])
Corresponding Label: tensor([1., 0., 1., 0., 1., 0.])
Initialize the optimizer for training the model
AdamW optimizer: We use the AdamW optimizer, a variant of Adam (Adaptive Moment Estimation) that applies weight decay separately from the gradient-based update. Adam combines the advantages of two other optimization strategies, RMSprop (Root Mean Square Propagation) and AdaGrad (Adaptive Gradient Algorithm).
For each model parameter, it maintains moving averages of the gradient and the squared gradient, which are used to adapt the learning rate of each parameter during training.
# Optimizer setup
optimizer = AdamW(model.parameters(), lr=2e-5)
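The moving averages mentioned above can be illustrated with a single update step for one scalar parameter. The following is a minimal pure-Python sketch, not the optimizer's actual implementation; `adamw_step` is a hypothetical helper, and the values of `beta1`, `beta2`, `eps` and `weight_decay` are commonly used defaults that we assume here.

```python
import math


def adamw_step(param, grad, m, v, t, lr=2e-5, beta1=0.9, beta2=0.999,
               eps=1e-8, weight_decay=0.01):
    """Apply one AdamW update to a scalar parameter and return the new state."""
    # Decoupled weight decay: shrink the parameter directly, independent of the gradient
    param = param * (1 - lr * weight_decay)
    # Moving average of the gradient (first moment)
    m = beta1 * m + (1 - beta1) * grad
    # Moving average of the squared gradient (second moment)
    v = beta2 * v + (1 - beta2) * grad * grad
    # Bias correction, which matters most during the first steps (t starts at 1)
    m_hat = m / (1 - beta1 ** t)
    v_hat = v / (1 - beta2 ** t)
    # Step size is scaled per parameter by the root of the second moment
    param = param - lr * m_hat / (math.sqrt(v_hat) + eps)
    return param, m, v


p_plain, m1, v1 = adamw_step(1.0, grad=0.5, m=0.0, v=0.0, t=1, weight_decay=0.0)
p_decay, _, _ = adamw_step(1.0, grad=0.5, m=0.0, v=0.0, t=1, weight_decay=0.01)
print(p_plain, p_decay)
```

On the very first step the bias-corrected ratio is close to 1 in magnitude, so the parameter moves by roughly the learning rate regardless of how large the gradient is; weight decay then pulls the parameter slightly further towards zero.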
Model Training
# Function to Train the Model
def train_model(model, train_loader, val_loader, optimizer, device, num_epochs):
    # Loop through the specified number of epochs
    for epoch in range(num_epochs):
        # Set the model to training mode
        model.train()
        # Initialize total loss for the current epoch
        total_loss = 0
        # Loop through the batches in the training data
        for batch in train_loader:
            input_ids, attention_mask, labels = [t.to(device) for t in batch]
            optimizer.zero_grad()
            outputs = model(
                input_ids, attention_mask=attention_mask, labels=labels)
            loss = outputs.loss
            total_loss += loss.item()
            loss.backward()
            optimizer.step()
        model.eval()  # Set the model to evaluation mode
        val_loss = 0
        # Disable gradient computation during validation
        with torch.no_grad():
            for batch in val_loader:
                input_ids, attention_mask, labels = [
                    t.to(device) for t in batch]
                outputs = model(
                    input_ids, attention_mask=attention_mask, labels=labels)
                loss = outputs.loss
                val_loss += loss.item()
        # Print the average training and validation loss for the current epoch
        print(
            f'Epoch {epoch+1}, Training Loss: {total_loss/len(train_loader)},Validation loss:{val_loss/len(val_loader)}')
# Call the function to train the model
train_model(model, train_loader, val_loader, optimizer, device, num_epochs=3)
Output:
Epoch 1, Training Loss: 0.20543626952968852,Validation loss:0.1643741050479459
Epoch 2, Training Loss: 0.13793433358971502,Validation loss:0.14861836971021167
Epoch 3, Training Loss: 0.11418234390587034,Validation loss:0.1539663544862099
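The loss values printed above are computed by the model itself. To our understanding of the transformers library, when `num_labels` is greater than 1 and the labels are floating-point tensors, `BertForSequenceClassification` treats the task as multilabel and applies binary cross-entropy to the logits. The following is a minimal pure-Python sketch of that loss for a single comment; `bce_with_logits` is a hypothetical helper name, and the logits are made up for illustration.

```python
import math


def bce_with_logits(logits, targets):
    """Mean binary cross-entropy over all labels, computed from raw logits."""
    total = 0.0
    for z, y in zip(logits, targets):
        # Numerically stable form of -[y*log(sigmoid(z)) + (1-y)*log(1-sigmoid(z))]
        total += max(z, 0.0) - z * y + math.log1p(math.exp(-abs(z)))
    return total / len(logits)


targets = [1.0, 0.0, 1.0, 0.0, 1.0, 0.0]
# Logits of 0 mean the model is undecided (probability 0.5) on every label
undecided = bce_with_logits([0.0] * 6, targets)
# Large logits with the correct sign mean confident, correct predictions
confident = bce_with_logits([10.0, -10.0, 10.0, -10.0, 10.0, -10.0], targets)
print(undecided, confident)
```

An undecided model scores ln 2, about 0.693, so this value is a useful reference point when reading per-epoch losses.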
Model Evaluation
Let's evaluate the model now
# Evaluate the Model
def evaluate_model(model, test_loader, device):
    model.eval()  # Set the model to evaluation mode
    true_labels = []
    predicted_probs = []
    with torch.no_grad():
        for batch in test_loader:
            input_ids, attention_mask, labels = [t.to(device) for t in batch]
            # Get model's predictions
            outputs = model(input_ids, attention_mask=attention_mask)
            # Use sigmoid for multilabel classification
            predicted_probs_batch = torch.sigmoid(outputs.logits)
            predicted_probs.append(predicted_probs_batch.cpu().numpy())
            true_labels_batch = labels.cpu().numpy()
            true_labels.append(true_labels_batch)
    # Combine predictions and labels for evaluation
    true_labels = np.concatenate(true_labels, axis=0)
    predicted_probs = np.concatenate(predicted_probs, axis=0)
    # Apply a 0.5 threshold to each label independently
    predicted_labels = (predicted_probs > 0.5).astype(int)
    # Calculate evaluation metrics
    accuracy = accuracy_score(true_labels, predicted_labels)
    precision = precision_score(true_labels, predicted_labels, average='micro')
    recall = recall_score(true_labels, predicted_labels, average='micro')
    # Print the evaluation metrics
    print(f'Accuracy: {accuracy:.4f}')
    print(f'Precision: {precision:.4f}')
    print(f'Recall: {recall:.4f}')
# Call the function to evaluate the model on the test data
evaluate_model(model, test_loader, device)
Output:
Accuracy: 0.7099
Precision: 0.8059
Recall: 0.8691
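These metrics summarize how the model performs on the test set. Note that for multilabel targets, accuracy_score reports subset accuracy: a comment counts as correct only if all six of its labels are predicted correctly. Precision and recall are micro-averaged, which means they are computed over all individual label decisions pooled together.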
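The way these three numbers are obtained can be reproduced by hand on a toy example. The sketch below uses plain Python instead of scikit-learn; the helper names and the three-label toy data are assumptions made up for illustration.

```python
def apply_threshold(probs, cutoff=0.5):
    """Turn per-label probabilities into 0/1 predictions."""
    return [[1 if p > cutoff else 0 for p in row] for row in probs]


def subset_accuracy(y_true, y_pred):
    """Share of samples whose whole label vector is predicted exactly."""
    exact = sum(1 for t_row, p_row in zip(y_true, y_pred) if t_row == p_row)
    return exact / len(y_true)


def micro_precision_recall(y_true, y_pred):
    """Precision and recall over all label decisions pooled together."""
    tp = fp = fn = 0
    for t_row, p_row in zip(y_true, y_pred):
        for t, p in zip(t_row, p_row):
            if t == 1 and p == 1:
                tp += 1
            elif t == 0 and p == 1:
                fp += 1
            elif t == 1 and p == 0:
                fn += 1
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


y_true = [[1, 0, 1], [0, 0, 0], [1, 1, 0]]
probs = [[0.9, 0.2, 0.7], [0.1, 0.6, 0.3], [0.8, 0.4, 0.1]]
y_pred = apply_threshold(probs)
accuracy = subset_accuracy(y_true, y_pred)
precision, recall = micro_precision_recall(y_true, y_pred)
print(y_pred)
print(accuracy, precision, recall)
```

In this toy data only one of the three comments is matched exactly, even though most individual label decisions are right. This is why subset accuracy tends to be lower than micro-averaged precision and recall.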
Save the Model
# Save the tokenizer and model in the same directory
output_dir = "Saved_model"
# Save model's state dictionary and configuration
model.save_pretrained(output_dir)
# Save tokenizer's configuration and vocabulary
tokenizer.save_pretrained(output_dir)
Now, load the model
Load the Model
# Load the tokenizer and model from the saved directory
model_name = "Saved_model"
Bert_Tokenizer = BertTokenizer.from_pretrained(model_name)
Bert_Model = BertForSequenceClassification.from_pretrained(
model_name).to(device)
Now, comes the interesting part!
Prediction
Let's predict the labels for user input
def predict_user_input(input_text, model=Bert_Model, tokenizer=Bert_Tokenizer, device=device):
    user_input = [input_text]
    user_encodings = tokenizer(
        user_input, truncation=True, padding=True, return_tensors="pt")
    user_dataset = TensorDataset(
        user_encodings['input_ids'], user_encodings['attention_mask'])
    user_loader = DataLoader(user_dataset, batch_size=1, shuffle=False)
    model.eval()
    with torch.no_grad():
        for batch in user_loader:
            input_ids, attention_mask = [t.to(device) for t in batch]
            outputs = model(input_ids, attention_mask=attention_mask)
            logits = outputs.logits
            predictions = torch.sigmoid(logits)
    predicted_labels = (predictions.cpu().numpy() > 0.5).astype(int)
    labels_list = ['toxic', 'severe_toxic', 'obscene',
                   'threat', 'insult', 'identity_hate']
    result = dict(zip(labels_list, predicted_labels[0]))
    return result
text = 'Are you insane!'
predict_user_input(input_text=text)
Output:
{'toxic': 1,
'severe_toxic': 0,
'obscene': 0,
'threat': 0,
'insult': 0,
'identity_hate': 0}
We can observe that the comment 'Are you insane!' is a toxic comment.
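The last step of the prediction function, turning logits into a dictionary of labels, can be sketched without PyTorch. The logits below are made-up values for illustration and are not actual model outputs; `logits_to_labels` is a hypothetical helper.

```python
import math

LABELS = ['toxic', 'severe_toxic', 'obscene', 'threat', 'insult', 'identity_hate']


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def logits_to_labels(logits, names=LABELS, cutoff=0.5):
    """Map one logit per label to a 0/1 decision using a probability cutoff."""
    return {name: int(sigmoid(z) > cutoff) for name, z in zip(names, logits)}


example_logits = [2.1, -3.0, -0.4, -4.2, -0.1, -3.5]  # illustrative values only
result = logits_to_labels(example_logits)
print(result)
```

Because the sigmoid of 0 is exactly 0.5, a cutoff of 0.5 on the probability is the same as asking whether the logit is positive. Each label is decided independently of the others, so any combination of labels can be switched on.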
Let's check for more inputs
predict_user_input(input_text='How are you?')
Output:
{'toxic': 0,
'severe_toxic': 0,
'obscene': 0,
'threat': 0,
'insult': 0,
'identity_hate': 0}
As expected, the comment 'How are you?' is not toxic, so all the label values are 0
text = "Such an Idiot person"
predict_user_input(model=Bert_Model,
tokenizer=Bert_Tokenizer,
input_text=text,
device=device)
Output:
{'toxic': 1,
'severe_toxic': 0,
'obscene': 1,
'threat': 0,
'insult': 1,
'identity_hate': 0}
As we can see, the comment "Such an Idiot person" is flagged as toxic, obscene and insult, which is right. It is definitely not a threat or identity hate, so those values come out to be 0.