CNN

  • CNNs are deep learning architectures primarily used for processing image data.
  • A special operation known as convolution, applied with learnable filters, helps them extract features like edges and textures.
  • ReLU is applied as an activation function to add non-linearity.
  • Pooling is performed to reduce the spatial dimensions while retaining important information; this lowers the computational load and helps control overfitting.
  • Fully Connected Layer (FCL): after several convolution and pooling operations, the output is passed through an FCL to generate the class probabilities needed for classification. A minimal layer-stack sketch follows this list.
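A minimal sketch of that layer stack, assuming PyTorch; the channel counts, the 28x28 grayscale input, and the 10-class output are illustrative choices, not part of the notes above:

```python
import torch
import torch.nn as nn

# Tiny CNN: convolution -> ReLU -> pooling -> fully connected layer.
model = nn.Sequential(
    nn.Conv2d(1, 8, kernel_size=3, padding=1),  # convolution extracts local features
    nn.ReLU(),                                  # non-linearity
    nn.MaxPool2d(2),                            # pooling: 28x28 -> 14x14
    nn.Flatten(),                               # flatten feature maps for the FCL
    nn.Linear(8 * 14 * 14, 10),                 # FCL producing 10 class scores
)

logits = model(torch.randn(1, 1, 28, 28))       # one dummy grayscale image
print(logits.shape)                             # torch.Size([1, 10])
```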

How CNNs Work:

  • The input image is transformed into a numerical representation, where each pixel is assigned a value based on its intensity.
  • The convolution operation involves sliding the filter across the image and performing element-wise multiplication, followed by summation to create a feature map (see the sketch after this list).
  • As data progresses through multiple layers, CNNs learn increasingly complex features, from simple edges in early layers to intricate shapes in deeper layers.
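A minimal NumPy sketch of that sliding-filter operation, followed by ReLU and 2x2 max pooling; the 6x6 input, the vertical-edge filter, and stride 1 with no padding are assumptions for illustration:

```python
import numpy as np

def convolve2d(image, kernel):
    """Slide the kernel over the image; element-wise multiply and sum -> feature map."""
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    feature_map = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            feature_map[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return feature_map

image = np.random.rand(6, 6)                  # toy 6x6 grid of pixel intensities
edge_filter = np.array([[1, 0, -1],           # a simple vertical-edge filter
                        [1, 0, -1],
                        [1, 0, -1]])

feature_map = convolve2d(image, edge_filter)  # 4x4 feature map
activated = np.maximum(feature_map, 0)        # ReLU adds non-linearity
pooled = activated.reshape(2, 2, 2, 2).max(axis=(1, 3))  # 2x2 max pooling -> 2x2
print(pooled.shape)
```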

Applications:

CNNs are widely used in various fields such as:

  • Image Recognition: Identifying objects in images (e.g., facial recognition).
  • Medical Image Analysis: Analyzing X-rays or MRIs for diagnostic purposes.
  • Autonomous Vehicles: Object detection and scene understanding.

RNN

  • RNNs are a class of neural networks that excel at processing sequential data
  • They maintain an internal state: at time step t, the input x(t) is combined with the hidden state from the previous step, h(t-1), to produce a new hidden state h(t)
  • h(t) = f[ W(h) * h(t−1) + W(x) * x(t) + b ], where W(h) and W(x) are weight matrices, b is the bias term, and f is the activation function
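A minimal NumPy sketch of this recurrence, assuming tanh as the activation f and arbitrary hidden/input sizes:

```python
import numpy as np

hidden_size, input_size = 4, 3
W_h = np.random.randn(hidden_size, hidden_size)  # recurrent weights W(h)
W_x = np.random.randn(hidden_size, input_size)   # input weights W(x)
b = np.zeros(hidden_size)                        # bias term

def rnn_step(h_prev, x_t):
    """h(t) = f(W(h) @ h(t-1) + W(x) @ x(t) + b), with f = tanh."""
    return np.tanh(W_h @ h_prev + W_x @ x_t + b)

h = np.zeros(hidden_size)                        # initial hidden state
sequence = np.random.randn(5, input_size)        # a toy sequence of 5 inputs
for x_t in sequence:                             # the same weights are reused at every step
    h = rnn_step(h, x_t)
print(h)
```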

Applications:

RNNs are commonly used in:

  • Natural Language Processing: Tasks such as language modeling, text generation, and sentiment analysis.
  • Speech Recognition: Processing audio signals to convert speech into text.
  • Time Series Prediction: Forecasting stock prices or weather conditions based on historical data.

Decision Tree

  • Decision tree is a supervised ML algorithm used in classification and regression tasks
  • It models decisions and their possible consequences in the form of a tree-like structure
  • Each internal node represents a feature, each branch represents a decision rule, and each leaf (terminal) node represents the outcome

Building a Decision Tree:

DEINR (pronounced as "Diner"): Data; Entropy; InformationGain; NodeSelection; RecursiveSplitting

  • Data Input: Start with the entire dataset.
  • Entropy Calculation: Calculate the entropy of the target variable and predictor attributes to measure impurity.
  • Information Gain: Determine the information gain for each attribute to identify which feature best splits the data.
  • Node Selection: Choose the attribute with the highest information gain as the root node.
  • Recursive Splitting: Repeat this process recursively for each branch until all branches are finalized or a stopping criterion is met (e.g., maximum depth or minimum samples per leaf). The entropy and information-gain steps are sketched below.
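A minimal NumPy sketch of the entropy and information-gain calculations, using a made-up binary feature and binary target:

```python
import numpy as np

def entropy(labels):
    """Impurity of a label array: -sum(p * log2(p)) over class probabilities."""
    _, counts = np.unique(labels, return_counts=True)
    p = counts / counts.sum()
    return -np.sum(p * np.log2(p))

def information_gain(feature, labels):
    """Entropy of the target minus the weighted entropy after splitting on the feature."""
    total = entropy(labels)
    weighted = sum(
        (labels[feature == v].size / labels.size) * entropy(labels[feature == v])
        for v in np.unique(feature)
    )
    return total - weighted

# Made-up data: does "outlook" (0 = sunny, 1 = rainy) split "play" (0/1) well?
outlook = np.array([0, 0, 1, 1, 1, 0])
play = np.array([0, 0, 1, 1, 1, 1])
print(information_gain(outlook, play))  # the attribute with the highest gain becomes the root
```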

Advantages:

  • Easy to interpret and visualize.
  • Requires little data preprocessing (no need for normalization).
  • Can handle both numerical and categorical data.

Disadvantages:

  • Prone to overfitting, especially with deep trees.
  • Sensitive to small variations in data.

Random Forest

  • Random Forest is an ensemble technique that combines multiple decision trees
  • It mitigates overfitting by averaging the results of many trees, which individually may have high variance

Building a Random Forest:

BTA (pronounced as "beta"): BootstrapSampling; TreeConstruction; Aggregation

  • Bootstrap Sampling: Randomly select subsets of the training data with replacement to create multiple datasets.
  • Tree Construction: For each subset, build a decision tree using a random selection of features at each split.
  • Aggregation: During prediction, aggregate the results from all trees (e.g., majority vote for classification or average for regression). A minimal sketch of these three steps follows.
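A minimal sketch of BTA, assuming scikit-learn's DecisionTreeClassifier as the per-tree learner; the iris dataset, 10 trees, and max_features="sqrt" are illustrative choices:

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)
rng = np.random.default_rng(0)
trees = []

# Bootstrap sampling + tree construction with random feature selection at each split.
for _ in range(10):
    idx = rng.integers(0, len(X), size=len(X))           # sample rows with replacement
    tree = DecisionTreeClassifier(max_features="sqrt")   # random subset of features per split
    tree.fit(X[idx], y[idx])
    trees.append(tree)

# Aggregation: majority vote across the trees' predictions.
votes = np.array([t.predict(X[:5]) for t in trees]).astype(int)  # shape (n_trees, n_samples)
majority = np.apply_along_axis(lambda col: np.bincount(col).argmax(), 0, votes)
print(majority)
```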

Advantages:

  • Reduces overfitting compared to individual decision trees.
  • Handles large datasets with higher dimensionality well.
  • Provides feature importance scores.

Disadvantages:

  • More complex and less interpretable than single decision trees.
  • Requires more computational resources.

Bagging or (B)ootstrap (Agg)regating

  • This is an ensemble technique aimed at improving the accuracy and stability of ML models
  • It is done by combining multiple models trained on different subsets of the training data

How Bagging Works:

  • Multiple Samples: Generate multiple bootstrap samples from the original dataset.
  • Model Training: Train a separate model (e.g., decision tree) on each bootstrap sample.
  • Final Prediction: Aggregate predictions from all models (e.g., majority voting for classification, averaging for regression). See the sketch after this list.
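A minimal sketch assuming scikit-learn's BaggingClassifier, whose default base model is a decision tree; the iris dataset and 10 estimators are illustrative choices:

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import BaggingClassifier
from sklearn.model_selection import train_test_split

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Each base model (a decision tree by default) is trained on its own bootstrap sample;
# predictions are then aggregated by majority vote.
bagger = BaggingClassifier(n_estimators=10, bootstrap=True, random_state=0)
bagger.fit(X_train, y_train)
print(bagger.score(X_test, y_test))
```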

Advantages:

  • Reduces variance and helps prevent overfitting.
  • Improves model robustness against noise in data.

Disadvantages:

  • May not significantly improve performance if base learners are not diverse.

Boosting

  • This is an ensemble technique aimed at improving the accuracy and stability of ML models
  • It is done by combining weak learners (models that perform slightly better than random chance) to create a strong learner
  • The strong learner is built in iterations, with each iteration focusing on misclassified instances.

How Boosting Works:

  • Sequential Learning: Models are trained sequentially, where each new model focuses on correcting errors made by previous models.
  • Weight Adjustment: Misclassified instances are given higher weights so that subsequent models pay more attention to them.
  • Final Prediction: Combine predictions from all models, typically using weighted voting or averaging. A minimal AdaBoost-style sketch follows.
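A minimal AdaBoost-style sketch of these three steps, assuming decision stumps (scikit-learn trees of depth 1) as the weak learners and a synthetic binary dataset:

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=200, random_state=0)
y = 2 * y - 1                                    # relabel classes as -1 / +1
w = np.full(len(X), 1 / len(X))                  # start with uniform instance weights
stumps, alphas = [], []

for _ in range(10):                              # sequential learning
    stump = DecisionTreeClassifier(max_depth=1)
    stump.fit(X, y, sample_weight=w)             # each stump sees the current weights
    pred = stump.predict(X)
    err = w[pred != y].sum()                     # weighted error of this weak learner
    alpha = 0.5 * np.log((1 - err) / (err + 1e-10))
    w *= np.exp(-alpha * y * pred)               # up-weight misclassified instances
    w /= w.sum()
    stumps.append(stump)
    alphas.append(alpha)

# Final prediction: weighted vote of all weak learners.
scores = sum(a * s.predict(X) for a, s in zip(alphas, stumps))
print((np.sign(scores) == y).mean())             # training accuracy of the strong learner
```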

Popular Boosting Algorithms:

  • AdaBoost
  • Gradient Boosting
  • XGBoost

Advantages:

  • Often achieves high accuracy and performs well even with limited data.
  • Can handle various types of data and relationships.

Disadvantages:

  • More prone to overfitting than bagging if not carefully tuned.
  • Requires careful tuning of parameters.