Tutorial For K Means Clustering in Python Sklearn - MLK - Machine Learning Knowledge-3

This document is a tutorial on using K-Means clustering in Python's Sklearn library. It demonstrates finding the optimal number of clusters (K) through the elbow method by plotting the within-cluster sum of squared errors against K values ranging from 2 to 11. The elbow method suggests 5 or 6 clusters. It also calculates silhouette scores for the same K values: the score is highest at K=2 and, among the larger values, at K=5, with K=6 nearly tied, which is consistent with 5 or 6 clusters. The tutorial uses principal component analysis to reduce the dimensionality before applying K-Means clustering.

3/12/24, 2:43 PM Tutorial for K Means Clustering in Python Sklearn - MLK - Machine Learning Knowledge

   principal component 1  principal component 2
0              -0.192221               0.319683
1              -0.458175              -0.018152
2               0.052562               0.551854
3              -0.402357              -0.014239
4              -0.031648               0.155578
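
The two columns above are the output of a principal component analysis. As a minimal sketch of how such a frame could be produced, the example below projects a synthetic four-feature matrix onto two components. The synthetic data, the variable name `features`, and the choice of four input columns are assumptions for illustration only; the tutorial's actual input data is not shown in this excerpt.

```python
import numpy as np
import pandas as pd
from sklearn.decomposition import PCA

# Synthetic stand-in for the tutorial's feature matrix (an assumption;
# the real input data is not part of this excerpt).
rng = np.random.default_rng(0)
features = rng.normal(size=(200, 4))

# Project the data onto its first two principal components.
pca = PCA(n_components=2)
components = pca.fit_transform(features)

# Use the same column names as the table above.
pca_df = pd.DataFrame(
    components,
    columns=["principal component 1", "principal component 2"],
)

print(pca_df.head())
print("explained variance ratio:", pca.explained_variance_ratio_)
```

The explained variance ratio shows how much of the original spread the two components retain, which is worth checking before clustering on the reduced data.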

Finding Optimum Value of K

i) Elbow Method with Within-Cluster-Sum of Squared Error (WCSS)
Let us again use the elbow method with Within-Cluster-Sum of Squared Error (WCSS) to determine the optimum value of K. The resulting graph shows a bend between K=5 and K=6.
In [16]:

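import sklearn.cluster as cluster

K=range(2,12)
wss = []
for k in K:
    # Fit K-Means on the PCA-reduced data and record its inertia (WCSS)
    kmeans=cluster.KMeans(n_clusters=k)
    kmeans=kmeans.fit(pca_df)
    wss_iter = kmeans.inertia_
    wss.append(wss_iter)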
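
The `inertia_` value collected in the loop is the WCSS itself: the sum of squared Euclidean distances from each point to the centroid of its assigned cluster. The sketch below recomputes that sum by hand and compares it with `inertia_`. It uses synthetic blobs from `make_blobs` in place of `pca_df`, and the settings `n_clusters=4`, `n_init=10`, and `random_state=0` are assumptions chosen for illustration.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs

# Synthetic 2-D data standing in for pca_df (an assumption).
X, _ = make_blobs(n_samples=300, centers=4, random_state=0)

kmeans = KMeans(n_clusters=4, n_init=10, random_state=0).fit(X)

# WCSS by hand: squared distance from every point to the centroid
# of the cluster it was assigned to, summed over all points.
assigned_centroids = kmeans.cluster_centers_[kmeans.labels_]
wcss_manual = float(np.sum((X - assigned_centroids) ** 2))

print("manual WCSS:", wcss_manual)
print("inertia_   :", kmeans.inertia_)
```

Because adding clusters can only keep the best achievable WCSS the same or reduce it, the curve tends to fall as K grows; the elbow method looks for the point where the decrease flattens out.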

https://machinelearningknowledge.ai/tutorial-for-k-means-clustering-in-python-sklearn/ 27/35

In [17]:

import matplotlib.pyplot as plt

# Plot WSS against K; the bend ("elbow") suggests a suitable K
plt.xlabel('K')
plt.ylabel('Within-Cluster-Sum of Squared Errors (WSS)')
plt.plot(K,wss)
Out[17]:

···

ii) The Silhouette Method

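Using the Silhouette method, the score is highest overall at K=2 (0.4736). Among the larger values of K, it peaks at K=5 (0.4513), with K=6 almost identical (0.4508). This agrees with the bend between 5 and 6 in the elbow plot, so the dataset can reasonably be segmented into 5 or 6 clusters.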

In [18]:

import sklearn.cluster as cluster
import sklearn.metrics as metrics

for i in range(2,12):
    # Fit K-Means for this K and score the resulting cluster labels
    labels=cluster.KMeans(n_clusters=i,random_state=200).fit(pca_df).labels_
    print ("Silhouette score for k(clusters) = "+str(i)+" is "
           +str(metrics.silhouette_score(pca_df,labels,metric="euclidean",sample_siz

[The last line of this cell is cut off in the source.]

Out[18]:

Silhouette score for k(clusters) = 2 is 0.4736269407502857
Silhouette score for k(clusters) = 3 is 0.44839082753844756
Silhouette score for k(clusters) = 4 is 0.43785291876777566
Silhouette score for k(clusters) = 5 is 0.45130680489606634
Silhouette score for k(clusters) = 6 is 0.4507847568968469
Silhouette score for k(clusters) = 7 is 0.4458795480456887
Silhouette score for k(clusters) = 8 is 0.4132957148795121
Silhouette score for k(clusters) = 9 is 0.4170428610065107
Silhouette score for k(clusters) = 10 is 0.4309783655094101
Silhouette score for k(clusters) = 11 is 0.42535265774570674
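
The printed scores can be ranked programmatically rather than by eye. The sketch below copies the values from Out[18], rounded to four decimals, and reports the best K overall and the best K from 3 upward. Excluding K=2 is a judgment call introduced here for illustration, not a step taken in the tutorial.

```python
# Silhouette scores copied from Out[18], rounded to four decimals.
scores = {
    2: 0.4736, 3: 0.4484, 4: 0.4379, 5: 0.4513, 6: 0.4508,
    7: 0.4459, 8: 0.4133, 9: 0.4170, 10: 0.4310, 11: 0.4254,
}

# K with the highest score across every value tried.
best_overall = max(scores, key=scores.get)

# K with the highest score when K=2 is set aside (an assumption made
# here for illustration).
best_from_3 = max((k for k in scores if k >= 3), key=scores.get)

# How close the two leading candidates from the elbow plot are.
gap_5_6 = round(scores[5] - scores[6], 4)

print("best K overall:", best_overall)
print("best K for K >= 3:", best_from_3)
print("score gap between K=5 and K=6:", gap_5_6)
```

The gap between K=5 and K=6 is about 0.0005, small enough that the Silhouette score alone does not separate them; the elbow plot and the interpretability of the resulting segments have to decide.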
