Congratulations, you found an error in the code.
I don't know why I did not see it before...
The line KmeanModel.fit(X) should not be there at all: it refits the model on the unscaled X and discards the fit on X_scaled from the line above.
I updated the notebook and re-ran it. It turns out there is no "elbow".
The distortion seems to be decreasing steadily.
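For reference, a corrected loop might look roughly like the sketch below. It is not the notebook's actual cell: the data is a synthetic stand-in, the use of StandardScaler is assumed, and the distortion is taken to be inertia_ divided by the number of rows, which may differ from the formula in the notebook. The last lines illustrate why the extra fit mattered: calling fit a second time starts over on the new data.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the customer data (assumption, not the notebook's data).
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3)) * np.array([1.0, 50.0, 1000.0])
X_scaled = StandardScaler().fit_transform(X)

distortions = []
K = range(1, 10)
for k in K:
    # Fit once, on the scaled data only. No second fit on X.
    model = KMeans(n_clusters=k, n_init=10, random_state=0).fit(X_scaled)
    # Assumed distortion: mean squared distance to the nearest centroid.
    distortions.append(model.inertia_ / X_scaled.shape[0])

# What the removed line did: a second fit replaces the first one entirely,
# so the centroids end up on the scale of the unscaled X.
refit = KMeans(n_clusters=3, n_init=10, random_state=0).fit(X_scaled).fit(X)
only_x = KMeans(n_clusters=3, n_init=10, random_state=0).fit(X)
same_as_unscaled_fit = np.allclose(refit.cluster_centers_, only_x.cluster_centers_)
```

Plotting distortions against K then gives the elbow chart for the scaled data.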
You can view the updated notebook here:
http://bit.ly/W002-ClusteringCustomersViewNotebook
------------------------------
Jacques Roy
Digital Technical Engagement
Watson Data and AI
Test Drive Our Digital Offerings! ibm.biz/dte-live
Engage DTE at: ibm.biz/dte-request
Byte-size data science channel: youtube.com/c/ByteSizeDataScience
------------------------------
Original Message:
Sent: Mon April 29, 2019 07:35 PM
From: Tom Weichle
Subject: Into Data Science: Understanding K-Means --> Follow-up question
Hi Jacques,
In cell #9 of the Jupyter Notebook for this webinar, I have a question regarding the for loop that is being used to calculate the distortions for the elbow method.
I understand the following line within the for loop where you are fitting the K-Means model on the scaled X values:
KmeanModel = KMeans(n_clusters=k).fit(X_scaled)
However, I don't understand the purpose of the following line in the cell:
KmeanModel.fit(X)
Why are you refitting the model on the original values of X before calculating the distortions?
Thanks!
Tom
------------------------------
Thomas Weichle
------------------------------