SPSS Statistics



K-means initial centroids

1. K-means initial centroids

Dr. Edward Vieira
Posted Mon September 16, 2024 10:05 AM

I am writing the second edition of my statistics textbook, which features SPSS and is published by Routledge. I have added a cluster analysis chapter and have a question about it. As I understand it, the k-means algorithm in SPSS selects its initial cluster centroids mostly at random, but it also incorporates some internal methods to make the clustering effective. The SPSS documentation and user guides do not specifically describe any additional methods that refine or guide the random selection in the standard implementation. Can you give me specific information about the deterministic component of initial centroid selection? Please let me know if you need any additional information from me. Thanks a bunch! Ed

------------------------------
Dr. Edward Vieira
------------------------------
2. RE: K-means initial centroids

Aruna Saraswathy
Posted Tue September 17, 2024 10:27 AM

Hi Dr. Edward Vieira,

SPSS uses the Maximin distance algorithm for centroid initialization, which helps in placing centroids at maximum distances from each other to improve clustering results.
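As a rough illustration of the maximin idea (placing each new centre as far as possible from the centres already chosen), here is a minimal Python sketch of greedy farthest-first selection. It is not SPSS's implementation: the function name `maximin_init`, the choice to seed with the first case, and the toy data are all assumptions made for the example.

```python
from math import dist

def maximin_init(cases, k):
    """Pick k initial centres by greedy farthest-first (maximin) selection."""
    # Assumption: seed with the first case in the file.
    centres = [cases[0]]
    # nearest[i] = distance from case i to its closest centre chosen so far.
    nearest = [dist(c, centres[0]) for c in cases]
    while len(centres) < k:
        # Take the case whose nearest centre is farthest away.
        idx = max(range(len(cases)), key=nearest.__getitem__)
        centres.append(cases[idx])
        # Update each case's distance to its closest centre.
        nearest = [min(d, dist(c, cases[idx])) for d, c in zip(nearest, cases)]
    return centres

# Toy data: two tight pairs and one isolated case.
cases = [(0.0, 0.0), (0.1, 0.0), (5.0, 5.0), (5.1, 5.0), (10.0, 0.0)]
print(maximin_init(cases, 3))
```

On this toy data the sketch picks one case from each well-separated region rather than two cases from the same tight pair, which is the behaviour the maximin criterion is meant to encourage.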

------------------------------
Aruna Saraswathy
Statistician
SPSS Statistics
IBM
------------------------------

3. RE: K-means initial centroids

Dr. Edward Vieira
Posted Wed September 18, 2024 08:11 AM

Thanks a bunch, Aruna!

------------------------------
Dr. Edward Vieira
------------------------------

4. RE: K-means initial centroids

Kirill Orlov
Posted Wed September 18, 2024 06:42 AM
Edited by Kirill Orlov Wed September 18, 2024 07:11 AM

The SPSS QUICK CLUSTER (k-means) procedure, in its automatic mode, uses the farthest-points running-selection algorithm to produce initial cluster centres. This is what Aruna referred to as the Maximin distance algorithm. The algorithm is deterministic, although it may be sensitive to the order of cases in the data, and it is described in the "SPSS Statistics Algorithms" document.
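To make "deterministic but sensitive to case order" concrete, here is a small Python sketch of a single-pass running selection. It is a simplified stand-in, not the procedure documented in "SPSS Statistics Algorithms": the function name `running_selection`, the single replacement test, and the toy data are assumptions for illustration only.

```python
from math import dist
from itertools import combinations

def running_selection(cases, k):
    """Single pass over the cases, keeping k centres (assumes 2 <= k <= len(cases))."""
    centres = list(cases[:k])  # the first k cases seed the centres
    for x in cases[k:]:
        # Find the closest pair among the current centres.
        m, n = min(combinations(range(k), 2),
                   key=lambda p: dist(centres[p[0]], centres[p[1]]))
        gap = dist(centres[m], centres[n])
        # If x is farther from every centre than that pair is from each other,
        # x replaces whichever member of the pair it is closer to.
        if min(dist(x, c) for c in centres) > gap:
            drop = m if dist(x, centres[m]) < dist(x, centres[n]) else n
            centres[drop] = x
    return centres

cases = [(0.0,), (1.0,), (3.0,), (10.0,)]
forward = running_selection(cases, 2)         # cases in file order
backward = running_selection(cases[::-1], 2)  # same cases, reversed order
print(forward, backward)
```

Running it twice on the same ordering always returns the same centres, since nothing is random. Reversing the case order, however, yields a different pair of centres on this toy data, because the seeds and the sequence of replacement decisions both depend on the order in which cases arrive.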

My macro !KO_KMINI offers six further methods, in addition to this one, for choosing initial cluster centres for k-means. The macro is in the "Clustering" collection on my page "Kirill's SPSS macros".

------------------------------
Kirill Orlov
------------------------------

5. RE: K-means initial centroids

Dr. Edward Vieira
Posted Wed September 18, 2024 08:10 AM

Thank you, Kirill.

Ed

------------------------------
Dr. Edward Vieira
------------------------------

6. RE: K-means initial centroids

Jon Peck

IBM Champion
Posted Wed September 18, 2024 09:20 AM

There are six clustering procedures available in SPSS: four built in and two extensions. And then Kirill's macros. It would be interesting to see a comparative analysis of all of these.

--
Jon K Peck
jkpeck@gmail.com

7. RE: K-means initial centroids

Dr. Edward Vieira
Posted Thu September 19, 2024 06:18 AM

Thank you, Jon. I appreciate it. Ed

------------------------------
Dr. Edward Vieira
------------------------------
