SPSS Statistics

Cluster Analysis K-Means

1. Cluster Analysis K-Means

Community Support Admin
Posted Tue October 27, 2020 08:16 PM

When I create clusters with "K Means" using the same clustering variables and the same syntax on two identical copies of my data, I get different cluster sizes. I cannot reproduce my results across identical data sets.

#SPSSStatistics
#Support
#SupportMigration
2. RE: Cluster Analysis K-Means

Community Support Admin
Posted Tue October 27, 2020 09:38 PM

K-Means is sensitive to the starting values for the cluster centers and even to the order of the cases. A common suggestion is to run it several times with different starting values to find stable clusters. If you save the cluster centers from a run, you can use them as the starting centers for a subsequent run, or you can supply your own initial centers in a dataset.
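
The order sensitivity can be reproduced outside SPSS. The sketch below is a plain batch k-means in Python on made-up one-dimensional data. It assumes the initial centers are simply the first k cases, which is an illustrative choice and not a description of how the SPSS procedure picks its starting centers. The names kmeans_1d and cluster_sizes and all data values are hypothetical.

```python
def kmeans_1d(values, k, initial_centers=None, max_iter=100):
    """Batch k-means on 1-D data.

    Assumption for illustration: if no initial centers are given,
    the first k cases are used, so case order affects the start.
    """
    if initial_centers is None:
        centers = [float(v) for v in values[:k]]
    else:
        centers = [float(c) for c in initial_centers]
    labels = []
    for _ in range(max_iter):
        # Assign each case to its nearest center (ties go to the lower index).
        new_labels = [
            min(range(k), key=lambda j: (v - centers[j]) ** 2) for v in values
        ]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        # Move each center to the mean of its members; keep it if empty.
        for j in range(k):
            members = [v for v, lab in zip(values, labels) if lab == j]
            if members:
                centers[j] = sum(members) / len(members)
    return labels, centers


def cluster_sizes(labels, k):
    """Sorted cluster sizes, so runs can be compared regardless of label order."""
    return sorted(labels.count(j) for j in range(k))


data = [0, 1, 10, 11, 12, 20, 21, 22, 23]   # made-up values
reordered = list(reversed(data))             # same cases, different order

labels_a, _ = kmeans_1d(data, k=2)
labels_b, _ = kmeans_1d(reordered, k=2)
print(cluster_sizes(labels_a, 2))  # [2, 7]
print(cluster_sizes(labels_b, 2))  # [4, 5]

# Supplying the same starting centers removes the dependence on case order.
fixed = [5, 18]
labels_c, _ = kmeans_1d(data, k=2, initial_centers=fixed)
labels_d, _ = kmeans_1d(reordered, k=2, initial_centers=fixed)
print(cluster_sizes(labels_c, 2) == cluster_sizes(labels_d, 2))  # True
```

With these values, the two orderings settle into different partitions even though the cases are identical, while fixed starting centers give the same sizes in either order. That is the motivation for saving the centers from one run and reusing them in the next.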

#SPSSStatistics
#Support
#SupportMigration
3. RE: Cluster Analysis K-Means

Community Support Admin
Posted Sat October 31, 2020 11:01 PM

Thanks for your response. I reordered my data in various ways and found that the clusters differed, sometimes substantially.
You suggested using different starting values to find stable clusters. What defines a stable cluster?

#SPSSStatistics
#Support
#SupportMigration
4. RE: Cluster Analysis K-Means

Community Support Admin
Posted Sat October 31, 2020 11:17 PM

Cluster analysis is an ad hoc sort of method, so there is no definite rule about what is best in terms of centers and number of clusters, but here are some possibilities:
Across different starting values, look for some consensus in cluster sizes and means.
Select initial cluster centers by using Ward's method, which is available in the hierarchical cluster procedure.
Look at cluster silhouettes to see how well separated the clusters are and whether each case's next-best cluster makes sense. You can get these from the STATS CLUS SIL extension command, which you can install via the Extensions > Extension Hub menu. See https://en.wikipedia.org/wiki/Silhouette_(clustering) for how to use silhouettes.
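
To make the silhouette idea concrete, here is a minimal Python sketch of the standard silhouette definition from the Wikipedia page linked above, applied to made-up one-dimensional data. It is not the STATS CLUS SIL implementation, and the function names and data are hypothetical. For each case, a is the mean distance to the other cases in its own cluster, b is the mean distance to the cases in its next-best (nearest other) cluster, and the silhouette is (b - a) / max(a, b).

```python
def silhouette_values(values, labels):
    """Per-case silhouette for 1-D data, using absolute difference as distance.

    Requires at least two clusters. Cases in a singleton cluster get 0,
    following the usual convention.
    """
    clusters = {}
    for v, lab in zip(values, labels):
        clusters.setdefault(lab, []).append(v)

    scores = []
    for v, lab in zip(values, labels):
        own = clusters[lab]
        if len(own) == 1:
            scores.append(0.0)
            continue
        # a: mean distance to the other members of the case's own cluster
        # (the case itself contributes a distance of 0 to the sum).
        a = sum(abs(v - u) for u in own) / (len(own) - 1)
        # b: smallest mean distance to the members of any other cluster.
        b = min(
            sum(abs(v - u) for u in other) / len(other)
            for other_lab, other in clusters.items()
            if other_lab != lab
        )
        denom = max(a, b)
        scores.append((b - a) / denom if denom > 0 else 0.0)
    return scores


def mean_silhouette(values, labels):
    scores = silhouette_values(values, labels)
    return sum(scores) / len(scores)


data = [0, 1, 10, 11]          # made-up values: two obvious groups
solution_a = [0, 0, 1, 1]      # splits the data between 1 and 10
solution_b = [0, 1, 1, 1]      # isolates the first case instead

print(round(mean_silhouette(data, solution_a), 3))  # 0.9
print(round(mean_silhouette(data, solution_b), 3))  # 0.026
```

Values near 1 indicate cases that sit well inside their own cluster, values near 0 indicate cases on a boundary, and negative values indicate cases that are closer to another cluster. Comparing the mean silhouette across runs with different starting centers is one way to choose among competing solutions.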

#SPSSStatistics
#Support
#SupportMigration
