SPSS Statistics

  • 1.  Cluster Analysis K-Means

    Posted Tue October 27, 2020 08:16 PM

    When I create clusters with K-Means using the same clustering variables on two identical copies of the data, I get different cluster sizes. My syntax is the same in both runs, so I cannot reproduce my results across identical data sets.






    #SPSSStatistics
    #Support
    #SupportMigration


  • 2.  RE: Cluster Analysis K-Means

    Posted Tue October 27, 2020 09:38 PM

    K-Means is sensitive to the starting values for the cluster centers and even to the order of the cases. It is often suggested to run it several times with different starting values to find stable clusters. If you save the cluster centers from a run, you can use them as the starting centers for a subsequent run, or supply your own initial centers in a dataset.
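
    To illustrate the sensitivity to starting values, here is a minimal Python sketch. It is a plain Lloyd-style k-means written for this example, not the algorithm SPSS uses, and the function names (kmeans, cluster_sizes) and the toy data are made up for illustration. The same nine cases are clustered twice, and only the starting centers differ.

```python
from math import dist

def kmeans(points, centers, max_iter=100):
    """Plain Lloyd-style k-means from explicit starting centers (illustrative only)."""
    centers = [tuple(c) for c in centers]
    labels = None
    for _ in range(max_iter):
        # Assign every case to its nearest current center.
        new_labels = [
            min(range(len(centers)), key=lambda j: dist(p, centers[j]))
            for p in points
        ]
        if new_labels == labels:  # assignments stopped changing: converged
            break
        labels = new_labels
        # Move each center to the mean of the cases assigned to it.
        for j in range(len(centers)):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # an empty cluster keeps its previous center
                centers[j] = tuple(sum(col) / len(members) for col in zip(*members))
    return labels, centers

def cluster_sizes(labels, k):
    """Count how many cases landed in each of the k clusters."""
    return [labels.count(j) for j in range(k)]

# Hypothetical toy data: three obvious groups on a single variable.
points = [(x,) for x in (0, 1, 2, 10, 11, 12, 20, 21, 22)]

# Same data, same k, two different sets of starting centers.
labels_a, centers_a = kmeans(points, [(0,), (10,), (20,)])
labels_b, centers_b = kmeans(points, [(0,), (1,), (2,)])

print(cluster_sizes(labels_a, 3))  # [3, 3, 3]
print(cluster_sizes(labels_b, 3))  # [1, 2, 6]
```

    Both runs converge, but the second one settles into a poorer local solution with very different cluster sizes. That is why fixing or saving the initial centers matters for reproducibility.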








  • 3.  RE: Cluster Analysis K-Means

    Posted Sat October 31, 2020 11:01 PM

    Thanks for your response. I reordered my data in various ways and found that the clusters differed, sometimes substantially.

    You suggested using different starting values to find stable clusters. What defines a stable cluster?








  • 4.  RE: Cluster Analysis K-Means

    Posted Sat October 31, 2020 11:17 PM

    Cluster analysis is an ad hoc sort of method, so there is no definite rule about what is best in terms of centers and number of clusters, but here are some possibilities.

    • Across different starting values, look for some consensus in cluster sizes and means
    • Select initial cluster centers by using Ward's method, which is available in the hierarchical cluster procedure
    • Look at cluster silhouettes to see whether the next-best clusters make sense and how well separated the clusters are. You can get these from the STATS CLUS SIL extension command, which you can install via the Extensions > Extension Hub menu. See https://en.wikipedia.org/wiki/Silhouette_(clustering) for how to use silhouettes
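
    As a rough illustration of what a silhouette measures, here is a minimal Python sketch of the standard definition. It is not the STATS CLUS SIL implementation. The function name silhouette_values, the toy data, and the two hand-written labelings are assumptions made for the example. For each case, a is its mean distance to the other cases in its own cluster, b is its mean distance to the cases in the nearest other cluster, and the silhouette is (b - a) / max(a, b).

```python
from math import dist

def silhouette_values(points, labels):
    """Per-case silhouette values; assumes at least two clusters (illustrative only)."""
    clusters = {}
    for p, lab in zip(points, labels):
        clusters.setdefault(lab, []).append(p)

    values = []
    for p, lab in zip(points, labels):
        own = clusters[lab]
        if len(own) == 1:
            values.append(0.0)  # convention: a single-case cluster scores 0
            continue
        # a: mean distance to the other cases in the same cluster
        # (the case's distance to itself is 0, so divide by len(own) - 1).
        a = sum(dist(p, q) for q in own) / (len(own) - 1)
        # b: mean distance to the nearest other ("next-best") cluster.
        b = min(
            sum(dist(p, q) for q in other) / len(other)
            for other_lab, other in clusters.items()
            if other_lab != lab
        )
        denom = max(a, b)
        values.append((b - a) / denom if denom > 0 else 0.0)
    return values

# Hypothetical toy data and two candidate solutions for the same cases.
points = [(x,) for x in (0, 1, 2, 10, 11, 12, 20, 21, 22)]
balanced = [0, 0, 0, 1, 1, 1, 2, 2, 2]
lopsided = [0, 1, 1, 2, 2, 2, 2, 2, 2]

sil_balanced = silhouette_values(points, balanced)
sil_lopsided = silhouette_values(points, lopsided)

mean_balanced = sum(sil_balanced) / len(sil_balanced)
mean_lopsided = sum(sil_lopsided) / len(sil_lopsided)
print(round(mean_balanced, 3), round(mean_lopsided, 3))
```

    Values near 1 indicate cases that sit well inside their cluster, and values near 0 or below indicate cases that fit a neighboring cluster about as well. Comparing the mean silhouette across runs with different starting values is one way to judge which solution is better separated.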






