SPSS Statistics


K-means Clustering: "Not enough cases to perform cluster analysis"

  • 1.  K-means Clustering: "Not enough cases to perform cluster analysis"

    Posted Thu September 03, 2020 08:39 PM

    Hello,

    As the title says, I get the error message "Not enough cases to perform cluster analysis" when I run K-Means Clustering with all the variables (columns) included.

    I will try to provide all the information you might need to help me. Let me know if you need anything else.

    - The dataset comes from a survey with 409 participants. The survey had 19 questions (many of them multiple response), giving a total of 58 columns.

    - The data loaded into SPSS was coded as numbers (I don't know the correct word, I think it's a Likert scale), e.g. 1 for "unemployed" ... 6 for "self-employed", and so on for all the columns, with value labels in the variable definitions indicating what each number means.

    - Since there are a lot of multiple response questions, I have a lot of empty cells, or "missing values". I tried replacing these empty cells with "0" or "9" and declaring those numbers as missing values in the program, but it didn't change anything.

    - I used the option to create the multiple response sets that were necessary. Also, all the variables are nominal.

    - I tried to find a similar post that could have helped me, but I didn't find anything.

    - Sorry if I made too many grammar mistakes, English is not my first language.






    #SPSSStatistics
    #Support
    #SupportMigration


  • 2.  RE: K-means Clustering: "Not enough cases to perform cluster analysis"
    Best Answer

    Posted Fri September 04, 2020 01:53 AM

    Thanks for the details. The problem is the missing values. By default, the procedure excludes any case where at least one variable in the specification is missing.

    In the Options subdialog you can choose to exclude cases pairwise instead of listwise. However, the procedure does not use multiple response sets. If you have a multiple dichotomy set, you can use its elementary variables as input, but if it is a multiple category set, the unused response slots are missing values, and that won't work.
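
    To see why listwise exclusion can leave almost no cases, here is a minimal Python sketch (not SPSS syntax). The rows and the variable names `job`, `hobby_1`, and `hobby_2` are made up for illustration, with `None` standing in for an empty cell:

```python
# Hypothetical survey rows; None marks an unanswered response slot.
rows = [
    {"job": 1, "hobby_1": 2, "hobby_2": None},
    {"job": 6, "hobby_1": 1, "hobby_2": 3},
    {"job": 2, "hobby_1": None, "hobby_2": None},
]


def complete_cases(rows, variables):
    """Listwise exclusion: keep only rows with no missing value in `variables`."""
    return [r for r in rows if all(r[v] is not None for v in variables)]


variables = ["job", "hobby_1", "hobby_2"]
kept = complete_cases(rows, variables)
print(len(kept), "of", len(rows), "cases survive listwise exclusion")
```

    With many multiple response columns, nearly every respondent leaves at least one slot empty, so very few complete cases remain.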

    However, k-means is not well suited to that type of data. You might want to try TwoStep Cluster instead, or consider hierarchical clustering or maybe nearest neighbor, depending on your purpose. Hierarchical clustering gives you a lot more choices for the distance measure.








  • 3.  RE: K-means Clustering: "Not enough cases to perform cluster analysis"
    Best Answer

    Posted Fri September 04, 2020 12:42 PM

    I meant to mention that if you have a multiple category set, you can convert it to a multiple dichotomy set using the STATS MCSET CONVERT extension command, which you can install from the Extensions > Extension Hub menu. This command creates a set of dummy variables for the categories, so you can then use those variables in your clustering without the issue of missing values.
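
    As a rough illustration of the idea (this is a Python sketch, not the extension's actual implementation), converting a multiple category set into dichotomies could look like the following. The function name `to_dichotomies`, the slot variables `hobby_1` and `hobby_2`, and the output names are all hypothetical:

```python
# Hypothetical multiple category set: each slot holds a category code or None.
rows = [
    {"hobby_1": 2, "hobby_2": None},
    {"hobby_1": 1, "hobby_2": 3},
    {"hobby_1": None, "hobby_2": None},
]


def to_dichotomies(rows, slot_vars, prefix):
    """Build one 0/1 variable per category code found in the slot variables."""
    categories = sorted(
        {r[v] for r in rows for v in slot_vars if r[v] is not None}
    )
    result = []
    for r in rows:
        chosen = {r[v] for v in slot_vars if r[v] is not None}
        result.append({f"{prefix}{c}": int(c in chosen) for c in categories})
    return result


dummies = to_dichotomies(rows, ["hobby_1", "hobby_2"], prefix="hobby_cat")
for d in dummies:
    print(d)
```

    Every case now has a 0 or 1 for every category, so no case is dropped for missing values. Note that a respondent who skipped the question entirely ends up with all zeros.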








  • 4.  RE: K-means Clustering: "Not enough cases to perform cluster analysis"
    Best Answer

    Posted Sun September 06, 2020 10:18 PM

    Jon, thank you so much! The extension you mentioned worked perfectly. For some reason I still have to use K-Means clustering, because with TwoStep or hierarchical clustering I get too many lost values. But I'm fine with using K-Means clustering.

    Again, Jon, I'm really grateful for your fast and helpful answer.

    Cheers.






