SPSS Statistics

  • 1.  Merging two datasets generates a blank dataset

    Posted Wed February 21, 2024 09:57 AM

    Hi,

    When I tried to merge dataset A with B using ID as the key (one-to-one merge), it just generated a blank dataset.

    Both datasets A and B also became blank. It says: 

    >Warning # 5132 
    >Duplicate key in a file.  The BY variables do not uniquely identify each case 
    >on the indicated file.  Please check the results carefully. 
    File #2 

    But there are no duplicate IDs in either dataset A or B, and the IDs are sorted.

    I have done this many times and never had such an issue.

    What is going on? Please help.

    YN


     



    ------------------------------
    Nishio Masako
    ------------------------------


  • 2.  RE: Merging two datasets generates a blank dataset

    Posted Wed February 21, 2024 12:27 PM
    There must be a duplicate.  Use Data > Identify Duplicate Cases to see which IDs are duplicated.


  • 3.  RE: Merging two datasets generates a blank dataset

    Posted Thu February 22, 2024 01:16 AM

    After spending a few hours, I figured it out by myself.

    I did run Identify Duplicate Cases first, and it reported no duplicates. But when I ran Descriptives on the key variable, ID, I found that there were 963 missing values!

    What happened was that when I imported the data from Excel into SPSS, many empty rows were added. When the two datasets were merged, all those rows with missing values ended up at the top of the merged data, which is why the dataset looked completely blank.

    The remaining mystery is that the original Excel data does NOT have such empty rows at the bottom. They seem to have been added when the data was imported into SPSS; I do not know why. But once I deleted the 963 empty rows from the SPSS data, I was able to merge the data with no problem.
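
    For anyone hitting the same warning, here is a minimal sketch of the fix described above, written in plain Python rather than SPSS syntax so it can be run anywhere. The sample rows, the column names `age` and `score`, and the helpers `drop_missing_keys`, `index_by_key`, and `merge_one_to_one` are all hypothetical; it only illustrates the logic (rows with a missing ID collide as duplicate keys, and removing them lets a one-to-one merge go through), not how SPSS implements its merge.

```python
# Hypothetical rows as they might look after an import that picked up
# empty rows. None stands in for a missing ID.
dataset_a = [
    {"ID": None, "age": None},
    {"ID": 1, "age": 34},
    {"ID": 2, "age": 51},
    {"ID": None, "age": None},
]
dataset_b = [
    {"ID": 1, "score": 7.5},
    {"ID": None, "score": None},
    {"ID": 2, "score": 6.0},
]


def drop_missing_keys(rows, key="ID"):
    """Keep only the rows whose key value is present."""
    return [row for row in rows if row[key] is not None]


def index_by_key(rows, key, label):
    """Map key -> row, refusing any key that occurs more than once."""
    indexed = {}
    for row in rows:
        if row[key] in indexed:
            raise ValueError(f"duplicate key {row[key]!r} in {label}")
        indexed[row[key]] = row
    return indexed


def merge_one_to_one(left, right, key="ID"):
    """Merge two row lists on a key that must be unique in each list."""
    left_by_key = index_by_key(left, key, "left file")
    right_by_key = index_by_key(right, key, "right file")
    merged = []
    for k in sorted(left_by_key.keys() | right_by_key.keys()):
        # Combine the columns from both sides for this key.
        merged.append({**left_by_key.get(k, {}), **right_by_key.get(k, {})})
    return merged


# Merging the raw lists would raise ValueError, because two rows in
# dataset_a share the missing key. Dropping those rows first works.
clean_a = drop_missing_keys(dataset_a)
clean_b = drop_missing_keys(dataset_b)
merged = merge_one_to_one(clean_a, clean_b)
print(merged)
```

    In this sketch the strict duplicate check plays the role of the warning: the merge only succeeds once the rows with no ID are gone.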

    YN



    ------------------------------
    Nishio Masako
    ------------------------------



  • 4.  RE: Merging two datasets generates a blank dataset

    Posted Thu February 22, 2024 09:40 AM
    Multiple missing values in the ID variable should count as duplicates in Identify Duplicate Cases.

    As for the Excel import, there might have been a cell containing a blank (for example, a space) rather than a truly empty cell, or some other nonempty cell, that caused the extra rows to be imported.
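
    To make both points concrete, here is a small sketch in plain Python (not SPSS syntax). The function names and sample values are made up for illustration, and it shows only the reasoning, not what SPSS or Excel do internally: several missing IDs share one key value, so they count as duplicates of each other, and a row that looks empty can still hold a whitespace-only cell.

```python
from collections import Counter


def duplicated_keys(ids):
    """Return every key value that occurs more than once, with its count.

    None (a missing ID) is counted like any other value.
    """
    counts = Counter(ids)
    return {value: n for value, n in counts.items() if n > 1}


def is_truly_empty(row):
    """True only if every cell is None or a zero-length string."""
    return all(cell is None or cell == "" for cell in row)


def looks_empty(row):
    """True if nothing visible is in the row (None or whitespace-only text)."""
    return all(
        cell is None or (isinstance(cell, str) and cell.strip() == "")
        for cell in row
    )


# Three missing IDs are duplicates of one another.
ids = [101, 102, None, None, None]
print(duplicated_keys(ids))  # {None: 3}

# The second row looks empty but contains a single space, so an importer
# that keeps every nonempty row would keep it.
rows = [[101, "a"], [None, " "], [None, None]]
stray = [i for i, r in enumerate(rows) if looks_empty(r) and not is_truly_empty(r)]
print(stray)  # [1]
```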
