SPSS Statistics

computing a variable issue

  • 1.  computing a variable issue

    Posted Wed September 07, 2022 03:55 PM

    I have combined two datasets (2020 and 2021 data).  Each dataset has pre-test scales and post-test scales, which I have given separate names because I want to be able to compare the years.  I am trying to create a combined pre-test variable and a combined post-test variable to run paired t-tests.

     

    First I used this command, simply summing the two individual years' data:

     

    COMPUTE POVSUMPRE20_21=povpresum21 + povpre20sum.

    EXECUTE.

     

    The resulting variable was essentially empty, with all the data missing except, oddly, for one subject with a score of 181, which is higher than is possible on the scale.  There is nothing different about this subject: number 88 of 195.

     

    So I tried listing each scale item individually and summing them to create the variable:

    COMPUTE POVPRESUM2021=povpre2001 + povpre2002 + povpre2003 + povpre2004 + povpre2005 + povpre2006 +
        povpre2007 + povpre2008 + povpre2009 + povpre2010 + povpre2011 + povpre2012 + povpre2013 +
        povpre2014 + povpre2015 + povpre2016 + povpre2017 + povpre2018 + povpre2019 + povpre2020 +
        povpre2021 + Pov1pre21 + Pov2pre21 + Pov3pre21 + Pov4pre21 + Pov5pre21 + Pov6pre21 + Pov7pre21 +
        Pov8pre21 + Pov9pre21 + Pov10pre21 + Pov11pre21 + Pov12pre21 + Pov13pre21 + Pov14pre21 + Pov15pre21
        + Pov16pre21 + Pov17pre21 + Pov18pre21 + Pov19pre21 + Pov20pre21 + Pov21pre21.
    EXECUTE.

     

    The same outcome precisely.  The same subject 88 has a score of 181, and everyone else is missing.

     

     

    POVPRESUM2021

                        Frequency   Percent   Valid Percent   Cumulative Percent
    Valid     181.00            1        .5           100.0                100.0
    Missing   System          194      99.5
    Total                     195     100.0

    The individual scale items are fine, and the scales for each year are fine.

     

    Statistics

                       povpre20sum   povpresum21
    N   Valid                   90           106
        Missing                105            89
    Mean                     80.91       79.2547
    Std. Deviation           9.259      10.48179
    Minimum                     60         56.00
    Maximum                    101        102.00
    Sum                       7282       8401.00

     

    I don't know what I am doing wrong. 

     

    Thank you for any suggestions.

     

    Barbara

     

     




  • 2.  RE: computing a variable issue

    Posted Wed September 07, 2022 03:57 PM
    Hi.

    Did you get any warnings in the output window?

    ------------------------------
    Rick Marcantonio
    Quality Assurance
    IBM
    ------------------------------



  • 3.  RE: computing a variable issue

    Posted Wed September 07, 2022 04:06 PM

    No, oddly there was not.

     

    Barbara

     






  • 4.  RE: computing a variable issue

    Posted Wed September 07, 2022 04:01 PM
    Looking at the table you provided gives us a clue. I'm thinking that the files are not matched correctly. Did you MATCH the data files on an ID variable of some kind?

    ------------------------------
    Rick Marcantonio
    Quality Assurance
    IBM
    ------------------------------



  • 5.  RE: computing a variable issue

    Posted Wed September 07, 2022 04:01 PM
    Presumably one of the two variables is missing for every case, so a sum computed with + is always missing.  You could use the SUM function to combine them instead, since it ignores missing values.
    As for that one odd case, there must be something different about it.
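
    A minimal sketch of that suggestion, using the two scale totals named in the original post. The names povpre_combined and n_sources are made up for illustration. NVALID counts how many of the two totals are present for each case, so any case with a total in both years (which SUM would add together) can be spotted.

    ```spss
    * Combined pre-test score: SUM ignores missing arguments, unlike +.
    COMPUTE povpre_combined = SUM(povpre20sum, povpresum21).
    * n_sources is a hypothetical helper: number of non-missing totals per case.
    COMPUTE n_sources = NVALID(povpre20sum, povpresum21).
    FREQUENCIES VARIABLES=n_sources.
    ```

    A case with n_sources = 2 has totals from both years and is worth checking by hand. The Statistics table in the first post shows 90 + 106 = 196 valid totals for 195 cases, which would be consistent with exactly one such case.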





  • 6.  RE: computing a variable issue

    Posted Wed September 07, 2022 04:32 PM
    Barbara - I'm pretty sure that the data files were combined in a way you did not intend. That can happen when I use ADD FILES when I meant to use MATCH FILES, for example. Please take a look at the data and see if it's set up the way you want. Or, feel free to send it to me; I'm happy to look at it for you.

    marcantr@us.ibm.com

    ------------------------------
    Rick Marcantonio
    Quality Assurance
    IBM
    ------------------------------



  • 7.  RE: computing a variable issue

    Posted Wed September 07, 2022 04:43 PM

    Bless your heart!  Hopefully the variable names make sense.  The participants from 20 and 21 are not the same people, which is why there are so many missings.  I have not yet combined 20 and 21 variables, so the 2020 participants have their variables, as do the 2021 participants.  That is, of course, what I am attempting to do. 

     

    I have not deleted the string variables, so there are a lot of variables in the dataset. 

     

    I really, really appreciate this. 

     

    Barbara

     




    Attachment(s)

    sav
    20_21 PRE-POST pov_UWE.sav   351 KB 1 version


  • 8.  RE: computing a variable issue

    Posted Wed September 14, 2022 01:59 PM
    Edited by System Admin Fri January 20, 2023 04:35 PM
    For the COMPUTEs to work, the cases that were merged have to be the same (either the same person measured at two points in time or two separate people matched on some dimensions). Either way, the merge would have to happen with some identifying variable that has only one unique value for each case. In this dataset, as you said, they are not the same people. That's why you're getting the results that you are. 

    Here's a simple example so you can see what happens when cases aren't matched.

    ***.

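    * Start from a clean session.
    output close all.
    dataset close all.
    new file.

    * First file: cases 1, 2, 4, 5, 7 and 9 with variables x1 and y1.
    data list free /id x1 y1.
    begin data.
    01 1 3
    02 3 2
    04 1 5
    05 2 4
    07 1 7
    09 2 6
    end data.
    dataset name file1.

    * Second file: cases 3, 6, 8 and 9 with variables x2 and y2.
    data list free /id x2 y2.
    begin data.
    03 3 3
    06 4 6
    08 1 3
    09 5 3
    end data.
    dataset name file2.

    * Merge the two files side by side on id (both are already sorted by id).
    match files /file=file1 /file=file2 /by id.
    dataset name matched_files.

    dataset close file1.
    dataset close file2.

    * Show the merged cases.
    list.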

    ***.

    You see? Cases are sorted by ID, so they appear in the merged dataset in sequence. But they are different cases: the IDs are not the same in the two files, except for case 9, which just happens to be in both datasets by chance. So what is observed in one dataset is missing in the other. A COMPUTE statement that adds variables drawn from both datasets with + returns missing whenever either variable is missing, so it can produce a value only for a case like 9 that appears in both files.
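
    As a short continuation of the example above (a sketch, assuming the matched_files dataset is still open; the names x_plus and x_sum are made up for illustration), the two ways of adding can be compared side by side:

    ```spss
    dataset activate matched_files.
    * Arithmetic +: missing unless both x1 and x2 are present.
    compute x_plus = x1 + x2.
    * SUM function: adds whichever of x1 and x2 are present.
    compute x_sum = sum(x1, x2).
    list variables=id x1 x2 x_plus x_sum.
    ```

    Only case 9 should get a value for x_plus (2 + 5 = 7), while x_sum is filled in for every case. That mirrors the single non-missing score of 181 in the original data.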

    ------------------------------
    Rick Marcantonio
    Quality Assurance
    IBM
    ------------------------------



  • 9.  RE: computing a variable issue

    Posted Thu September 15, 2022 03:47 PM

    Thank you, that makes sense.  I was thinking incorrectly about the command and what it would accomplish.  I need to be subscribed to the beginner version of the community.

     

    Barbara

     






  • 10.  RE: computing a variable issue

    Posted Thu September 15, 2022 05:03 PM
    Edited by System Admin Fri January 20, 2023 04:26 PM
    :)