For the COMPUTEs to work, the cases that were merged have to be the same (either the same person measured at two points in time, or two separate people matched on some dimensions). Either way, the merge has to be keyed on an identifying variable that takes a unique value for each case. In this dataset, as you said, they are not the same people. That's why you're getting the results that you are.
Here's a simple example so you can see what happens when cases aren't matched.
***.
output close all.
dataset close all.
new file.
data list free /id x1 y1.
begin data.
01 1 3
02 3 2
04 1 5
05 2 4
07 1 7
09 2 6
end data.
dataset name file1.
data list free /id x2 y2.
begin data.
03 3 3
06 4 6
08 1 3
09 5 3
end data.
dataset name file2.
match files /file=file1 /file=file2 /by id.
dataset name matched_files.
dataset close file1.
dataset close file2.
list.
***.
You see? Cases are sorted by ID, so they appear in the merged dataset in sequence. But they are different cases (the IDs are not the same in the two files, except for one case, 9, which just happens to be in both datasets by chance), so what is missing in one dataset is observed in the other. A COMPUTE statement that applies arithmetic operators such as + to variables drawn from both datasets returns system-missing whenever any operand is missing, so here it can produce a value only for case 9.
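As a rough continuation of the example, the sketch below contrasts the + operator with the SUM function on the merged data. It assumes the matched_files dataset created above is still the active dataset; total_plus and total_sum are hypothetical names chosen only for illustration.

```spss
* Assumes the matched_files dataset from the example above is active.
* total_plus and total_sum are hypothetical variable names.
* The + operator returns system-missing if either operand is missing.
compute total_plus = y1 + y2.
* SUM adds whichever arguments are valid and skips the missing ones.
compute total_sum = sum(y1, y2).
execute.
list variables=id y1 y2 total_plus total_sum.
```

In this toy data, total_plus should be valid only for ID 9, while total_sum is filled for every case. Whether a SUM-style total is meaningful depends on what the combined variable is supposed to represent.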
------------------------------
Rick Marcantonio
Quality Assurance
IBM
------------------------------
Original Message:
Sent: Wed September 07, 2022 04:42 PM
From: Barbara Wise
Subject: computing a variable issue
Bless your heart! Hopefully the variable names make sense. The participants from 20 and 21 are not the same people, which is why there are so many missings. I have not yet combined 20 and 21 variables, so the 2020 participants have their variables, as do the 2021 participants. That is, of course, what I am attempting to do.
I have not deleted the string variables, so there are a lot of variables in the dataset.
I really, really appreciate this.
Barbara
Original Message:
Sent: 9/7/2022 4:32:00 PM
From: Rick Marcantonio
Subject: RE: computing a variable issue
Barbara - I'm pretty sure that the data files were combined in a way you did not intend. That can happen, for example, when ADD FILES is used where MATCH FILES was intended, or the other way around. Please take a look at the data and see if it's set up the way you want. Or, feel free to send it to me; I'm happy to look at it for you.
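For reference, here is a minimal sketch of one possible way to stack two years of different people with ADD FILES so that both years share a single variable. The toy data, the dataset names y2020, y2021, and stacked, and the variables povpre and year are all hypothetical; only the two scale names come from the thread.

```spss
* Hypothetical toy data: different people in each year.
data list free /id povpre20sum.
begin data.
1 80
2 75
end data.
compute year = 2020.
execute.
dataset name y2020.
data list free /id povpresum21.
begin data.
3 82
4 79
end data.
compute year = 2021.
execute.
dataset name y2021.
* Stack the cases, renaming each year's scale to a common name.
add files /file=y2020 /rename=(povpre20sum=povpre)
  /file=y2021 /rename=(povpresum21=povpre).
dataset name stacked.
list.
```

With this layout each person occupies one row, povpre holds the pre-test score regardless of year, and year records which file the row came from.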
marcantr@us.ibm.com
Original Message:
Sent: Wed September 07, 2022 01:32 PM
From: Barbara Wise
Subject: computing a variable issue
I have combined two datasets (2020 and 2021 data). Each dataset has pre-test scales and post-test scales, which I have given separate names because I want to be able to compare the years. I am trying to create a combined pre-test variable and a combined post-test variable to run paired t-tests.
First I used this command, simply summing the two individual years' data:
COMPUTE POVSUMPRE20_21=povpresum21 + povpre20sum.
EXECUTE.
The resulting variable was essentially empty, with all the data missing except, oddly, for one subject with a score of 181, which is higher than is possible on the scale. There is nothing different about this subject: number 88 of 195.
So I tried listing each scale item individually and summing them to create the variable:
COMPUTE POVPRESUM2021=povpre2001 + povpre2002 + povpre2003 + povpre2004 + povpre2005 + povpre2006 +
povpre2007 + povpre2008 + povpre2009 + povpre2010 + povpre2011 + povpre2012 + povpre2013 +
povpre2014 + povpre2015 + povpre2016 + povpre2017 + povpre2018 + povpre2019 + povpre2020 +
povpre2021 + Pov1pre21 + Pov2pre21 + Pov3pre21 + Pov4pre21 + Pov5pre21 + Pov6pre21 + Pov7pre21 +
Pov8pre21 + Pov9pre21 + Pov10pre21 + Pov11pre21 + Pov12pre21 + Pov13pre21 + Pov14pre21 + Pov15pre21
+ Pov16pre21 + Pov17pre21 + Pov18pre21 + Pov19pre21 + Pov20pre21 + Pov21pre21.
EXECUTE.
The same outcome precisely. The same subject 88 has a score of 181, and everyone else is missing.
POVPRESUM2021
        |        | Frequency | Percent | Valid Percent | Cumulative Percent
Valid   | 181.00 |         1 |      .5 |         100.0 |              100.0
Missing | System |       194 |    99.5 |               |
Total   |        |       195 |   100.0 |               |
The individual scale items are fine, and the scales for each year are fine.
Statistics
               |         | povpre20sum | povpresum21
N              | Valid   |          90 |         106
N              | Missing |         105 |          89
Mean           |         |       80.91 |     79.2547
Std. Deviation |         |       9.259 |    10.48179
Minimum        |         |          60 |       56.00
Maximum        |         |         101 |      102.00
Sum            |         |        7282 |     8401.00
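The valid counts above add up to 90 + 106 = 196 across 195 cases, which is consistent with a single case having scores for both years. A quick way to check that overlap might look like the sketch below. It assumes the combined 2020/2021 dataset is active; n_years is a hypothetical helper variable.

```spss
* Assumes the combined 2020/2021 dataset is active.
* n_years is a hypothetical helper variable.
* NVALID counts how many of the listed variables are non-missing per case.
compute n_years = nvalid(povpre20sum, povpresum21).
execute.
frequencies variables=n_years.
```

A case with n_years equal to 2 has both yearly sums, so it is the only kind of case for which povpresum21 + povpre20sum can return a value.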
I don't know what I am doing wrong.
Thank you for any suggestions.
Barbara
#SPSSStatistics