It's not a giant dataset, but it has a theoretical maximum of 3,823,316,864 tests for a match. The actual amount would, of course, be a lot smaller, but one of the output tables from the procedure gives you an idea. And each test would involve multiple match variables and it would be maintaining 61,778 lists of controls, so it is doing a lot of work. It keeps all these lists in memory, so it could be forcing the operating system to do a lot of swapping to disk, depending on how much free memory it had.
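As a rough check on where that figure can come from, here is a small Python sketch. It assumes the 61,778 control lists correspond to 61,778 cases needing a match, and that the rest of the 123,666 observations reported in the thread are potential controls; that split is an inference from the numbers, not something stated explicitly.

```python
# Back-of-the-envelope check of the theoretical maximum number of match tests.
total_observations = 123_666  # total reported in the thread
demanders = 61_778            # one list of controls is kept per case to be matched

# Assumption: every remaining observation is a potential control (supplier).
suppliers = total_observations - demanders

# Worst case: every demander is tested against every supplier.
max_tests = demanders * suppliers
print(f"{suppliers:,} suppliers, {max_tests:,} possible tests")
```

Under those assumptions the product reproduces the 3,823,316,864 figure exactly.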
--
Original Message:
Sent: 6/1/2023 6:17:00 PM
From: Kelly Clausen
Subject: RE: Is it user error? Propensity Score Matching in SPSS does not seem to be working?
Hi Jon,
Thank you for your explanations above and for all the help. I have 123,666 observations (cases) total in the dataset. I have 61,833 controls. I am using IBM SPSS for Mac V29.0.0.0 with Fuzzy 2.0.1. However, I see now in the Extension Hub that there is an update to Fuzzy V2.1.0, and I have just updated to that version. I'm setting a tight tolerance here because I do want that for my analysis, but this is actually now taking longer than 12 hours and it is not that large a data set. I think something has gotten gummed up here. It might have been the Fuzzy update. Now that I have updated, I will try this again in the morning! Thanks again all for the help!
Yes, using Fuzzy through the Stats PSM command.
------------------------------
Kelly Clausen
------------------------------
Original Message:
Sent: Thu June 01, 2023 05:35 PM
From: Jon Peck
Subject: Is it user error? Propensity Score Matching in SPSS does not seem to be working?
Would you describe the data? How many cases; how many controls; what the match specifications are? Are you using the latest version of FUZZY (2.1.0) or an older one?
I have had clients occasionally report runs that took 12 hours, but that was with millions of cases and using the older algorithm.
One thing that can affect the runtime a lot is the tolerance (FUZZ) values you use. The procedure works by first finding, for each demander case, all the supplier cases that are within the tolerance values. Then it takes the closest set of cases (best match) not already used in a match and removes the selected case from the eligible cases for future demanders. It has a built-in worst-case limit on the number of potential matches for each demander case. There is a new parameter, MAXQUEUE, that specifies how to scale down the limit on the maximum number of supplier candidates to consider for a demander. It must be a positive fraction less than or equal to 1. The dialog and syntax help for FUZZY explain the algorithm in detail. Using this might significantly cut down the runtime at the possible expense of matches that are a little worse.
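To make the description above concrete, here is a minimal Python sketch of that kind of greedy, tolerance-based matching. It is not the FUZZY source code. It assumes a single numeric match variable (for example, a propensity score) and one match per demander. The names `greedy_match` and `base_limit`, the default limit of 1000, and the choice to keep the closest candidates when the list is cut down are all hypothetical, chosen only for illustration.

```python
def greedy_match(demanders, suppliers, fuzz, base_limit=1000, maxqueue=1.0):
    """Illustrative greedy matcher (a sketch, not FUZZY's implementation).

    demanders, suppliers: dicts mapping a case id to a numeric score.
    fuzz: maximum allowed absolute difference between scores.
    base_limit: hypothetical worst-case cap on candidates per demander.
    maxqueue: fraction in (0, 1] that scales the cap down.
    """
    if not 0 < maxqueue <= 1:
        raise ValueError("maxqueue must be a positive fraction <= 1")
    limit = max(1, int(base_limit * maxqueue))

    # Pass 1: for each demander, list the suppliers within tolerance,
    # closest first, truncated to the candidate limit (an assumption).
    candidates = {}
    for d_id, d_score in demanders.items():
        within = [
            (abs(s_score - d_score), s_id)
            for s_id, s_score in suppliers.items()
            if abs(s_score - d_score) <= fuzz
        ]
        within.sort()
        candidates[d_id] = within[:limit]

    # Pass 2: give each demander its closest unused supplier and remove
    # that supplier from the pool available to later demanders.
    used = set()
    matches = {}
    for d_id, cands in candidates.items():
        matches[d_id] = None  # stays None if no eligible supplier remains
        for _distance, s_id in cands:
            if s_id not in used:
                matches[d_id] = s_id
                used.add(s_id)
                break
    return matches


example = greedy_match(
    demanders={"d1": 0.30, "d2": 0.32},
    suppliers={"s1": 0.31, "s2": 0.50, "s3": 0.295},
    fuzz=0.02,
)
print(example)
```

In this sketch the first pass compares every demander with every supplier, which is where a theoretical maximum in the billions comes from, and the candidate lists it builds are what sit in memory. A wider tolerance makes those lists longer, while a smaller `maxqueue` shortens them at the cost of possibly missing a match that a longer list would have found.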
Of course, while the procedure is running, the output dataset doesn't show values, because they are pending until the procedure completes.
If you are using FUZZY through the STATS PSM command, I didn't add MAXQUEUE to that procedure, but maybe I should.
So more details would help to assess the situation.
------------------------------
Jon Peck
Original Message:
Sent: Thu June 01, 2023 05:08 PM
From: Kelly Clausen
Subject: Is it user error? Propensity Score Matching in SPSS does not seem to be working?
Hi David,
Yes, I just let it run and run yesterday and lo and behold! It finally spit out the data file after about 2 days of continuous running! Then I needed to adjust the tolerance a bit, so I did that and started it again, and it has been running non-stop since 6 pm yesterday with the same "?" all over the data set and no error in the output. I can even see that the propensity variable has been added in the Variable View, and I am still waiting for the actual output dataset again. Statistics doesn't seem to be frozen (just running and running) and my Mac is not at all frozen. No crashes either and absolutely no error messages anywhere in the output or from my machine!
------------------------------
Kelly Clausen
Original Message:
Sent: Wed May 31, 2023 10:27 AM
From: David Dwyer
Subject: Is it user error? Propensity Score Matching in SPSS does not seem to be working?
Hi @Kelly Clausen,
Rhonda will be reaching out to you via Support case TS013120819.
At face value, this seems like a resource issue. Whether that is the result of some misspecification in the extension command syntax or some inherent issue in your data remains to be seen.
Have you ever just let the command run? Does Statistics (or your whole Mac) freeze up or crash? When it does, are there errors?
------------------------------
David Dwyer
SPSS Technical Support
IBM Software
Original Message:
Sent: Tue May 30, 2023 06:24 PM
From: Kelly Clausen
Subject: Is it user error? Propensity Score Matching in SPSS does not seem to be working?
Hello!
So I am sure there is some form of user error at work here, but I am doing everything I have done previously when conducting propensity score matching and something is not working correctly in SPSS. First of all, in the output data file, the Variable View has all the variable names, but the only thing showing in the Data View is question marks. Literally "?" in every single field. I get no errors in the output of any kind. And it sounds like my computer is working and working and working and still not outputting the data file. Also, I am using SPSS on a Mac for the first time ever! Is this a problem with SPSS for Mac? Any insights are deeply appreciated. I've been trying to troubleshoot this for days and it still just keeps not really outputting anything but this all-question-mark data set. Thank you for any insights anyone can offer!
------------------------------
Kelly Clausen
------------------------------