On a dataset that is generated monthly, I have been raking on 5 dimensions with a total of 33 categories, and I have never had a problem.
This month I added a sixth dimension with 3 categories, and the procedure generated a weight of about .909 for every case (with some variation in the digits after .909). It ran without crashing and there was no error in the log, but the weight was clearly incorrect. I reran it several times and got the same result each time (.909).
I then went back and collapsed two categories in one of the variables, bringing the overall number from 36 (33+3) to 35 (32+3). Raking on six dimensions with 35 categories ran fine.
Also, when I rake I generate the histogram to review. With the 36 categories, the shape of the distribution was what I would expect, but every value on the x-axis was .91.
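A quick way to confirm this symptom numerically, rather than by eye on the histogram, is to check how much the weights vary relative to their mean. The following is a minimal Python sketch, not SPSS syntax. The function name, the 1% threshold, and the sample values are all hypothetical and only mimic the pattern described above.

```python
from statistics import mean, pstdev

def weights_look_degenerate(weights, rel_tol=0.01):
    """Return True when the weights barely vary around their mean.

    rel_tol is an arbitrary, hypothetical threshold on the coefficient
    of variation (standard deviation divided by mean).
    """
    avg = mean(weights)
    if avg == 0:
        return True
    return pstdev(weights) / avg < rel_tol

# Hypothetical values resembling the reported symptom (everything near .909).
suspect = [0.9091, 0.9090, 0.9092, 0.9091]
# Hypothetical values with the kind of spread raking usually produces.
healthy = [0.45, 1.80, 0.95, 1.20]

print(weights_look_degenerate(suspect))
print(weights_look_degenerate(healthy))
```

Weights that fail this kind of check suggest the raking step did not actually adjust the cases, even though no error was logged.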
------------------------------
David Winston
------------------------------
Original Message:
Sent: Wed November 16, 2022 09:33 PM
From: Jon Peck
Subject: Raking - Is there a category limit?
There is a limit of ten dimensions, but there is no limit on categories. However, if the product of the category counts across the variables is large, there may be a lot of empty cells or cells with very small counts, so the results may not be useful. I have had a client who had 1500 categories, though, and got satisfactory results.
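To make the sparse-cell concern concrete, here is a minimal Python sketch (not SPSS syntax) that multiplies the category counts across the dimensions and compares the number of cells with the sample size. The per-dimension layout and the sample size are hypothetical; only the total of 36 categories across six dimensions comes from this thread.

```python
from math import prod

def cell_count(categories_per_dimension):
    """Number of cells in the full cross-classification of the raking variables."""
    return prod(categories_per_dimension)

# Hypothetical layout: six dimensions whose category counts sum to 36.
layout = [8, 7, 7, 6, 5, 3]
n_cases = 1000  # hypothetical sample size

cells = cell_count(layout)
print(f"{sum(layout)} categories across {len(layout)} dimensions -> {cells} cells")
print(f"average cases per cell: {n_cases / cells:.3f}")
```

Under these assumed numbers, most cells in the cross-classification would be empty, which is the situation described above.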
With a lot of categories, it is easier to set them up as a dataset rather than using the dialog box. Here is an example of the syntax you might use for that.
SPSSINC RAKE FINALWEIGHT=WEIGHT /DS1 DS=jobcatds CATVAR=jobcat TOTVAR=value /DS2 DS=minorityds CATVAR=minority TOTVAR=value.
Each dataset would have two columns giving the categories and counts or proportions.
I would recommend looking at the histogram of weights to see what the weighting will do.
The datasets might look like this.
data list list /jobcat value.
begin data
1 .5
2 .30
3 .20
end data.
dataset name jobcatds.
data list list /minority value.
begin data
0 .80
1 .20
end data.
dataset name minorityds.
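For readers who want to see what raking does with targets like these, here is a minimal Python sketch of iterative proportional fitting on a two-way table. It is an illustration only and is not the SPSSINC RAKE implementation. The target proportions are taken from the two datasets above; the function name `rake` and the sample counts are made up.

```python
def rake(counts, row_targets, col_targets, max_iter=100, tol=1e-10):
    """Iterative proportional fitting on a two-way table of sample counts.

    Returns per-cell weights such that the weighted row and column shares
    match the target proportions. Cells with a zero count get weight 0.0.
    """
    total = sum(sum(row) for row in counts)
    fitted = [[float(c) for c in row] for row in counts]
    for _ in range(max_iter):
        # Scale each row so its share of the total matches the row target.
        for i, row in enumerate(fitted):
            row_sum = sum(row)
            if row_sum > 0:
                factor = row_targets[i] * total / row_sum
                fitted[i] = [v * factor for v in row]
        # Scale each column so its share of the total matches the column target.
        for j in range(len(col_targets)):
            col_sum = sum(row[j] for row in fitted)
            if col_sum > 0:
                factor = col_targets[j] * total / col_sum
                for row in fitted:
                    row[j] *= factor
        # Columns now match exactly; stop once the rows also match.
        max_err = max(
            abs(sum(row) / total - row_targets[i]) for i, row in enumerate(fitted)
        )
        if max_err < tol:
            break
    return [
        [f / c if c else 0.0 for f, c in zip(fitted_row, count_row)]
        for fitted_row, count_row in zip(fitted, counts)
    ]

# Hypothetical sample counts: rows are jobcat 1..3, columns are minority 0..1.
sample_counts = [[250, 50], [30, 10], [50, 10]]
jobcat_targets = [0.5, 0.30, 0.20]   # from jobcatds above
minority_targets = [0.80, 0.20]      # from minorityds above

cell_weights = rake(sample_counts, jobcat_targets, minority_targets)
for row in cell_weights:
    print([round(w, 3) for w in row])
```

Every case in a given cell receives that cell's weight, so a healthy result shows weights that differ across cells rather than one near-constant value.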
--
Original Message:
Sent: 11/16/2022 8:16:00 PM
From: David Winston
Subject: Raking - Is there a category limit?
I am raking across several variables. When I use 35 categories in total, everything runs fine (and I can go up to the limit of ten variables). When I go to 36, the resulting weights are almost exactly the same for every case (even when using only 5 variables). If I take one of the variables and collapse two categories to get the total number down to 35, it works fine. Is there a limit on the number of categories?
FYI: I am on a Mac with 64 GB of memory and a 64-bit-only operating system. Thanks!
------------------------------
David Winston
------------------------------