Thanks Jon. I'll check out all your options.

Original Message:

Sent: 11/29/2022 10:51:00 AM

From: Jon Peck

Subject: RE: Applying a sampling weight

It looks like this reply I sent a few days ago did not post...

We will need to dig into this further, but there are a few important points I want to make, as I think you are drawing the wrong conclusions here.

1. If a weight is zero, the entire case is invisible to a procedure, so it will not contribute to anything - even the count.

2. Most of the procedures, including MEANS, do honor fractional weights, but you are seeing rounded numbers in some places, because the cell is being formatted with zero decimals based on the variable format, which means the displayed value is in effect rounded to an integer. But if you display more decimals in the pivot table, you will see that the fractional part due to the weight appears.

3. Some procedures, including NPAR TESTS, do round the weights, because they have to replicate the cases according to the weight in order to apply the algorithm.

NPAR TESTS: "If fractional weights have been specified, results for all methods will be calculated on the weight rounded to the nearest integer"

NPTESTS, on the other hand, does not round the weights IIRC.
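To make the difference concrete, here is a minimal Python sketch (not SPSS syntax) of how the weighted N can diverge between a procedure that uses fractional weights as they are and one that rounds each case weight first. The data, the function names, and the round-half-up rule are assumptions for illustration only.

```python
import math

def round_half_up(w):
    # Assumed rounding rule for illustration; ties are avoided in the data below.
    return math.floor(w + 0.5)

def weighted_counts(values, weights, round_weights=False):
    """Sum the weights per category, optionally rounding each case weight first."""
    counts = {}
    for v, w in zip(values, weights):
        if round_weights:
            w = round_half_up(w)
        if w <= 0:
            # A zero weight makes the whole case invisible, even to the count.
            continue
        counts[v] = counts.get(v, 0) + w
    return counts

# Hypothetical 2-category variable with fractional (e.g. raked) weights.
values = ["yes", "yes", "no", "no", "yes", "no"]
weights = [1.4, 0.3, 2.6, 0.0, 1.4, 0.7]

fractional = weighted_counts(values, weights)                   # weights used as is
rounded = weighted_counts(values, weights, round_weights=True)  # weights rounded per case

print(fractional, sum(fractional.values()))
print(rounded, sum(rounded.values()))
```

With these made-up weights the fractional totals are about 3.1 and 3.3, while the rounded totals are 2 and 4, so both the N and the category proportions shift.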

------------------------------

Jon Peck

------------------------------

Original Message:

Sent: Wed November 23, 2022 10:16 AM

From: Jon Peck

Subject: Applying a sampling weight

This may be a good time to show your students that the world is a more complicated place than they thought :-)

Weighting is a complicated subject, and statisticians are not all in agreement on how weights should be used. There is a section called "Using Rake-Weighted Data in SPSS Statistics Procedures" in the Raking with IBM SPSS Statistics.pdf file installed with the SPSSINC RAKE extension command that might be helpful.

Weights can arise for several different reasons, including matching control totals, complex samples, importance, and heteroscedastic errors to name a few, and procedures have different ways of handling them. This shows up mainly when weights are not integers as would be the case after raking.

CROSSTABS offers five methods for handling weights but by default rounds the cell counts. FREQUENCIES uses the fractional values as is; however, the counts are displayed with zero decimal places if the variable format has zero decimals. If you expand the decimals in the pivot table or change the variable format setting, you will see the fractional values.
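As a rough illustration of why those choices matter, the following Python sketch computes one cell count five ways: as is, rounding or truncating the accumulated cell total, and rounding or truncating each case weight before accumulating. The mode names and the round-half-up rule are my own labels and assumptions, not the CROSSTABS keywords or its documented tie-breaking.

```python
import math

def round_half_up(x):
    # Assumed rounding rule for illustration only.
    return math.floor(x + 0.5)

def cell_count(weights, mode):
    """Weighted count for one table cell under a hypothetical handling mode."""
    if mode == "asis":
        return sum(weights)                          # keep the fractional total
    if mode == "round_cell":
        return round_half_up(sum(weights))           # accumulate, then round
    if mode == "round_case":
        return sum(round_half_up(w) for w in weights)  # round each weight, then accumulate
    if mode == "truncate_cell":
        return math.floor(sum(weights))              # accumulate, then truncate
    if mode == "truncate_case":
        return sum(math.floor(w) for w in weights)   # truncate each weight, then accumulate
    raise ValueError(f"unknown mode: {mode}")

# Hypothetical fractional weights of the cases falling in one cell.
cell = [1.4, 0.3, 1.4, 0.6]

for mode in ("asis", "round_cell", "round_case", "truncate_cell", "truncate_case"):
    print(mode, cell_count(cell, mode))
```

For this one cell the results range from 2 to 4 around a fractional total of about 3.7, which is the kind of discrepancy that shows up when comparing tables across procedures.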

Some of the nonparametric procedures work by replicating cases according to the weights, so they have to round the weights.

The complex samples procedures treat the weights as arising from the sampling scheme, which means a sampling design must be specified.

In regression, the weights might also be used for heteroscedasticity correction, so the fractional values would be used.

One important improvement in CTABLES is that it can use either the usual weight as set by the WEIGHT CASES command or effective base weighting, which ignores the weight set by WEIGHT CASES and uses an approximation for sampling weights originally specified by Kish, without the need for a sampling plan. An example is included in the document mentioned above.
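For orientation, Kish's effective sample size is commonly written as the squared sum of the weights divided by the sum of the squared weights. The Python sketch below computes that quantity; whether CTABLES computes its effective base in exactly this way is an assumption on my part, and the function name and data are hypothetical.

```python
def kish_effective_n(weights):
    """Kish's approximation: (sum of w)^2 / (sum of w^2)."""
    total = sum(weights)
    total_sq = sum(w * w for w in weights)
    if total_sq == 0:
        raise ValueError("all weights are zero")
    return total * total / total_sq

# Hypothetical sampling weights for five cases.
weights = [1.4, 0.3, 2.6, 1.4, 0.7]
n_eff = kish_effective_n(weights)

print(len(weights), n_eff)
```

Equal weights give an effective N equal to the number of cases; the more the weights vary, the further the effective N falls below it.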

Overall, tabulations in general can use the fractional weights as they are, but modeling procedures such as regression might do things differently if these are sampling weights. Gelman suggests not weighting the data at all but including the variables that determine the weights as additional regressors in the estimation process.

What I tell people who are understandably uncertain about what to do in the modeling scenario is to experiment with the various possibilities, remembering that the underlying assumption in regression is that the same model applies to all the cases. If the weights make a big difference in the coefficient estimates, that suggests there are problems with the model.

I hope this is helpful.

------------------------------

Jon Peck

Original Message:

Sent: Tue November 22, 2022 06:17 PM

From: Peter Galderisi

Subject: Applying a sampling weight

Jon,

Thanks for the incredibly quick reply. I actually have been using the WEIGHT command and still getting this problem. I'll take a look at Complex Samples.

P

Original Message:

Sent: 11/22/2022 3:07:00 PM

From: Peter Galderisi

Subject: RE: Applying a sampling weight

Jon,

I'm having difficulty when weighted data are used with certain analysis procedures:

- Without applying weights, the frequencies I get for a 2-category variable (total and for each category) are consistent whether I run a simple frequencies distribution, a nonparametric binomial test, or a one-sample proportions test (as part of a MEANS procedure). The same is true for the multiple-category NPAR one-variable chi-square.
- With weights applied, the frequencies (N) I get for the latter three are consistent but different from those obtained with a simple frequencies distribution. It's difficult telling students to test to see if their actual frequencies are significantly different from a hypothesized distribution with this happening. When I used these statistics many years ago (version 22?), I didn't have this problem.

I'm assuming this has to do with running trials (and weights get applied differently) but I would like to ask my learned colleague for advice.

Ideas? Many thanks.

Peter

NO WEIGHT:

------------------------------

Peter Galderisi

Original Message:

Sent: Fri July 30, 2021 08:54 AM

From: Jon Peck

Subject: Applying a sampling weight

If you want simple replication or frequency weights, you can just use the WEIGHT command to set the weight variable. For complex samples, you would have a weight variable that reflects the sampling probabilities, which is different from a replication weight.

The CTABLES procedure gives you the option of using the weight as specified by the WEIGHT command, which is a frequency weight, or effective base weighting.

--

Original Message:

Sent: 7/29/2021 10:40:00 AM

From: Claire Tyers

Subject: Applying a sampling weight

Hi all

I need to apply a sampling weight to my data then run a range of basic frequencies and statistics on the weighted data. I soon realised that this isn't possible in the base package so have now purchased complex samples but I'm struggling.

I just want to apply a weight that has already been calculated. The sample isn't stratified or complex, it's just that we got greater participation in a survey from some respondents than others. The survey was originally sent out to an entire population.

I have created a plan file with no strata or clusters but with a sample weight added. I selected WR estimation method.

When I then run complex samples descriptives, the means that are produced are completely different from those a colleague gets using R. R, as I understand it, just multiplies the weight by the value of the numerical variable. I can replicate the R results when I do this manually in Excel.
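For reference, the manual calculation described above is presumably the usual weighted mean: multiply each value by its weight, sum, and divide by the sum of the weights. A minimal Python sketch with made-up numbers follows; the function name and data are hypothetical, and the division by the weight total is my assumption about what the R and Excel calculations do.

```python
def weighted_mean(values, weights):
    """Sum of weight * value divided by the sum of the weights."""
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    total_w = sum(weights)
    if total_w == 0:
        raise ValueError("weights sum to zero")
    return sum(w * x for x, w in zip(values, weights)) / total_w

# Hypothetical scores and precomputed weights.
scores = [10, 20, 30]
w = [1, 1, 2]

print(weighted_mean(scores, w))  # (10 + 20 + 60) / 4 = 22.5
```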

What am I doing wrong? Any help would be much appreciated as a newbie to Complex Samples.

Many thanks

------------------------------

Claire Tyers

------------------------------