Original Message:
Sent: Sat July 29, 2023 05:00 AM
From: Bindu Krishnan
Subject: A remark on one-sample Wilcoxon test in SPSS
I too agree with Jon's note that the distribution is assumed to be symmetric about the median in the one-sample Wilcoxon test. Further, the distribution of the differences between the sample values and the hypothesized median should be symmetric in such a test. A 'non-parametric test' is commonly known as a 'distribution-free test' because it is not constrained by assumptions about the form of the population distribution. That is, it does not assume any specific distribution. The attached document discusses non-parametric inference procedures in more detail.
------------------------------
Bindu Krishnan
Original Message:
Sent: Thu July 27, 2023 12:48 PM
From: Kirill Orlov
Subject: A remark on one-sample Wilcoxon test in SPSS
Jon, and that is what I was saying.
The logic is: we know that the one-sample Wilcoxon is just a paired-samples Wilcoxon with one of the two variables set to a constant (equal to the tested value). The paired-samples Wilcoxon, as is well known, tests the H0 that the distribution of the differences var1-var2 is symmetric about 0. That means that the one-sample Wilcoxon tests the H0 that the distribution of var is symmetric about the test value.
Put differently, the H0 is that the sum of two randomly chosen differences from that value has equal probability to be + or -. Or, equivalently again, the H0 is that the average of two randomly chosen observations has equal probability to be below or above that value.
This H0 can be violated (towards H1) because of two reasons: (1) distribution is symmetric about a different value than the test value; (2) distribution is not symmetric.
If we assume the distribution is symmetric, only reason (1) is in play, and the one-sample Wilcoxon becomes a test of the median (and then also of the mean) of any symmetric distribution (not necessarily normal or otherwise specific). If we assume the distribution is asymmetric, the test looks purposeless because the assumption contradicts its own H0, doesn't it?
The following piece of code experiments with the H0 (as stated in my 2nd paragraph) and the ways it can be violated.
SET RNG=MT MTINDEX=RANDOM.
set mxloops 1E9.
matrix.
*Let these be possible values of our variable. Say, likert scale of 11 values.
*Let them be equally probable in this example (uniform distribution).
comp vals= {0,1,2,3,4,5,6,7,8,9,10}. /*Real median in this example =5
print vals.
comp testval= 5. /*Let this be our test value (hypothesized median)
*Perform many random samplings.
loop try= 1 to 100000. /*say, 100000
* Randomly select a pair of values from vals.
comp rand= uniform(1,11).
comp min= mmin(rand).
comp max= mmax(rand).
comp ind1= mmax((rand=min) &* (1:11)).
comp ind2= mmax((rand=max) &* (1:11)).
comp val1= vals(ind1). /*Randomly chosen one value
comp val2= vals(ind2). /*Randomly chosen another value
* Compute the sum of their deviations from the testval and check its sign.
* (It is the same as checking the sign of the deviation of the average
* of the two values from the testval).
comp sumdev= (val1-testval)+(val2-testval). /*Sum of two deviations
comp sign= (sumdev>0)-(sumdev<0). /*The sign is either -1, +1, or 0
*PRINT {val1,val2,sign}.
save sign /out= * /var= sign. /*Save to dataset
end loop.
end matrix.
frequencies sign. /*Frequencies; Are proportions of "-1" and "+1" equal?
You can see that the percentages of "-1" and "+1" tend toward equality.
Now set the test value to another, say, 6. The two percentages will no longer be equal. The test value is not equal to the real median and thus the H0 is violated for reason (1); the one-sample Wilcoxon is expected to be significant for this pattern.
Now set the test value back to 5, but make the distribution of values somewhat asymmetric (without changing the median=5). For example, use {0,1,2,3,4,5,6,8,8,10,10}. The distribution shape is not symmetric and thus the H0 is violated for reason (2); the one-sample Wilcoxon is expected to be significant for this pattern.
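The same check can also be done exactly rather than by simulation. The following is only a minimal sketch, under the assumption that the sampling loop above draws each unordered pair of distinct positions in vals with equal probability (which is what taking the positions of the minimum and the maximum of 11 uniform draws amounts to). The names neg, zero and pos are introduced here for illustration only.

```spss
matrix.
*Values and test value: edit these two lines to try the other scenarios.
comp vals= {0,1,2,3,4,5,6,8,8,10,10}.
comp testval= 5.
comp n= ncol(vals).
comp neg= 0.
comp zero= 0.
comp pos= 0.
*Enumerate every unordered pair of distinct positions exactly once.
loop i= 1 to n-1.
loop j= i+1 to n.
comp sumdev= (vals(i)-testval)+(vals(j)-testval). /*Sum of two deviations
comp neg= neg+(sumdev<0).
comp zero= zero+(sumdev=0).
comp pos= pos+(sumdev>0).
end loop.
end loop.
print {neg,zero,pos} /title="Number of pairs with sign -1, 0, +1".
end matrix.
```

With the symmetric set {0,1,...,10} and test value 5, the counts for "-1" and "+1" must coincide by symmetry; any imbalance between them for other inputs is a departure from the H0 as stated above.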
If I am correct (but maybe I am not), then I think SPSS should (a) update the documentation on the one-sample Wilcoxon test; (b) add a one-sample Median test as an explicit option. This test is a weaker but more general alternative to the Wilcoxon: it doesn't assume symmetry of the distribution shape, and it is indeed equivalent to the Sign test with one of the two variables set to the test value. That is what my post was about.
------------------------------
Kirill Orlov
Original Message:
Sent: Thu July 27, 2023 11:58 AM
From: Jon Peck
Subject: A remark on one-sample Wilcoxon test in SPSS
My understanding is that the Wilcoxon test does assume that each distribution is symmetric about the median but does not assume any specific distribution.
This link discusses the one-sample sign test as an alternative.
--
Original Message:
Sent: 7/27/2023 11:32:00 AM
From: Kirill Orlov
Subject: RE: A remark on one-sample Wilcoxon test in SPSS
Bindu, please, just comment then on the following obvious experiment.
*Create a population of quite skewed distribution.
*Say, Uniform(0,1)^3. The median of such population = 0.5^3 = 0.125.
SET RNG=MT MTINDEX=RANDOM.
matrix.
save uniform(1000000,1) /out= * /var= var.
end matrix.
compute var= var**3.
FREQUENCIES VARIABLES=var /FORMAT=NOTABLE /STATISTICS=MEDIAN. /*Median in the population
*Draw a sample (of size about 500 or whatever you like) from the population.
SAMPLE .0005.
GRAPH /HISTOGRAM=var.
FREQUENCIES VARIABLES=var /FORMAT=NOTABLE /STATISTICS=MEDIAN. /*Median in the sample
*Test the sample against the median 0.125.
NPTESTS /ONESAMPLE TEST (var) WILCOXON(TESTVALUE=0.125).
*The p-value is very significant.
Why is it significant (and so much!), if the one-sample Wilcoxon, considered as a test of the median, is - as you say - "distribution free" in the sense that it does not need the assumption of a symmetric population?
------------------------------
Kirill Orlov
Original Message:
Sent: Thu July 27, 2023 04:58 AM
From: Bindu Krishnan
Subject: A remark on one-sample Wilcoxon test in SPSS
The one-sample Wilcoxon test is a non-parametric statistical test used to determine whether there is a significant difference between the median of a single sample and a hypothesized median value. The population distribution need not be symmetric in the one-sample Wilcoxon test because this test is a non-parametric (or distribution-free) test, which means it does not assume any specific distribution for the population from which the sample is drawn. But here we assume that the distribution of the differences between the sample values and the hypothesized median is symmetric. The Wilcoxon test works by ranking the absolute differences between each observation in the sample and the hypothesized median value. Since the test is based on ranks rather than the actual values, it does not rely on the underlying distribution being symmetric.
If we want to test whether the mean of a single sample differs significantly from a specific value, then we use the one-sample t-test, which is a purely parametric test and requires the assumption of normality.
------------------------------
Bindu Krishnan
Original Message:
Sent: Tue July 25, 2023 01:32 PM
From: Kirill Orlov
Subject: A remark on one-sample Wilcoxon test in SPSS
One-sample Wilcoxon Test is available in "New nonparametric tests" (NPTESTS command). This test is equivalent to the Paired-samples Wilcoxon with var1 = the sample vs var2 = constant variable equal to the test value. The test tests H0 that the population median equals the specified test value.
However, the documentation (Help, CSR and Algorithms alike) fails to mention the important assumption of the test: that the population distribution has a symmetric shape. In general, the test's H0 is that the distribution in the population is symmetric about the suggested test value. The H0 can be violated either because (1) the distribution is symmetric about a different value or (2) the distribution is not symmetric per se. One needs to assume the symmetric shape for the test to be a test of the median. The documentation is silent on this, which is dangerous for users. Take a variable (data) with a very skewed distribution and test it against its own observed median: the test will be very significant!
When the symmetry assumption is made, the one-sample Wilcoxon is a powerful nonparametric alternative to the one-sample t-test w.r.t. both the median and the mean, since in a symmetric distribution of a continuous population the mean and the median coincide.
If the population definitely has an asymmetric shape (so that the symmetry assumption is inapplicable), one can use the more general (though less powerful) one-sample Median test instead. The one-sample Median test is not found in SPSS as a special option, but it is fully equivalent to the Sign test between var1 = the sample vs var2 = constant variable equal to the test value.
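The two equivalences described above can be tried with the legacy NPAR TESTS command. This is only a sketch under assumptions: the sample is taken to be in a variable named var, the test value 0.125 is an arbitrary example, and testval is a hypothetical helper variable introduced here to hold the constant.

```spss
* Assumes the sample is in a variable named var.
* The value 0.125 is an arbitrary example of a test value.
COMPUTE testval = 0.125.
EXECUTE.
* Paired-samples Wilcoxon against the constant (the one-sample Wilcoxon).
NPAR TESTS /WILCOXON=var WITH testval (PAIRED).
* Sign test against the constant (the one-sample Median test).
NPAR TESTS /SIGN=var WITH testval (PAIRED).
```

On clearly skewed data tested against the true median, the two procedures may be expected to disagree, with only the Sign test behaving as a test of the median.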
So (consider please as a feature request):
1) Why not make the documentation of the Wilcoxon signed-rank test (both one-sample and paired-samples) more accurate by mentioning the assumption of the symmetry of the differences, necessary for the test to become specifically the test of the mean or median difference, rather than of generic stochastic declination?
2) Why not add one-sample Median test (alongside the one-sample Wilcoxon test) to One Sample nonparametric tests? The test does not assume symmetric population.
3) Please consider expanding the capability of the one-sample Median test so that it can test nonparametrically not only the median, but an arbitrary quantile. This is certainly possible, though I don't currently have a formula or a reference to suggest.
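Regarding item 3, one commonly used approach, offered here as an assumption rather than as the method SPSS would adopt: for a continuous population, under the H0 that the p-th quantile equals a value q0, the number of observations at or below q0 follows a Binomial(n, p) distribution. The existing binomial test with a cut point could therefore serve as a rough quantile test. In the sketch below the variable name var, the quantile level 0.25 and the value 0.2 are all hypothetical.

```spss
* Hypothetical example: H0 that the 0.25 quantile of var equals 0.2.
* Cases with var at or below the cut point 0.2 form the first group.
* The proportion 0.25 is the hypothesized share of that first group.
NPAR TESTS /BINOMIAL(0.25)=var(0.2).
```

With a test proportion of 0.5 and the hypothesized median as the cut point, this reduces to a median test of the same kind as the Sign test against a constant, apart from the handling of observations exactly equal to the test value.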
I hope that SPSS statisticians will read my thread and reply. And if I was mistaken or you disagree in some respect - please let me know.
P.S. By "test value" I mean "the tested value", the hypothesized median. Not the value of a test statistic.
------------------------------
Kirill Orlov
------------------------------