SPSS Statistics


Strange result of Box-Cox transform by ADP

1. Strange result of Box-Cox transform by ADP

Kirill Orlov
Posted Sun September 22, 2024 06:23 AM

Automatic Data Preparation (ADP) can apply a Box-Cox transformation to a target variable to make a skewed variable more normally distributed.

I am observing a strange, incongruous Box-Cox result from ADP that looks to me like a bug. ADP fails to find the optimal lambda parameter of the transform in a case where it surely ought to find it, since the data are simple and the optimal lambda value lies on the search grid that ADP uses. Below is an example.

*Create first a standard normal variate X.
SET RNG=MC SEED=5676954.
input prog.
loop #case= 1 to 1000.
comp x= normal(1).
end case.
end loop.
end file.
end input prog.
exec.
dataset name data.
*and, after shifting it to the right so all its values are positive...
AGGREGATE
/OUTFILE=* MODE=ADDVARIABLES
/x_min=MIN(x).
comp x= x-x_min+1.
exec.
*... create the right-skew variate Y as 1/X.
comp y= 1/x.
exec.

*Y is the variate we want to apply Box-Cox transform to.
*Use ADP to do it.
comp factor= rnd(uniform(1)). /*(a dummy factor variable; it is not used, but ADP requires an input field to be named)
exec.
var lev factor (nom).
*ADP doing only Box-Cox.
ADP
/FIELDS TARGET=y INPUT=factor
/PREPDATETIME DATEDURATION=NO TIMEDURATION=NO EXTRACTYEAR=NO EXTRACTMONTH=NO EXTRACTDAY=NO
EXTRACTHOUR=NO EXTRACTMINUTE=NO EXTRACTSECOND=NO
/SCREENING PCTMISSING=NO UNIQUECAT=NO SINGLECAT=NO
/ADJUSTLEVEL INPUT=NO TARGET=NO
/OUTLIERHANDLING INPUT=NO TARGET=NO
/REPLACEMISSING INPUT=NO TARGET=NO
/REORDERNOMINAL INPUT=NO TARGET=NO
/RESCALE INPUT=NONE TARGET=BOXCOX(MEAN=0 SD=1)
/TRANSFORM MERGESUPERVISED=NO MERGEUNSUPERVISED=NO BINNING=NONE SELECTION=NO CONSTRUCTION=NO
/CRITERIA SUFFIX(TARGET='_transformed' INPUT='_transformed')
/OUTFILE PREPXML='C:\Temp\spssadp_automatic.tmp'.
TMS IMPORT
/INFILE TRANSFORMATIONS='C:\Temp\spssadp_automatic.tmp' MODE=FORWARD (ROLES=UPDATE)
/SAVE TRANSFORMED=YES
/OUTFILE SYNTAX='C:\Temp\TransSyntax.sps'.
GRAPH
/HISTOGRAM(NORMAL)=Y_transformed.
*The Y_transformed variate is not much more normal than Y is.
*And, from the syntax file TransSyntax.sps we can learn that the lambda parameter used was: -3.

*However, the obviously best lambda should be -1, which transforms Y "back" to a normal variate.
*ADP does grid search for the best lambda from -3 to 3 by step 0.5, so it must find lambda=-1.
*Why did it stick to (the incorrect) lambda=-3 instead?

I hope SPSS statisticians or ADP developers will come to answer it.
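
As a cross-check that is independent of SPSS, here is a minimal Python sketch of a Box-Cox grid search over the same grid (-3 to 3 by 0.5). It assumes the standard Box-Cox profile log-likelihood as the selection criterion, which may not be what ADP uses internally, and it replaces the random draw with evenly spaced normal quantiles so that the result is reproducible. The function names are illustrative only.

```python
import math
from statistics import NormalDist, pvariance


def boxcox(values, lam):
    """Box-Cox transform of positive values; lam == 0 is the log limit."""
    if lam == 0:
        return [math.log(v) for v in values]
    return [(v ** lam - 1.0) / lam for v in values]


def profile_loglik(values, lam):
    """Box-Cox profile log-likelihood, up to an additive constant."""
    n = len(values)
    transformed = boxcox(values, lam)
    jacobian = (lam - 1.0) * sum(math.log(v) for v in values)
    return -0.5 * n * math.log(pvariance(transformed)) + jacobian


def best_lambda_on_grid(values, grid):
    """Return the grid value with the highest profile log-likelihood."""
    return max(grid, key=lambda lam: profile_loglik(values, lam))


# Assumption: normal quantiles stand in for the random normal variate X.
n = 1000
z = [NormalDist().inv_cdf((i + 0.5) / n) for i in range(n)]
shift = 1.0 - min(z)                 # shift so that the smallest value is 1
x = [v + shift for v in z]
y = [1.0 / v for v in x]             # the right-skewed variate Y = 1/X

grid = [-3.0 + 0.5 * k for k in range(13)]   # -3, -2.5, ..., 3
best = best_lambda_on_grid(y, grid)
print("best lambda on grid:", best)
```

Under these assumptions the search should settle on lambda = -1, because for Y = 1/X the Box-Cox transform with lambda = -1 is simply 1 - X, a linear function of the normal variate.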

------------------------------
Kirill Orlov
------------------------------
2. RE: Strange result of Box-Cox transform by ADP

Jon Peck

IBM Champion
Posted Sun September 22, 2024 10:48 AM

Using STATS PREPROCESS VARIABLES,

STATS PREPROCESS VARIABLES=y ID=id
SORT=NO DATASET=normal
/NONLINEAR DONONLINEAR=YES METHOD=BOXCOX .

I get a lambda of 1.006. If I do Yeo-Johnson,

STATS PREPROCESS VARIABLES=y ID=id
SORT=NO DATASET=normalyj
/NONLINEAR DONONLINEAR=YES METHOD=YEOJOHNSON .

I get a lambda of -1.759. The histograms look slightly different, but this suggests that the result is not very sensitive to the transformation parameter. (Image attached)

------------------------------
Jon Peck
Data Scientist
JKP Associates
Santa Fe
------------------------------

3. RE: Strange result of Box-Cox transform by ADP

Jon Peck

IBM Champion
Posted Sun September 22, 2024 11:57 AM

Carrying this a little further, here are the histograms and boxplots including the ADP output.

It's clear that the ADP transformation didn't work well.
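
To put a number on the visual impression, the sketch below compares the sample skewness of Y, of Y after Box-Cox with lambda = -3 (the value reported from TransSyntax.sps), and of Y after Box-Cox with lambda = -1. It uses normal quantiles in place of the random draw from the original post, so the figures will differ from those in SPSS. The helper names are illustrative, and the final standardization to mean 0 and SD 1 is omitted because it does not change skewness.

```python
import math
from statistics import NormalDist, fmean


def boxcox(values, lam):
    """Box-Cox transform of positive values; lam == 0 is the log limit."""
    if lam == 0:
        return [math.log(v) for v in values]
    return [(v ** lam - 1.0) / lam for v in values]


def skewness(values):
    """Moment-based sample skewness: m3 / m2**1.5."""
    m = fmean(values)
    m2 = fmean([(v - m) ** 2 for v in values])
    m3 = fmean([(v - m) ** 3 for v in values])
    return m3 / m2 ** 1.5


# Assumption: normal quantiles stand in for the random normal variate X.
n = 1000
z = [NormalDist().inv_cdf((i + 0.5) / n) for i in range(n)]
shift = 1.0 - min(z)                 # shift so that the smallest value is 1
x = [v + shift for v in z]
y = [1.0 / v for v in x]             # the right-skewed variate Y = 1/X

skew_raw = skewness(y)                       # untransformed Y
skew_adp = skewness(boxcox(y, -3.0))         # lambda that ADP chose
skew_expected = skewness(boxcox(y, -1.0))    # lambda that undoes 1/X

print("skewness of Y:              ", round(skew_raw, 3))
print("skewness with lambda = -3:  ", round(skew_adp, 3))
print("skewness with lambda = -1:  ", round(skew_expected, 3))
```

With this construction, lambda = -1 should give skewness of essentially zero, while lambda = -3 overshoots and turns the right skew of Y into a left skew.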

------------------------------
Jon Peck
Data Scientist
JKP Associates
Santa Fe
------------------------------

4. RE: Strange result of Box-Cox transform by ADP

Kirill Orlov
Posted Sun September 22, 2024 12:09 PM

Jon, did you get BoxCox lambda -1.006 rather than 1.006?

------------------------------
Kirill Orlov
------------------------------

5. RE: Strange result of Box-Cox transform by ADP
Best Answer

Jon Peck

IBM Champion
Posted Sun September 22, 2024 01:49 PM

Oops. I need a bigger font :-)
-1.006483356943868

--
Jon K Peck
jkpeck@gmail.com

