SPSS Statistics

  • 1.  Strange result of Box-Cox transform by ADP

    Posted Sun September 22, 2024 06:23 AM

    Automatic Data Preparation (ADP) can apply a Box-Cox transformation to the target variable to make a skewed variable more normally distributed.
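
    For reference, the textbook one-parameter Box-Cox transform for a positive variable is written out here. Whether ADP uses exactly this form (for instance, with an added shift constant) is not stated in this thread, so treat it as an assumption. The second line shows why lambda = -1 is the natural candidate when y = 1/x and x is normal: the result is linear in x.

```latex
\[
y^{(\lambda)} =
\begin{cases}
  \dfrac{y^{\lambda} - 1}{\lambda}, & \lambda \neq 0,\\[4pt]
  \ln y, & \lambda = 0,
\end{cases}
\qquad y > 0.
\]
% Special case used in this thread: y = 1/x and \lambda = -1
\[
y^{(-1)} = \frac{y^{-1} - 1}{-1} = 1 - \frac{1}{y} = 1 - x .
\]
```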

    I am observing a strange, incongruous Box-Cox result from ADP that looks to me like a bug. ADP failed to find the optimal lambda parameter of the transform in a case where it surely ought to find it, since the data are simple and the optimal lambda value lies on the grid that ADP searches. Below is an example.

    *Create first a standard normal variate X.
    SET RNG=MC SEED=5676954.
    input prog.
    loop #case= 1 to 1000.
     comp x= normal(1).
     end case.
    end loop.
    end file.
    end input prog.
    exec.
    dataset name data.
    *and, after shifting it to the right so all its values are positive...
    AGGREGATE
      /OUTFILE=* MODE=ADDVARIABLES
      /x_min=MIN(x).
    comp x= x-x_min+1.
    exec.
    *... create the right-skew variate Y as 1/X.
    comp y= 1/x.
    exec.

    *Y is the variate we want to apply Box-Cox transform to.
    *Use ADP to do it.
    comp factor= rnd(uniform(1)). /*(a factor variable; it won't be used, ADP just needs an input field to be mentioned)
    exec.
    var lev factor (nom).
    *ADP doing only Box-Cox.
    ADP
      /FIELDS TARGET=y INPUT=factor 
      /PREPDATETIME DATEDURATION=NO TIMEDURATION=NO EXTRACTYEAR=NO EXTRACTMONTH=NO EXTRACTDAY=NO 
        EXTRACTHOUR=NO EXTRACTMINUTE=NO EXTRACTSECOND=NO
      /SCREENING PCTMISSING=NO UNIQUECAT=NO SINGLECAT=NO
      /ADJUSTLEVEL INPUT=NO TARGET=NO
      /OUTLIERHANDLING INPUT=NO TARGET=NO
      /REPLACEMISSING INPUT=NO TARGET=NO 
      /REORDERNOMINAL INPUT=NO TARGET=NO
      /RESCALE INPUT=NONE TARGET=BOXCOX(MEAN=0 SD=1)
      /TRANSFORM MERGESUPERVISED=NO MERGEUNSUPERVISED=NO BINNING=NONE SELECTION=NO CONSTRUCTION=NO
      /CRITERIA SUFFIX(TARGET='_transformed' INPUT='_transformed')
      /OUTFILE PREPXML='C:\Temp\spssadp_automatic.tmp'.
    TMS IMPORT
      /INFILE TRANSFORMATIONS='C:\Temp\spssadp_automatic.tmp'  MODE=FORWARD (ROLES=UPDATE)
      /SAVE TRANSFORMED=YES
      /OUTFILE SYNTAX='C:\Temp\TransSyntax.sps'.
    GRAPH
      /HISTOGRAM(NORMAL)=Y_transformed.
    *The Y_transformed variate is not much more normal than Y is.
    *And, from the syntax file TransSyntax.sps we can learn that the lambda parameter used was: -3.

    *However, the obviously best lambda should be -1, which transforms Y "back" to a normal variate.
    *ADP does a grid search for the best lambda from -3 to 3 in steps of 0.5, so it must find lambda=-1.
    *Why did it stick to (the incorrect) lambda=-3 instead?
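
    As an independent cross-check of the reasoning above, here is a minimal Python sketch of a Box-Cox grid search over lambda = -3, -2.5, ..., 3. It is not ADP's implementation: it assumes the textbook profile log-likelihood as the selection criterion, uses Python's own random generator (so the data are similar to, not identical with, the SPSS example), and the names boxcox_loglik and best_lambda are made up for illustration.

```python
import math
import random


def boxcox_loglik(y, lam):
    """Profile log-likelihood of the standard Box-Cox transform (y must be > 0)."""
    n = len(y)
    if abs(lam) < 1e-12:
        z = [math.log(v) for v in y]              # lambda = 0 is the log transform
    else:
        z = [(v ** lam - 1.0) / lam for v in y]   # (y^lambda - 1) / lambda
    mean = sum(z) / n
    var = sum((v - mean) ** 2 for v in z) / n     # maximum-likelihood variance
    jacobian = (lam - 1.0) * sum(math.log(v) for v in y)
    return -0.5 * n * math.log(var) + jacobian


def best_lambda(y, grid):
    """Return the grid value with the highest profile log-likelihood."""
    return max(grid, key=lambda lam: boxcox_loglik(y, lam))


# Rebuild data of the same shape as in the SPSS example (assumed seed, not SPSS's RNG).
random.seed(12345)
x = [random.gauss(0.0, 1.0) for _ in range(1000)]
x_min = min(x)
x = [v - x_min + 1.0 for v in x]                  # shift so that min(x) == 1
y = [1.0 / v for v in x]                          # right-skewed variate

grid = [-3.0 + 0.5 * k for k in range(13)]        # -3.0, -2.5, ..., 3.0
lam_hat = best_lambda(y, grid)
print("lambda chosen on the grid:", lam_hat)
print("loglik at -1:", boxcox_loglik(y, -1.0))
print("loglik at -3:", boxcox_loglik(y, -3.0))
```

    Under this criterion the log-likelihood at lambda = -1 should be clearly higher than at lambda = -3 for data built this way, which is why the ADP result looks suspicious.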

    I hope SPSS statisticians or ADP developers will come to answer it.



    ------------------------------
    Kirill Orlov
    ------------------------------


  • 2.  RE: Strange result of Box-Cox transform by ADP

    Posted Sun September 22, 2024 10:48 AM

    Using STATS PREPROCESS VARIABLES,

    STATS PREPROCESS VARIABLES=y ID=id
    SORT=NO DATASET=normal 
    /NONLINEAR DONONLINEAR=YES  METHOD=BOXCOX .

    I get a lambda of 1.006.  If I do Yeo-Johnson,

    STATS PREPROCESS VARIABLES=y ID=id
    SORT=NO DATASET=normalyj 
    /NONLINEAR DONONLINEAR=YES  METHOD=YEOJOHNSON .

    I get a lambda of -1.759.  The histograms look slightly different, but this suggests that the result is not very sensitive to the transformation parameter.  (Image attached)



    ------------------------------
    Jon Peck
    Data Scientist
    JKP Associates
    Santa Fe
    ------------------------------



  • 3.  RE: Strange result of Box-Cox transform by ADP

    Posted Sun September 22, 2024 11:57 AM

    Carrying this a little further, here are the histograms and boxplots including the ADP output.

    It's clear that the ADP transformation didn't work well.



    ------------------------------
    Jon Peck
    Data Scientist
    JKP Associates
    Santa Fe
    ------------------------------



  • 4.  RE: Strange result of Box-Cox transform by ADP

    Posted Sun September 22, 2024 12:09 PM

    Jon, did you get BoxCox lambda -1.006 rather than 1.006?



    ------------------------------
    Kirill Orlov
    ------------------------------



  • 5.  RE: Strange result of Box-Cox transform by ADP
    Best Answer

    Posted Sun September 22, 2024 01:49 PM
    Oops.  I need a bigger font :-)
    -1.006483356943868
