SPSS Statistics


Transformation Macro

  • 1.  Transformation Macro

    Posted Wed March 05, 2025 08:23 AM

    Hello, data people!

     

    I need to apply transformations to a series of variables, in 0.25 increments (0.25 to 4).

    The idea is to take a series of variables and COMPUTE from them a set of new variables, each transformed with the same basic calculation, whose parameter increases in steps of 0.25.

    I tried:

     

    DEFINE BCT (PRED !TOKENS(1) / RESULT !TOKEN(1) / FIRST !TOKEN(1) / LAST !TOKEN(1)

    !DO !I = !FIRST !TO !LAST !BY 0.25

    !LET !VAR= !CONCAT (!RESULT, !I)

    COMPUTE !VARE(!PRED**!I-1)/!I.

    EXECUTE.

    !DOEND.

    !ENDDEFINE

    BCT PRED= PREDICOR1 PREDICOR2 PREDICOR3 RESULT=RES FIRST=0.25 LAST=4

     

    I get multiple errors stating that "The DEFINE command includes an invalid keyword specification."

     

    Then I asked an AI to suggest a macro, with no real results:

     

    DEFINE BCT (PRED = !CHAREND('/')

                /RESULT = !CHAREND('/')

                /FIRST = !CHAREND('/')

                /LAST = !CHAREND('/'))

     

    !LET !PREDCOUNT = 3

     

    !DO !J = 1 !TO !PREDCOUNT

      !LET !CURRENTPRED = !UNQUOTE(!EXTRACT(!J, !PRED))

     

     

      !DO !I = !UNQUOTE(!FIRST) !TO !UNQUOTE(!LAST) !BY 0.25

       

        !LET !VAR = !CONCAT(!RESULT, !CURRENTPRED, "_No", !I)

       

        

        COMPUTE !VAR = (!CURRENTPRED**!I - 1) / !I.

      !DOEND

    !DOEND

     

    EXECUTE.

    !ENDDEFINE.

     

     

    Does anyone have an idea what's wrong with my macro?

     

              

    Meni Berger |

    Data Scientist and Head of Tech  Support

    Email  -  Meni@genius.co.il

    11 Menachem Begin st.,  Ramat Gan

    www.genius.co.il


     

     



  • 2.  RE: Transformation Macro

    Posted Wed March 05, 2025 02:17 PM
    Edited by David Dwyer Wed March 05, 2025 02:17 PM

    Hi @Meni Berger,

    You have syntax errors in your macro definition code.  For example: !TOKENS(num) is the macro function.  You have !TOKEN(num) in several places.  You're missing a closing parenthesis as well.  This is all in the first line.

    DEFINE BCT (PRED !TOKENS(1) / RESULT !TOKEN(1) / FIRST !TOKEN(1) / LAST !TOKEN(1)

    should maybe be

    DEFINE BCT (PRED !TOKENS(1) / RESULT !TOKENS(1) / FIRST !TOKENS(1) / LAST !TOKENS(1) )

    But is the macro language really the way you want to go with this? Note that the SPSS Statistics macro language is for manipulating SPSS Statistics command syntax.  Thus, at its heart, it is a language for manipulating strings.  Those strings, if you do it right, turn out to be valid SPSS Statistics commands.

    If you are committed to using the macro language to accomplish your goal, perhaps start with a sample dataset and then write the command syntax for the manipulations you want to do.  Once you have command syntax that does what you want, then look for ways to condense and iterate it via the macro language.

    I've taken your example macro and made it syntactically correct (if not logically correct). 

    DATASET CLOSE ALL.
    NEW FILE.
    OUTPUT CLOSE ALL.
     
    DEFINE @BCT (PRED !TOKENS(3) / RESULT !TOKENS(1) / FIRST !TOKENS(1) / LAST !TOKENS(1))
    !DO !I = !FIRST !TO !LAST !BY 0.25
    !LET !VAR= !CONCAT (!RESULT, !I)
    COMPUTE !VAR = (!PRED**!I-1)/!I.
    EXECUTE.
    !DOEND.
    !ENDDEFINE.
     
    SET PRINTBACK ON MPRINT ON.
    @BCT PRED= PREDICOR1 PREDICOR2 PREDICOR3 RESULT=RES FIRST=0.25 LAST=4.
    SET MPRINT OFF.

    Note the SET MPRINT ON. diagnostic statement.  This setting will show you the command syntax being generated by the macro.  For troubleshooting, I strongly recommend using SET MPRINT ON. and SET MPRINT OFF. around your macro call so that you can see exactly what your macro is doing at every step.

    For more details on using the macro language, see the SPSS Statistics Command Syntax Reference Guide in the DEFINE--!ENDDEFINE section.

    Ultimately you may decide that your particular problem needs a completely different approach.  If you are creating data from known starting points, perhaps you want to use INPUT PROGRAM. -- END INPUT PROGRAM. instead.

    Perhaps you might want to just write the core command syntax and then iterate it with Python.

    Begin with the literal command syntax that does what you want first.  Then move on from there.
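
    A minimal sketch of the "iterate it with Python" idea, under stated assumptions: it only builds the COMPUTE commands as text, so it runs in plain Python. The helper name `box_cox_syntax`, the `_L025`-style name suffix, and the default prefix are hypothetical choices for illustration, not part of the original macro. Inside SPSS Statistics, the resulting list could be handed to `spss.Submit` from a BEGIN PROGRAM block; that step is left as a comment.

```python
from fractions import Fraction


def box_cox_syntax(predictors, prefix="RES", first="0.25", last="4", step="0.25"):
    """Build one COMPUTE command per predictor and lambda (hypothetical helper)."""
    lam, stop, inc = Fraction(first), Fraction(last), Fraction(step)
    commands = []
    while lam <= stop:
        # Fractions keep the 0.25 steps exact, so the final value is not
        # lost to floating-point rounding.
        value = f"{float(lam):.2f}"        # e.g. "0.25"
        suffix = value.replace(".", "")    # e.g. "025", used inside the new name
        for pred in predictors:
            commands.append(
                f"COMPUTE {prefix}_{pred}_L{suffix} = ({pred}**{value} - 1) / {value}."
            )
        lam += inc
    commands.append("EXECUTE.")
    return commands


syntax = box_cox_syntax(["PREDICOR1", "PREDICOR2", "PREDICOR3"])
print("\n".join(syntax[:3]))

# Inside SPSS Statistics (assumption: the Python integration is available):
#   import spss
#   spss.Submit(syntax)
```

    Generating the text first and inspecting it before submitting plays the same role as SET MPRINT ON does for a macro: you can see exactly which commands would run.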



    ------------------------------
    David Dwyer
    Global Solution Consultant
    IBM
    ------------------------------



  • 3.  RE: Transformation Macro

    Posted Mon March 10, 2025 10:29 AM

    Thank you for your code offerings @David Dwyer and @Suman Suhag.

    Unfortunately, none of the suggestions you wrote to me work. I've tried @David Dwyer's suggestion to handle the logical part of the commands in regular syntax, with no success.

    I've also tried using AI (ChatGPT and Claude) to generate this macro code. Surprisingly, it failed too, no matter how tenacious I was.

    I could not find any viable example of such a macro on the web. I was also surprised by how few macro code examples are available online. It looks like at some point the macro language became deprecated, and that may be why AI has such trouble figuring it out and producing working code.



    ------------------------------
    Meni Berger
    ------------------------------



  • 4.  RE: Transformation Macro

    Posted Mon March 10, 2025 03:02 PM

    Hi @Meni Berger
    It's true.  The SPSS macro language has not had significant (if any) development since about SPSS 6.1.4 for Windows. When we introduced Python Programmability in SPSS 14.0, the intent was for users to ultimately migrate themselves and their work to Python.  The main strengths of the macro language lie in substitution and iteration to build legitimate SPSS command syntax and execute it.  Hopefully you agree that Python is more versatile in this regard and adds functionality as well.

    Whichever way you go implementing your approach, can you articulate what it is you are trying to do? 

    • What would the dataset look like before your manipulations?
    • What would it look like after your manipulations?
    • What hard rules exist that you might use in SPSS Command Syntax to carry out the manipulations -- in short, if you didn't have a computer to do this for you, how would you go about doing it manually?


    ------------------------------
    David Dwyer
    Global Solution Consultant
    IBM
    ------------------------------



  • 5.  RE: Transformation Macro

    Posted Tue March 11, 2025 08:32 AM

    I am working for a client that deploys SPSS on a military network. Introducing Python to his environment was very difficult, so he decided to use whatever he had in his toolbox, meaning Macro language. I am trying to convince him to make the transition to Python. I hope to succeed in that feat.

    Speaking of Python, there are even fewer examples of Python implementation in SPSS than there are of macros.

    Anyway, I have a set of skewed input variables that need to be normalized for modeling.

    For each variable, a set of Box-Cox transformations is applied. I decided to set Lambda to 0.25 increments, starting with 0.25 up to 4.

    That will produce a set of 16 new transformed variables for each original skewed input variable.

    For each transformed set, I will perform a Frequency procedure (/STATISTICS=SKEWNESS SESKEW KURTOSIS SEKURT) to see which of the Lambda values yields the best SKEWNESS and KURTOSIS (closest to normal).

    After that, I will manually select that desired transformed variable and delete the rest of the transformed series.
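
    The selection step described above can be sketched outside SPSS as follows. This is only an illustration under assumptions: the function names and the sample values are made up, the transform is the (x**lambda - 1) / lambda form used in this thread, and the skewness is the plain moment-based version, so it may differ slightly from the figure that FREQUENCIES reports.

```python
def box_cox(values, lam):
    """Box-Cox transform (x**lam - 1) / lam for positive x and lam != 0."""
    return [(x ** lam - 1) / lam for x in values]


def skewness(values):
    """Moment-based skewness m3 / m2**1.5 (no small-sample adjustment)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5


def best_lambda(values, lambdas):
    """Return the lambda whose transformed values have skewness closest to 0."""
    return min(lambdas, key=lambda lam: abs(skewness(box_cox(values, lam))))


# Made-up, right-skewed sample purely for illustration.
sample = [1, 1, 2, 2, 3, 4, 6, 9, 15, 30]
grid = [k * 0.25 for k in range(1, 17)]  # 0.25, 0.50, ..., 4.00

for lam in grid[:4]:
    print(lam, round(skewness(box_cox(sample, lam)), 3))
print("closest to symmetric:", best_lambda(sample, grid))
```

    Ranking by absolute skewness is one possible criterion; kurtosis could be added to the key function in the same way.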

     

     

              

    Meni Berger

     

     






  • 6.  RE: Transformation Macro

    Posted Tue March 11, 2025 09:51 AM

    Hello Meni.  You wrote:  "Anyway, I have a set of skewed input variables that need to be normalized for modeling."

    What kind of model(s) are you talking about, and what role do the variables you wish to transform play in those models?  I ask because very often, people may think that a model requires some variable(s) to be normally distributed when in fact there is no such assumption.  That makes me wonder if your initial question is an example of the so-called XY problem.

    Meanwhile, following up on what @David Dwyer asked, I think it would be helpful to start with some ordinary code (no macros) that carries out the transformations you want, and then convert it into macro code.  With that in mind, does the following code do the transformations you want?

    NEW FILE.
    DATASET CLOSE ALL.
    OUTPUT CLOSE ALL.
    
    DATA LIST FREE / X (F1).
    BEGIN DATA
    1 2 3 4 5 6 7
    END DATA.
    
    * COMPUTE !VAR (!PRED**!I-1)/!I.
    
    COMPUTE Res0.25 = (X**0.25-1)/0.25.
    COMPUTE Res0.50 = (X**0.50-1)/0.50.
    COMPUTE Res0.75 = (X**0.75-1)/0.75.
    COMPUTE Res1.00 = (X**1.00-1)/1.00.
    COMPUTE Res1.25 = (X**1.25-1)/1.25.
    COMPUTE Res1.50 = (X**1.50-1)/1.50.
    COMPUTE Res1.75 = (X**1.75-1)/1.75.
    COMPUTE Res2.00 = (X**2.00-1)/2.00.
    COMPUTE Res2.25 = (X**2.25-1)/2.25.
    COMPUTE Res2.50 = (X**2.50-1)/2.50.
    COMPUTE Res2.75 = (X**2.75-1)/2.75.
    COMPUTE Res3.00 = (X**3.00-1)/3.00.
    COMPUTE Res3.25 = (X**3.25-1)/3.25.
    COMPUTE Res3.50 = (X**3.50-1)/3.50.
    COMPUTE Res3.75 = (X**3.75-1)/3.75.
    COMPUTE Res4.00 = (X**4.00-1)/4.00.
    FORMATS Res0.25 TO Res4.00 (F5.2). 
    LIST.

    Cheers,
    Bruce



    ------------------------------
    Bruce Weaver
    ------------------------------



  • 7.  RE: Transformation Macro

    Posted Tue March 11, 2025 11:41 AM

    Hello, @Bruce Weaver

    Thank you for the code, although I am looking for a more elegant and compact solution, one that loops through the values, for example !DO 0.25 !TO 4 !BY 0.25.

    Regarding the XY issue, the variables are to be used as predictors in OLS Stepwise Regression and as a general template for other procedures that require normalizing variables. 



    ------------------------------
    Meni Berger
    ------------------------------



  • 8.  RE: Transformation Macro

    Posted Tue March 11, 2025 12:42 PM
    Edited by Bruce Weaver Tue March 11, 2025 12:55 PM

    Hello @Meni Berger.  I understand that you want more compact code.  But I first wanted to make sure that we all understood what you want the basic code at the core of the macro to be doing.  Given your response, I assume my (non-macrotized) code did that.  ;-) 

    Second, OLS linear regression does not assume or require that any of the variables (explanatory or outcome) be normally distributed.  This is a myth.  The necessary normality assumption is that the sampling distributions of the parameter estimates (i.e., the coefficients) are approximately normal.  And a sufficient normality assumption is that the errors are approximately normally distributed.  I've attached a small set of slides that might be helpful. 
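
    A small simulation can make this point concrete. The sketch below is an illustration with made-up data, not an analysis of any real data set: it draws a strongly skewed predictor, adds normally distributed errors, fits a simple OLS line with the closed-form formulas, and compares the skewness of the predictor with that of the residuals. The seed, sample size, and coefficients are arbitrary assumptions.

```python
import random

random.seed(12345)  # fixed seed so the illustration is repeatable


def skewness(values):
    """Moment-based skewness m3 / m2**1.5."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5


n = 5000
x = [random.expovariate(1.0) for _ in range(n)]             # skewed predictor
y = [2.0 + 3.0 * xi + random.gauss(0.0, 1.0) for xi in x]   # normal errors

# Closed-form simple OLS: slope = Sxy / Sxx, intercept = mean_y - slope * mean_x.
mean_x, mean_y = sum(x) / n, sum(y) / n
sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
sxx = sum((xi - mean_x) ** 2 for xi in x)
slope = sxy / sxx
intercept = mean_y - slope * mean_x
residuals = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]

print("skewness of predictor:", round(skewness(x), 2))
print("skewness of residuals:", round(skewness(residuals), 2))
print("estimated slope:", round(slope, 2))
```

    In this setup the predictor is far from normal, yet the slope is recovered and the residuals are close to symmetric, which is the distribution the assumption is actually about.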

    Third, in many contexts, use of "stepwise" selection is ill-advised.  Here are a couple of relevant resources:

    I hope this helps.

    EDIT:  I could not see the PDF I attached, so I've uploaded it to ResearchGate.  You can view it here



    ------------------------------
    Bruce Weaver
    ------------------------------




  • 9.  RE: Transformation Macro

    Posted Tue March 11, 2025 01:18 PM
    Underlining Bruce's point, the regressors are not even considered to have a distribution - they are not random variables.  And it is only the conditional distribution of the dependent variable, i.e., the error term, that needs a normality assumption.


    --





  • 10.  RE: Transformation Macro

    Posted Thu March 13, 2025 10:37 AM

    Hi @Bruce Weaver, thank you for the resources and advice.

    I am familiar with formal OLS assumptions and the methods used to verify them.

    The current data asset, with its 42 predictors and 211 observations, presents many issues: moderate to low correlations, heteroskedasticity problems, and errors that are barely normal (per normality tests and Q-Q plots). I had no time to bootstrap to draw conclusions about the sampling distribution of the parameter estimates, so I relied on the standard errors, which proved to be a bad idea, as most of the predictors are not significant. I haven't gotten into multicollinearity measures yet...

    As a last-ditch effort, I thought I'd try transforming the predictors. First I used the natural log (COMPUTE with LN), which had little positive effect. I also have a Box-Cox macro I am working on for another project, so why not give Box-Cox a go, since everything is already FUBAR? And here we are now.

     Sunday morning I might get some more predictors and observations so there is a little hope. If this path does not go well, the next step is modeling using Decision Trees.

    Regarding Prof. Frank Harrell's remarks about the "stepwise" method: he is a real prophet of wrath, with his ten commandments of "thou shall not maketh stepwise regression". I promise to redirect my efforts to more advanced feature selection techniques such as L1-regularized regression (LASSO) :)



    ------------------------------
    Meni Berger
    ------------------------------



  • 11.  RE: Transformation Macro

    Posted Tue March 11, 2025 09:56 AM
    Maybe you should consider the STATS PREPROCESS extension command.  It can apply Box-Cox or Yeo-Johnson or other distribution adjustments to a whole batch of variables in one command.  No need to try different lambdas as it will pick the best for you.

    As for Python examples, you might find the Python (and R) book useful. It is available for download via Help > Doc in PDF Format, and it's all about the basics of using Python with SPSS.  Here's a link to the book

    The book examples  are still written in Python 2, but except for the print command, which is a function in Python 3 instead of a statement, they almost all still work.  I converted them to Python 3  and can send you a zip of those if you want to go that way.
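
    To illustrate the print difference, here is a minimal sketch that can be run in any Python 3 interpreter. It asks Python to compile one line written in each style; the helper name `compiles` and the example strings are made up for this illustration.

```python
py2_line = 'print "Hello from SPSS"'   # Python 2 style: print as a statement
py3_line = 'print("Hello from SPSS")'  # Python 3 style: print as a function


def compiles(source):
    """Return True if the current interpreter accepts the source text."""
    try:
        compile(source, "<example>", "exec")
        return True
    except SyntaxError:
        return False


print("statement form accepted:", compiles(py2_line))
print("function form accepted:", compiles(py3_line))
```

    Under Python 3 the statement form is rejected with a SyntaxError before anything runs, so converting an old example usually means adding the parentheses.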


    --





  • 12.  RE: Transformation Macro

    Posted Wed March 12, 2025 10:48 AM

    I've also looked for documentation on SPSS macros.  I know there is some on the SPSS tools site, but it does seem sparse.  



    ------------------------------
    Art Jack
    ------------------------------



  • 13.  RE: Transformation Macro

    Posted Wed March 12, 2025 01:12 PM
    The tools site is not controlled by IBM.  I looked at the macro page sometime last  year and sent a request to the maintainer to correct a number of errors on it, but I never heard back.  The macro documentation in the CSR leaves a lot to be desired.  Several years ago, I tried to persuade some heavy macro users to write a book or at least an article expanding on that doc, but no one did.

    I really dislike the macro language and use it only to make a simple global symbol or a list of, for example, variable names.   Anything beyond that is Python territory IMO.  There is, however, the Python extension command SPSSINC SELECT VARIABLES that generates a macro listing variables selected according to properties such as variable type, patterns in the names, and more.


    --





  • 14.  RE: Transformation Macro

    Posted Thu March 13, 2025 11:05 AM

    This site has some nice gems. At this point, I don't know if the site is still alive, or even if Raynald Levesque is. Another resource is here. Although not as extensive as Raynald's, it has some solid basics.



    ------------------------------
    Meni Berger
    ------------------------------



  • 15.  RE: Transformation Macro

    Posted Thu March 13, 2025 06:55 AM

    Thank you @Jon Peck. This actually looks like a nice add-on. Alas, this is a military network with scarce instances of Python. Although the IDF excels in pounding the enemy to dust, it leaves much to be desired in computing and software implementation. 

    Thank you for the PDF. I am a bit concerned about embedding it into my 'AnythingLLM' and merging it into my SPSS AI Agent. When used as a statement, will print produce an error?

    It's a real shame that IBM is not investing in integrating SPSS with AI and in maintaining other manuals and training material. There is still a large community that can benefit from it.



    ------------------------------
    Meni Berger
    ------------------------------



  • 16.  RE: Transformation Macro

    Posted Thu March 13, 2025 09:25 AM
    SPSS installs its own copy of Python when you install it, so no other installation is required.  The STATS PREPROCESS extension would have to be specifically installed unless it is part of the automatically installed extensions.

    I do agree that SPSS needs to get on the AI bandwagon, but I do not know what might be planned in that area.


    --





  • 17.  RE: Transformation Macro

    Posted Thu March 13, 2025 11:22 AM

    Importing files (of any kind) might prove challenging. This is the reason we are entangled in this macro situation to begin with.

    On the other hand, being able to run Python code without installing a separate Python instance is great news!



    ------------------------------
    Meni Berger
    ------------------------------



  • 18.  RE: Transformation Macro

    Posted Thu March 13, 2025 11:25 AM
    Yes, if Statistics is installed, Python is available, at least for the last several releases.


    --