Original Message:
Sent: 3/11/2025 12:42:00 PM
From: Bruce Weaver
Subject: RE: Transformation Macro
Hello @Meni Berger. I understand that you want more compact code. But I first wanted to make sure that we all understood what you want the basic code at the core of the macro to be doing. Given your response, I assume my (non-macrotized) code did that. ;-)
Second, OLS linear regression does not assume or require that any of the variables (explanatory or outcome) be normally distributed. This is a myth. The necessary normality assumption is that the sampling distributions of the parameter estimates (i.e., the coefficients) are approximately normal. And a sufficient normality assumption is that the errors are approximately normally distributed. I've attached a small set of slides that might be helpful.
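A small simulation can illustrate the point about predictors versus errors. This is only a sketch using the Python standard library, not anything from the attached slides: the sample size, seed, coefficients, and helper names (`skewness`, `ols_residuals`) are all assumptions chosen for illustration. It builds a strongly right-skewed predictor, adds normally distributed errors, fits a one-predictor OLS line, and compares the skewness of the predictor with the skewness of the residuals.

```python
import random
import statistics


def skewness(values):
    """Population skewness: third central moment divided by sd cubed."""
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)
    return sum((v - mean) ** 3 for v in values) / (len(values) * sd ** 3)


def ols_residuals(x, y):
    """Residuals from a simple (one-predictor) OLS fit of y on x."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    slope = sxy / sxx
    intercept = my - slope * mx
    return [b - (intercept + slope * a) for a, b in zip(x, y)]


rng = random.Random(2025)  # fixed seed, arbitrary choice

# Hypothetical data: exponential predictor (right-skewed), normal errors.
x = [rng.expovariate(1.0) for _ in range(5000)]
y = [2.0 + 3.0 * a + rng.gauss(0.0, 1.0) for a in x]

residuals = ols_residuals(x, y)

print(f"skewness of X:         {skewness(x):.2f}")
print(f"skewness of residuals: {skewness(residuals):.2f}")
```

Under these assumptions the predictor stays clearly skewed while the residuals do not, which is the distinction the paragraph above draws.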
Third, in many contexts, use of "stepwise" selection is ill-advised. Here are a couple of relevant resources:
I hope this helps.
EDIT: I could not see the PDF I attached, so I've uploaded it to ResearchGate. You can view it here.
------------------------------
Bruce Weaver
------------------------------
Original Message:
Sent: Tue March 11, 2025 11:41 AM
From: Meni Berger
Subject: Transformation Macro
Hello, @Bruce Weaver
Thank you for the code, although I am looking for a more elegant and compact solution that loops through the values, for example !DO 0.25 !TO 4 !BY 0.25.
Regarding the XY issue, the variables are to be used as predictors in OLS Stepwise Regression and as a general template for other procedures that require normalizing variables.
------------------------------
Meni Berger
Original Message:
Sent: Tue March 11, 2025 09:50 AM
From: Bruce Weaver
Subject: Transformation Macro
Hello Meni. You wrote: "Anyway, I have a set of skewed input variables that need to be normalized for modeling."
What kind of model(s) are you talking about, and what role do the variables you wish to transform play in those models? I ask because very often, people may think that a model requires some variable(s) to be normally distributed when in fact there is no such assumption. That makes me wonder if your initial question is an example of the so-called XY problem.
Meanwhile, following up on what @David Dwyer asked, I think it would be helpful to start with some ordinary code (no macros) that carries out the transformations you want, and then convert it into macro code. With that in mind, does the following code do the transformations you want?
NEW FILE.
DATASET CLOSE ALL.
OUTPUT CLOSE ALL.
DATA LIST FREE / X (F1).
BEGIN DATA
1 2 3 4 5 6 7
END DATA.
* COMPUTE !VAR (!PRED**!I-1)/!I.
COMPUTE Res0.25 = (X**0.25-1)/0.25.
COMPUTE Res0.50 = (X**0.50-1)/0.50.
COMPUTE Res0.75 = (X**0.75-1)/0.75.
COMPUTE Res1.00 = (X**1.00-1)/1.00.
COMPUTE Res1.25 = (X**1.25-1)/1.25.
COMPUTE Res1.50 = (X**1.50-1)/1.50.
COMPUTE Res1.75 = (X**1.75-1)/1.75.
COMPUTE Res2.00 = (X**2.00-1)/2.00.
COMPUTE Res2.25 = (X**2.25-1)/2.25.
COMPUTE Res2.50 = (X**2.50-1)/2.50.
COMPUTE Res2.75 = (X**2.75-1)/2.75.
COMPUTE Res3.00 = (X**3.00-1)/3.00.
COMPUTE Res3.25 = (X**3.25-1)/3.25.
COMPUTE Res3.50 = (X**3.50-1)/3.50.
COMPUTE Res3.75 = (X**3.75-1)/3.75.
COMPUTE Res4.00 = (X**4.00-1)/4.00.
FORMATS Res0.25 TO Res4.00 (F5.2).
LIST.
Cheers,
Bruce
------------------------------
Bruce Weaver
Original Message:
Sent: Tue March 11, 2025 08:31 AM
From: Meni Berger
Subject: Transformation Macro
I am working for a client that deploys SPSS on a military network. Introducing Python to his environment was very difficult, so he decided to use whatever he had in his toolbox, meaning Macro language. I am trying to convince him to make the transition to Python. I hope to succeed in that feat.
Speaking of Python, there are even fewer examples of Python implementations in SPSS than there are of macros.
Anyway, I have a set of skewed input variables that need to be normalized for modeling.
For each variable, a set of Box-Cox transformations is applied. I decided to set Lambda to 0.25 increments, starting with 0.25 up to 4.
That will produce a set of 16 new transformed variables for each original skewed input variable.
For each transformed set, I will perform a Frequency procedure (/STATISTICS=SKEWNESS SESKEW KURTOSIS SEKURT) to see which of the Lambda values yields the best SKEWNESS and KURTOSIS (closest to normal).
After that, I will manually select that desired transformed variable and delete the rest of the transformed series.
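The selection step described above can be sketched outside SPSS to make the logic concrete. This is a rough illustration in plain Python, not the FREQUENCIES procedure: the sample is hypothetical gamma-distributed data, the helper names (`box_cox`, `skew_kurt`) are made up for this example, and the skewness and kurtosis are simple population versions, whereas SPSS reports bias-adjusted estimates together with their standard errors.

```python
import random
import statistics


def box_cox(x, lam):
    """Box-Cox power transform for lam != 0 and x > 0."""
    return (x ** lam - 1.0) / lam


def skew_kurt(values):
    """Population skewness and excess kurtosis (not SPSS's adjusted versions)."""
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)
    n = len(values)
    skew = sum((v - mean) ** 3 for v in values) / (n * sd ** 3)
    kurt = sum((v - mean) ** 4 for v in values) / (n * sd ** 4) - 3.0
    return skew, kurt


# Lambda grid 0.25, 0.50, ..., 4.00, built from integers to avoid float drift.
lambdas = [k * 0.25 for k in range(1, 17)]

rng = random.Random(42)  # fixed seed, arbitrary choice
sample = [rng.gammavariate(2.0, 1.5) for _ in range(2000)]  # hypothetical skewed data

# Skewness and kurtosis of each transformed series, keyed by lambda.
results = {}
for lam in lambdas:
    results[lam] = skew_kurt([box_cox(v, lam) for v in sample])

# Pick the lambda whose transformed series has skewness closest to zero.
best = min(lambdas, key=lambda lam: abs(results[lam][0]))

for lam in lambdas:
    skew, kurt = results[lam]
    flag = "  <- smallest |skewness|" if lam == best else ""
    print(f"lambda={lam:4.2f}  skew={skew:7.3f}  kurt={kurt:7.3f}{flag}")
```

Ranking by absolute skewness alone is one possible rule; a combined criterion that also weighs kurtosis would need its own definition of "closest to normal".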
Original Message:
Sent: 3/10/2025 3:02:00 PM
From: David Dwyer
Subject: RE: Transformation Macro
Hi @Meni Berger
It's true. The SPSS macro language has not had significant (if any) development since about SPSS 6.1.4 for Windows. When we introduced Python Programmability in SPSS 14.0, the intent was for users to ultimately migrate themselves and their work to Python. The main strengths of the macro language lie in substitution and iteration to build legitimate SPSS command syntax and execute it. Hopefully you agree that Python is more versatile in this regard and adds functionality as well.
Whichever way you go implementing your approach, can you articulate what it is you are trying to do?
- What would the dataset look like before your manipulations?
- What would it look like after your manipulations?
- What hard rules exist that you might use in SPSS Command Syntax to carry out the manipulations? In short, if you didn't have a computer to do this for you, how would you go about doing it manually?
------------------------------
David Dwyer
Global Solution Consultant
IBM
Original Message:
Sent: Mon March 10, 2025 10:29 AM
From: Meni Berger
Subject: Transformation Macro
Thank you for your code offerings @David Dwyer and @Suman Suhag.
Unfortunately, none of the suggestions you sent me work. I tried @David Dwyer's suggestion to work out the logic of the commands in regular syntax first, with no success.
I've also tried using AI (ChatGPT and Claude) to generate this macro code. Surprisingly, it failed too, no matter how tenacious I was.
I could not find any viable example of such a macro on the web. I was also surprised by how few macro code examples are available online. It looks like the macro language was deprecated at some point, which may be why AI has such trouble figuring it out and producing working code.
------------------------------
Meni Berger
Original Message:
Sent: Wed March 05, 2025 02:17 PM
From: David Dwyer
Subject: Transformation Macro
Hi @Meni Berger,
You have syntax errors in your macro definition code. For example: !TOKENS(num) is the macro function. You have !TOKEN(num) in several places. You're missing a closing parenthesis as well. This is all in the first line.
DEFINE BCT (PRED !TOKENS(1) / RESULT !TOKEN(1) / FIRST !TOKEN(1) / LAST !TOKEN(1)
should maybe be
DEFINE BCT (PRED !TOKENS(1) / RESULT !TOKENS(1) / FIRST !TOKENS(1) / LAST !TOKENS(1) )
But is the macro language really even the way you want to go with this? Note that the SPSS Statistics macro language is for manipulating SPSS Statistics command syntax. Thus, at its heart it is a language for manipulating strings. Those strings, if you do it right, turn out to be legitimate and valid SPSS Statistics commands.
If you are committed to using the macro language to accomplish your goal, perhaps start with a sample dataset and then write the command syntax for the manipulations you want to do. Once you have command syntax that does what you want, then look for ways to condense and iterate it via the macro language.
I've taken your example macro and made it syntactically correct (if not logically correct).
DATASET CLOSE ALL.
NEW FILE.
OUTPUT CLOSE ALL.
DEFINE @BCT (PRED !TOKENS(3) / RESULT !TOKENS(1) / FIRST !TOKENS(1) / LAST !TOKENS(1))
!DO !I = !FIRST !TO !LAST !BY 0.25
!LET !VAR= !CONCAT (!RESULT, !I)
COMPUTE !VAR (!PRED**!I-1)/!I.
EXECUTE.
!DOEND.
!ENDDEFINE.
SET PRINTBACK ON MPRINT ON.
@BCT PRED= PREDICOR1 PREDICOR2 PREDICOR3 RESULT=RES FIRST=0.25 LAST=4.
SET MPRINT OFF.
Note the SET MPRINT ON. diagnostic statement. This setting will show you the command syntax being generated by the macro. For troubleshooting, I strongly recommend using SET MPRINT ON. and SET MPRINT OFF. around your macro call so that you can see exactly what your macro is doing at every step.
For more details on using the macro language, see the SPSS Statistics Command Syntax Reference Guide in the DEFINE--!ENDDEFINE section.
Ultimately you may decide that your particular problem needs a completely different approach. If you are creating data from known starting points, perhaps you want to use INPUT PROGRAM. -- END INPUT PROGRAM. instead.
Perhaps you might want to just write the core command syntax and then iterate it with Python.
Begin with the literal command syntax that does what you want first. Then move on from there.
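The "write the core syntax, then iterate it with Python" idea can be sketched in plain Python as below. This is a hedged illustration rather than a tested SPSS solution: the function name `box_cox_syntax`, the `Res_<predictor>_<lambda>` naming scheme, and the defaults are assumptions made for this example, and the code only builds the command text. The loop counts integer steps and derives each lambda from the step number, which sidesteps looping directly over fractional values.

```python
def box_cox_syntax(predictors, prefix="Res", first=0.25, last=4.0, step=0.25):
    """Return SPSS COMPUTE commands (as text) for a grid of Box-Cox lambdas."""
    # Count whole steps so the loop itself runs over integers.
    n_steps = int(round((last - first) / step))
    lines = []
    for pred in predictors:
        for k in range(n_steps + 1):
            lam = first + k * step
            # Hypothetical naming scheme: Res_<predictor>_<lambda>.
            target = f"{prefix}_{pred}_{lam:.2f}"
            lines.append(f"COMPUTE {target} = ({pred}**{lam:.2f} - 1) / {lam:.2f}.")
    lines.append("EXECUTE.")
    return "\n".join(lines)


# Predictor names as used in the macro call above.
print(box_cox_syntax(["PREDICOR1", "PREDICOR2", "PREDICOR3"]))
```

The printed text can be pasted into a syntax window and checked by eye first. Where the Python integration plug-in is installed, the same string could presumably be passed to `spss.Submit` inside a BEGIN PROGRAM block, but that step is not shown or tested here.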
------------------------------
David Dwyer
Global Solution Consultant
IBM
Original Message:
Sent: Wed March 05, 2025 08:22 AM
From: Meni Berger
Subject: Transformation Macro
Hello, data people!
I need to apply transformations to a series of variables, in 0.25 increments (0.25 to 4).
The idea is to take a series of variables and 'COMPUTE' them into a set of new variables, each transformed with the same basic calculation but with its parameter increased by 0.25 each time.
I tried:
DEFINE BCT (PRED !TOKENS(1) / RESULT !TOKEN(1) / FIRST !TOKEN(1) / LAST !TOKEN(1)
!DO !I = !FIRST !TO !LAST !BY 0.25
!LET !VAR= !CONCAT (!RESULT, !I)
COMPUTE !VARE(!PRED**!I-1)/!I.
EXECUTE.
!DOEND.
!ENDDEFINE
BCT PRED= PREDICOR1 PREDICOR2 PREDICOR3 RESULT=RES FIRST=0.25 LAST=4
I get multiple errors stating that "The DEFINE command includes an invalid keyword specification."
Then I tried AI to suggest a macro with no real results:
DEFINE BCT (PRED = !CHAREND('/')
/RESULT = !CHAREND('/')
/FIRST = !CHAREND('/')
/LAST = !CHAREND('/'))
!LET !PREDCOUNT = 3
!DO !J = 1 !TO !PREDCOUNT
!LET !CURRENTPRED = !UNQUOTE(!EXTRACT(!J, !PRED))
!DO !I = !UNQUOTE(!FIRST) !TO !UNQUOTE(!LAST) !BY 0.25
!LET !VAR = !CONCAT(!RESULT, !CURRENTPRED, "_No", !I)
COMPUTE !VAR = (!CURRENTPRED**!I - 1) / !I.
!DOEND
!DOEND
EXECUTE.
!ENDDEFINE.
Does anyone have an idea what's wrong with my macro?