I've made a macro to compute restricted cubic spline (RCS) bases in SPSS. Splines are useful exploratory tools to model non-linear relationships by transforming the independent variables in multiple regression equations. See Durrleman and Simon (1989) for a simple intro. I've largely based my implementation on the various advice Frank Harrell has floating around the internet (see the rcspline.eval function in his Hmisc R package), although I haven't read his book (yet!!).
So here is the SPSS MACRO, and below is an example of its use. You can either give it a number of knots, which it places at default locations based on quantiles of x, or specify the exact knot locations yourself. RCS need at least three knots, because they are restricted to be linear in the tails, and so the macro returns k - 2 bases (where k is the number of knots). The example below uses the default knot locations and then plots the predicted values and 95% prediction intervals superimposed on a scatterplot.
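For reference, the k - 2 bases mentioned above are usually written in the truncated power form that Harrell gives. This is a sketch of that standard formula, not a transcription of the macro. The symbols t_1, ..., t_k (the knots) and S_j (the j-th basis) are my own notation here.

```latex
% Knots t_1 < t_2 < \dots < t_k, with (u)_+ = \max(u, 0).
% One nonlinear basis for each j = 1, \dots, k-2:
S_j(x) = (x - t_j)_+^3
       - (x - t_{k-1})_+^3 \,\frac{t_k - t_j}{t_k - t_{k-1}}
       + (x - t_k)_+^3 \,\frac{t_{k-1} - t_j}{t_k - t_{k-1}},
\qquad j = 1, \dots, k-2.
```

The two correction terms cancel the cubic and quadratic parts beyond the last knot, which is what makes the fit linear in the tails. The original x is entered in the regression alongside these bases, so four knots give x plus two spline terms. Hmisc's rcspline.eval by default also divides each basis by (t_k - t_1)^2 to keep it on a scale similar to x. Check the macro itself to see whether it applies the same scaling.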
FILE HANDLE macroLoc /name = "D:\Temp\Restricted_Cubic_Splines".
INSERT FILE = "macroLoc\MACRO_RCS.sps".
*Example of their use - data example taken from http://www-01.ibm.com/support/docview.wss?uid=swg21476694.
dataset close ALL.
output close ALL.
SET SEED = 2000000.
INPUT PROGRAM.
LOOP xa = 1 TO 35.
LOOP rep = 1 TO 3.
LEAVE xa.
END case.
END LOOP.
END LOOP.
END file.
END INPUT PROGRAM.
EXECUTE.
* EXAMPLE 1.
COMPUTE y1=3 + 3*xa + normal(2).
IF (xa gt 15) y1=y1 - 4*(xa-15).
IF (xa gt 25) y1=y1 + 2*(xa-25).
GRAPH
/SCATTERPLOT(BIVAR)=xa WITH y1.
*Make spline basis.
*set mprint on.
!rcs x = xa n = 4.
*Estimate regression equation.
REGRESSION
/MISSING LISTWISE
/STATISTICS COEFF OUTS R ANOVA
/CRITERIA=PIN(.05) POUT(.10) CIN(95)
/NOORIGIN
/DEPENDENT y1
/METHOD=ENTER xa /METHOD=ENTER splinex1 splinex2
/SAVE PRED ICIN .
formats y1 xa PRE_1 LICI_1 UICI_1 (F2.0).
*Now I can plot the observed, predicted, and the intervals.
GGRAPH
/GRAPHDATASET NAME="graphdataset" VARIABLES=xa y1 PRE_1 LICI_1 UICI_1
/GRAPHSPEC SOURCE=INLINE.
BEGIN GPL
SOURCE: s=userSource(id("graphdataset"))
DATA: xa=col(source(s), name("xa"))
DATA: y1=col(source(s), name("y1"))
DATA: PRE_1=col(source(s), name("PRE_1"))
DATA: LICI_1=col(source(s), name("LICI_1"))
DATA: UICI_1=col(source(s), name("UICI_1"))
GUIDE: axis(dim(1), label("xa"))
GUIDE: axis(dim(2), label("y1"))
ELEMENT: area.difference(position(region.spread.range(xa*(LICI_1+UICI_1))), color.interior(color.lightgrey), transparency.interior(transparency."0.5"))
ELEMENT: point(position(xa*y1))
ELEMENT: line(position(xa*PRE_1), color(color.red))
END GPL.
See the macro for an example of specifying the knot locations. I also added functionality to estimate the basis by groups (using the default quantiles). My motivation was partly to replicate the nice functionality of ggplot2 for making smoothed regression estimates by groups. I don't know off-hand though if having different knot locations between groups is a good idea, so caveat emptor and all that jazz.
I presume this is still needed functionality in SPSS, but if it is not, let me know in the comments. Other spline examples are floating around (see this technote and this Levesque example), but this is the first implementation of restricted cubic splines I've seen.