How Can I Parameterize GGRAPH and GPL code?

By Archive User posted Fri August 19, 2011 04:50 PM

  


 Users of traditional SPSS Statistics syntax are accustomed to using the macro facility to parameterize blocks of syntax, so that the code is more flexible and can be varied without being duplicated and edited.  However, the GGRAPH command, which provides deep access to the capabilities of the graphics engine, specifies the graph using GPL, the Graphics Production Language, and GPL does not work with macro.  How, then, can GPL code be parameterized?  This post explains how to do this and, in the process, how to build a library of your own graphics specifications that removes the chart definition details from your syntax stream.





First, let's look at the syntax for a bar chart as generated by the Chart Builder.  These examples use the employee data.sav file shipped with the software.

image




  • The GGRAPH command is a standard Statistics command and follows all the normal rules for syntax.  Macro can be used with it.


  • The GPL block contains the actual chart specifications, as indicated by the GGRAPH GRAPHSPEC subcommand.  As you can see, it looks different from traditional syntax, and it follows different rules.  GPL syntax is explained in detail in the Help under the GPL topic.  You can do many things with it, but using macro is not one of them.


  • This syntax is completely specific to this one chart.  Changing the title, say, would require manually editing the GUIDE statement that contains text.title (or generating a new command with the Chart Builder).  That is not very good for production work.





If you can't use macro to generalize this, what can you do?  I'll show you how to use Python programmability not only to replace macro but also to build a library of chart definitions that can be shared among different syntax streams.  First, let's see how we could parameterize the title of the chart.  (Real problems will want to do more, but the idea is the same.)  Here is the first version.

image



 




  • The entire GGRAPH command and the GPL code are assigned to the variable cmd inside the BEGIN PROGRAM block.  The text of the command is the same as before except for the title line.  In that line, in place of the title, we have the notation

    %(thetitle)s

    That means to insert the value of the variable thetitle there.  It's just like macro substitution here except that it works!  (I also added a COMMENT line to the GPL.)  The substitution is triggered by the notation above, and the values to substitute come from the

    % locals()

    at the end.


  • The value of thetitle is set at the top of the program block.  The value is enclosed in triple quotes, so it could be multiple lines or text that contained quote characters.


  • The last line of this program uses the spss.Submit function to run the command whose syntax is in cmd.
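The original listing was a screenshot that is not reproduced here, so the following is only a minimal reconstruction under assumptions. The GGRAPH and GPL text approximates what the Chart Builder typically generates for a bar chart of mean salary by job category; the variable names (jobcat, salary), the axis labels, and the title text are assumptions rather than the post's actual values. The try/except guard around the import is an addition so that the string-building part can also run outside Statistics. In a syntax stream, this code would sit between BEGIN PROGRAM. and END PROGRAM.

```python
# Sketch of the body of a BEGIN PROGRAM ... END PROGRAM block.
try:
    import spss  # only available inside SPSS Statistics with Python installed
except ImportError:
    spss = None  # allows the substitution step to run outside Statistics

# The parameter: triple quotes allow multi-line text.
thetitle = """Mean Salary by Employment Category"""

# The whole command, GPL included, is one Python string.
# %(thetitle)s marks where the value of thetitle is inserted;
# "% locals()" supplies the values by variable name.
cmd = """GGRAPH
  /GRAPHDATASET NAME="graphdataset" VARIABLES=jobcat MEAN(salary)[name="MEAN_salary"] MISSING=LISTWISE REPORTMISSING=NO
  /GRAPHSPEC SOURCE=INLINE.
BEGIN GPL
  COMMENT: bar chart with a parameterized title
  SOURCE: s=userSource(id("graphdataset"))
  DATA: jobcat=col(source(s), name("jobcat"), unit.category())
  DATA: MEAN_salary=col(source(s), name("MEAN_salary"))
  GUIDE: axis(dim(1), label("Employment Category"))
  GUIDE: axis(dim(2), label("Mean Current Salary"))
  GUIDE: text.title(label("%(thetitle)s"))
  SCALE: cat(dim(1), include("1", "2", "3"))
  SCALE: linear(dim(2), include(0))
  ELEMENT: interval(position(jobcat*MEAN_salary), shape.interior(shape.square))
END GPL.""" % locals()

# Run the command only when the spss module is actually present.
if spss is not None:
    spss.Submit(cmd)
```

Changing the chart title now means changing only the assignment to thetitle; the GPL text itself stays untouched.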



Using this mechanism, we have generalized the command to allow for any title text.  A real problem would usually have more than one substitution parameter, but the logic is the same.  Refer to the parameter by name in the appropriate part of the GPL and assign its value at the top of the program.  You might also need a little code to parameterize the axis labeling based on variable labels.  That's easy to do, but I won't explain that here.
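For the axis labeling mentioned above, one possible approach (a sketch, not the author's code) is to look up the variable label through the spss module and substitute it the same way as the title. The helper names variable_label and x_axis_guide are hypothetical; GetVariableCount, GetVariableName, and GetVariableLabel are functions of the spss module. The module is passed in as an argument purely so the sketch can be tried without Statistics.

```python
# Template for one GPL axis statement; %(xlabel)s is filled in below.
AXIS_TEMPLATE = 'GUIDE: axis(dim(1), label("%(xlabel)s"))'


def variable_label(varname, spssmod):
    """Return the variable label of varname, or varname itself if unlabeled.

    spssmod is the spss module (or any object offering the same three
    functions); passing it in is an assumption made for this sketch.
    """
    for i in range(spssmod.GetVariableCount()):
        # Variable names in Statistics are not case sensitive.
        if spssmod.GetVariableName(i).lower() == varname.lower():
            return spssmod.GetVariableLabel(i) or varname
    raise ValueError("variable not found: %s" % varname)


def x_axis_guide(varname, spssmod):
    """Build the GPL axis statement for varname using its variable label."""
    xlabel = variable_label(varname, spssmod)
    return AXIS_TEMPLATE % {"xlabel": xlabel}
```

Inside a program block this might be called as x_axis_guide("jobcat", spss), with the result substituted into the GPL text alongside the title.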



This mechanism requires that you install the Python Essentials, which are available (for free) from this site.



So now we have solved the problem of generalizing the GPL, but having generalized this command, we might want to use it in other job streams.  Duplicating the code is always a bad idea.  Python lets us remove the code from the job stream and just refer to it.  It's something like the Statistics INSERT command, but it is more flexible.



Here is the third version of the code where the GGRAPH and GPL code has been removed from the job stream.

image



 



 




  • Now in the program code, we import a library named chartlib and then call a function in that library passing in the title.  chartlib could contain many functions that define different sorts of charts (or do other things).  Now improvements can be made once in chartlib and used by all the job streams that import it.


  • The import statement did not say where to find chartlib.  Python has an elaborate strategy for finding imported modules.  Refer to the Python documentation for the full story, but for now, we will just put chartlib.py in the extensions subdirectory of the SPSS Statistics installation.  Python will find it there.
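The job-stream side described in these two points might look roughly like the sketch below, which would sit between BEGIN PROGRAM. and END PROGRAM. It assumes Python 3. The folder name is hypothetical and is only needed when chartlib.py is kept somewhere Python does not already search; the find_spec guard is an addition so the sketch does nothing where chartlib is not installed, whereas a real job stream would simply import the module and let a missing library fail loudly.

```python
import importlib.util
import sys

# Hypothetical folder holding chartlib.py. Not needed if the module already
# sits in a directory Python searches, such as the extensions subdirectory.
libdir = r"C:\spss\chartlibs"
if libdir not in sys.path:
    sys.path.append(libdir)

# Guard used only to keep this sketch harmless without chartlib;
# a production job would just write "import chartlib".
if importlib.util.find_spec("chartlib") is not None:
    import chartlib
    chartlib.mybarchart("Mean Salary by Employment Category")
```

Appending to sys.path is one of several ways to extend the module search; putting the file in the extensions subdirectory, as described above, avoids the extra lines entirely.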



What remains is to see what the chartlib module looks like.  Here it is.  It looks almost identical to our first parameterized version, except that the chart code has been moved inside a function named mybarchart that has one parameter for the title.  Everything is indented under the function declaration, which starts with def.  The line after def is the docstring, which should be used to document the function.

image

By putting this code inside a function, we open the door to defining many functions in this same module and selecting the one we want in the Statistics syntax stream, passing in any desired parameters.
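Since the module listing was a screenshot that is not reproduced here, the following is a hedged reconstruction of what chartlib.py might contain. The function name mybarchart and its single title parameter come from the text; the GGRAPH and GPL details, the variable names, and the axis labels are assumptions modeled on typical Chart Builder output. Returning the command text and guarding the spss import are additions that make the function easy to try outside Statistics.

```python
"""chartlib: a sketch of a shared library of chart definitions."""
try:
    import spss  # only available inside SPSS Statistics with Python installed
except ImportError:
    spss = None


def mybarchart(thetitle):
    """Draw a bar chart of mean salary by job category with the given title.

    Returns the generated syntax (an addition for this sketch).
    """
    # The syntax text is kept flush left inside the string; only the
    # Python statements are indented under def.
    cmd = """GGRAPH
  /GRAPHDATASET NAME="graphdataset" VARIABLES=jobcat MEAN(salary)[name="MEAN_salary"] MISSING=LISTWISE REPORTMISSING=NO
  /GRAPHSPEC SOURCE=INLINE.
BEGIN GPL
  COMMENT: bar chart with a parameterized title
  SOURCE: s=userSource(id("graphdataset"))
  DATA: jobcat=col(source(s), name("jobcat"), unit.category())
  DATA: MEAN_salary=col(source(s), name("MEAN_salary"))
  GUIDE: axis(dim(1), label("Employment Category"))
  GUIDE: axis(dim(2), label("Mean Current Salary"))
  GUIDE: text.title(label("%(thetitle)s"))
  SCALE: cat(dim(1), include("1", "2", "3"))
  SCALE: linear(dim(2), include(0))
  ELEMENT: interval(position(jobcat*MEAN_salary), shape.interior(shape.square))
END GPL.""" % locals()
    if spss is not None:
        spss.Submit(cmd)
    return cmd
```

Further chart types would be added as additional functions in the same file, each with its own parameters and docstring.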



Summarizing, by parameterizing the GPL code and moving it into a function in our library module, we have generalized the code and made it easy to maintain and share across different job streams.  Although this posting is motivated by the need to parameterize GPL, these techniques can be used with any Statistics code.



There is one more thing we could do to completely hide the Python code in the job stream.  We could use the SPSSINC PROGRAM extension command to provide standard Statistics code for passing the parameters and invoking the relevant function.  I'll leave that for another time, but you can get that extension command from this site and read about it in the module you download.



 






