Modeler Python scripting – bringing a whole different level of customization to your Modeler streams

By Ted Fischer posted Wed January 27, 2016 05:13 PM

The core principle of IBM SPSS Modeler has long been that users can do complex data analysis and sophisticated model building without programming. Still, there are many times when users want to run Modeler streams without having to click buttons or move nodes, so Modeler has long had a scripting capability. Scripting is needed when running a stream in a fully automated way, such as through Modeler Batch or Collaboration and Deployment Services (C&DS). Even when running Modeler Client by itself, a script can be useful to ensure that multiple steps occur in a certain order. For instance, a model building process may take many hours and need to run overnight. If the build finishes in the middle of the night, considerable time is saved when the model evaluation process kicks off immediately afterward. This is possible through scripting.
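
The overnight scenario can be sketched in plain Python as follows. This is only an illustration of the ordering idea, not Modeler's scripting API: `build_model`, `evaluate_model`, and `run_pipeline` are hypothetical names, and the two stand-in functions take the place of whatever a real script would do to run the model-building and evaluation parts of a stream.

```python
# Hypothetical stand-ins: in a real Modeler script these would run the
# model-building and evaluation parts of a stream.
def build_model():
    return {"name": "overnight_model"}


def evaluate_model(model):
    return "evaluated %s" % model["name"]


def run_pipeline(build, evaluate):
    """Run the build step, then start evaluation as soon as it returns."""
    log = []
    model = build()
    log.append("build finished")
    result = evaluate(model)
    log.append("evaluation finished")
    return result, log


result, log = run_pipeline(build_model, evaluate_model)
print(result)
print(log)
```

Because the evaluation call sits directly after the build call, nobody has to be at the keyboard when the build completes.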


Through Modeler 15, the scripting language was unique to IBM SPSS Modeler and had significant limitations; for instance, it did not allow for programming loops of indefinite duration. In response to many requests, IBM introduced Python scripting in Modeler 16. In the product the scripting is labeled Python, but it is actually Jython, an implementation of Python that runs on the Java platform and allows a mixture of Python and Java. Jython was chosen because the Modeler GUI uses Java, so Java references are needed to work with Modeler objects.
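
To make "loops of indefinite duration" concrete, here is a minimal sketch in plain Python of a loop whose iteration count is not known in advance. The names `build_and_score` and `build_until_good_enough` are hypothetical, and the random score is just a stand-in for building a model and measuring its accuracy; nothing here calls Modeler itself.

```python
import random


def build_and_score(rng):
    """Hypothetical stand-in for building a model and returning its accuracy."""
    return rng.random()


def build_until_good_enough(target, max_attempts, seed=0):
    """Rebuild until the score reaches `target` or the attempts run out.

    How many times the loop body runs depends on the scores seen along
    the way, so it cannot be fixed up front.
    """
    rng = random.Random(seed)
    attempts = 0
    score = 0.0
    while score < target and attempts < max_attempts:
        score = build_and_score(rng)
        attempts += 1
    return score, attempts


score, attempts = build_until_good_enough(target=0.9, max_attempts=50)
print("stopped after %d attempt(s) with score %.3f" % (attempts, score))
```

The `max_attempts` cap is a design choice for the sketch: it keeps an unattended script from running forever if the target is never reached.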

The scripting is a full instance of Jython, so one can in fact run a program that does not refer to Modeler nodes at all. Scripting can also be used to reference or include other programs outside of Modeler. One caveat is that scripts run only on Modeler Client and not on Modeler Server. Thus, we do not recommend using Python scripting for complex data analysis or other intensive computations unless you know that the client computer can handle it. (In the case of C&DS jobs, the C&DS server acts as the client for the purpose of scripting.)

Python scripting does require consulting the documentation to learn how to reference nodes. (Note: even if you are using Modeler 16, you should look at the Modeler 17 or higher documentation, as it has many enhancements in this area.) In addition, there are a couple of really good explanatory articles: one on Paul Brown’s blog and Julian Clinton’s developerWorks article. In my next post, I am going to provide a different example, which relates to my prior post on SQL optimization.