IBM SPSS Statistics provides several mechanisms for looping in transformations or over groups within a procedure. Two new extension commands, SPSSINC SPLIT DATASET and SPSSINC PROCESS FILES, expand the looping capabilities.
Using the standard capabilities, you can create loops in several ways.
Transformations can contain general loops using LOOP, and you can loop over variables with DO REPEAT. Since transformations implicitly run within the case processing loop, you cannot include procedures within these loops.
SPLIT FILE lets procedures iterate over contiguous subgroups of the case data. The procedure's Viewer output either combines the output for all the groups into a single table or produces a set of separate tables and charts organized by group.
Python programmability allows Pythonistas to iterate over collections of files or variables, among other things. It is very general but requires Python knowledge.
What none of these methods lets you do is apply a set of commands to a collection of inputs and organize the entire set of output by group using regular SPSS syntax. For example, you might want to process a set of data files, run several transformations and procedures, and save all the output for each input file to a separate document. You might also want to save all the transformed datasets. The new commands address problems like this.
To generalize the split file idea, you first use SPSSINC SPLIT DATASET to make a directory of datasets, one per split value. The native way to do such an operation is to use the XSAVE transformation command with appropriate DO IF conditions for each group. This works well, but it has three problems. First, you need an exhaustive list of all the split values. Second, you have to write a lot of code. Third, the number of XSAVE commands that you can use in a single transformation block is limited. Prior to version 18, the limit was ten. For newer versions the limit is 64. So you have to count the commands and divide your code into separate blocks. And once you have all this working, if a new split value appears, you have to revise the code, assuming you notice the new value at all.
SPSSINC SPLIT DATASET eliminates these problems. It figures out what split values occur and generates the requisite syntax, taking into account the XSAVE limit. It lets you choose whether to name the outputs by variable values, labels, or sequential numbers, and it can produce a listing of the files created for use in later processing. No risk of unnoticed new values. And the data do not need to be sorted by the split values.
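To make the idea concrete, here is a rough Python sketch of the kind of syntax generation described above. It is not the actual SPSSINC SPLIT DATASET implementation. The function name build_split_syntax, the output file naming scheme, and the restriction to numeric split values are all assumptions made for illustration. It discovers the distinct split values, writes one DO IF / XSAVE pair per value, and starts a new transformation block whenever the XSAVE limit is reached.

```python
XSAVE_LIMIT = 64  # per-block limit quoted above for newer versions


def build_split_syntax(variable, values, out_dir, limit=XSAVE_LIMIT):
    """Return a list of SPSS syntax blocks, each with at most `limit` XSAVEs.

    Hypothetical helper: assumes a numeric split variable and names each
    output file <variable>_<value>.sav inside out_dir.
    """
    # The split values are discovered from the data, so no hand-written list
    # is needed and the input does not have to be sorted.
    unique_values = sorted(set(values))
    blocks = []
    for start in range(0, len(unique_values), limit):
        lines = []
        for value in unique_values[start:start + limit]:
            lines.append(f"DO IF ({variable} = {value}).")
            lines.append(f"XSAVE OUTFILE='{out_dir}/{variable}_{value}.sav'.")
            lines.append("END IF.")
        # EXECUTE closes the transformation block and forces the data pass.
        lines.append("EXECUTE.")
        blocks.append("\n".join(lines))
    return blocks


if __name__ == "__main__":
    # A tiny limit is used here only to show the chunking.
    for block in build_split_syntax("region", [3, 1, 2, 1, 3], "/mydata", limit=2):
        print(block)
        print()
```

Each generated block costs one pass over the data, which is why the sketch packs as many XSAVE commands into a block as the limit allows.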
SPSSINC PROCESS FILES addresses the other side of the problem. It accepts an input specification that could be something like a file wildcard, e.g., /mydata/*.sav, or a file that lists the files to process, such as the one produced by SPSSINC SPLIT DATASET. Then it applies the contents of a syntax file to each input, i.e., it loops over the inputs. It defines file handles and macros representing the input and output parameters for each file processed, so you can refer to these explicitly in the iterated syntax. It can write an individual Viewer output file for each input file, or it can produce a single Viewer file with all the output. These automatic files get names based on the input file names, but, of course, you can do other things in the syntax file. It can also produce a log listing all the actions taken and whether any serious errors occurred.
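The loop itself can be pictured with a short Python sketch. Again, this is only an illustration under stated assumptions, not the command's real code. The function name build_job_syntax and the macro name !job_input are made up for this example, the real command's handle and macro names may differ, and the sketch leaves out file handles and logging entirely. It expands a wildcard, and for each matching file it builds syntax that defines a macro holding the input path, inserts the shared syntax file, and saves a Viewer file named after the input.

```python
import glob
import os


def build_job_syntax(pattern, syntax_file, output_dir):
    """Return one SPSS syntax string per input file matching `pattern`.

    Hypothetical helper: !job_input is an illustrative macro name that the
    iterated syntax file could use to refer to the current input.
    """
    jobs = []
    for path in sorted(glob.glob(pattern)):
        # Name the Viewer file after the input file, as described above.
        stem = os.path.splitext(os.path.basename(path))[0]
        viewer_file = os.path.join(output_dir, stem + ".spv")
        jobs.append("\n".join([
            "OUTPUT NEW.",  # start a fresh Viewer document for this input
            f"DEFINE !job_input () '{path}' !ENDDEFINE.",
            f"INSERT FILE='{syntax_file}'.",  # run the shared syntax file
            f"OUTPUT SAVE OUTFILE='{viewer_file}'.",
        ]))
    return jobs


if __name__ == "__main__":
    for job in build_job_syntax("/mydata/*.sav", "/mydata/job.sps", "/myoutput"):
        print(job)
        print()
```

Inside the iterated syntax file, a line such as GET FILE=!job_input. would then open whichever input is current on that pass.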
SPSSINC PROCESS FILES is not limited to SAV files or even data. It's up to you what you want to loop over.
SPSSINC PROCESS FILES answers the long-standing request for a way to put procedures inside loops. With these new tools, you can now easily create general transformation loops, loops over variables, procedure loops over groups, or entire job loops over arbitrary inputs, whether or not you are a Python person.
These commands can be downloaded from SPSS Developer Central (www.spss.com/devcentral). They require at least IBM SPSS Statistics Version 17 and the Python programmability plug-in.
P.S. Both of these commands have dialog box interfaces as well as standard SPSS-style syntax.
I hope you will find these useful.