IBM SPSS Statistics Version 18 introduced a new variable property: role. The role can be Input, Target, Both, None, Split, or Partition. This new metadata comes from IBM SPSS Modeler and is useful in abstracting and generalizing jobs.
Roles are normally set by the user. Currently, they simply determine the initial settings in some dialog boxes. But if the roles are set correctly, it becomes possible to automate repetitive tasks and raise their level of abstraction. For example, you might need to produce a standard set of analyses and reports across a variety of datasets that have a similar structure but vary in the exact variables they contain or in other details. By abstracting the logic of a job to use roles, measurement levels, custom attributes, and other variable properties, you can reduce the number of versions of a job that need to be developed and maintained. This can save time and reduce the number of errors.
I have seen customer sites where there are huge numbers of job files (syntax, templates, macros, scripts, etc.) that are very similar but duplicated and modified, because the variables coming in are a little different or the coding of variables is a little different. Once you build a big set of jobs like this, making improvements or bug fixes becomes a nightmare, not to mention the extra time it takes to do things this way.
The long-standing macro facility provides some possibilities for abstraction, but it is static and can't use the metadata in a dataset. In contrast, the SPSSINC SELECT VARIABLES command allows you to define sets of variables based on the metadata rather than just a hard-coded list of names. It can use explicit names, patterns in names (for example, all the variable names that contain AGE), measurement level, type (numeric vs. string), custom attributes, and, finally, role, to define sets of variables that can be used in the job. These sets are embodied in, yes, macros. Of course, you could write your own code to use this sort of information, but SELECT VARIABLES can do a lot of this without the need to learn programmability. And it has a dialog box interface as shown here.
[Image: the SPSSINC SELECT VARIABLES dialog box]
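To make the idea of metadata-driven variable sets concrete, here is a minimal sketch in plain Python. It is not the SPSSINC SELECT VARIABLES implementation and does not call any SPSS API: the `VARIABLES` list, the `select_variables` function, and the `as_macro` helper are all hypothetical names invented for illustration, and the metadata is hand-written where a real job would read it from the dataset dictionary.

```python
from fnmatch import fnmatchcase

# Hypothetical metadata for a small dataset. In a real job this would be
# read from the dataset dictionary rather than typed in by hand.
VARIABLES = [
    {"name": "age", "type": "numeric", "level": "scale", "role": "input",
     "attrs": {"Units": "years"}},
    {"name": "agegroup", "type": "numeric", "level": "ordinal", "role": "input",
     "attrs": {}},
    {"name": "region", "type": "string", "level": "nominal", "role": "split",
     "attrs": {}},
    {"name": "q1", "type": "numeric", "level": "ordinal", "role": "input",
     "attrs": {"Module": "core"}},
    {"name": "purchase", "type": "numeric", "level": "nominal", "role": "target",
     "attrs": {}},
]


def select_variables(variables, pattern=None, level=None, vartype=None,
                     role=None, attribute=None):
    """Return the names of variables that meet every criterion supplied.

    Criteria left as None are ignored. `pattern` is a shell-style wildcard
    matched without regard to case; `attribute` is a custom attribute name
    that must be present on the variable.
    """
    selected = []
    for var in variables:
        if pattern is not None and not fnmatchcase(var["name"].lower(),
                                                   pattern.lower()):
            continue
        if level is not None and var["level"] != level:
            continue
        if vartype is not None and var["type"] != vartype:
            continue
        if role is not None and var["role"] != role:
            continue
        if attribute is not None and attribute not in var["attrs"]:
            continue
        selected.append(var["name"])
    return selected


def as_macro(macro_name, names):
    """Build a DEFINE ... !ENDDEFINE block that expands to the given names."""
    return "DEFINE {0}() {1} !ENDDEFINE.".format(macro_name, " ".join(names))


inputs = select_variables(VARIABLES, role="input")
print(as_macro("!inputs", inputs))
```

The job syntax can then refer to `!inputs` instead of a fixed list of names, so the same job adapts to whatever variables carry the Input role in the current dataset.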
For example, suppose you have a mostly standard questionnaire that is used in many studies, but it has a few custom questions that vary from study to study, or some variables are sometimes omitted. You need to produce tabulations and estimate similar models for these studies. By intelligent use of the metadata, including role, you can perhaps have one master job rather than dozens. This leaves the analyst or researcher free to focus on the brain-work part of the job rather than the tedious, mechanical, and error-prone parts. If you have a data supplier who collects and prepares your datasets, you can instruct them on what roles and custom attributes should be defined. Then your analysis syntax can, at least in part, be based on these properties.
Custom attributes, first introduced in SPSS version 14, can also hold metadata such as questionnaire text, interviewer instructions, measurement units, or anything else that is useful in documenting your data or in programmatically manipulating it. In syntax, these can be created with the VARIABLE ATTRIBUTE command. (There is also a DATAFILE ATTRIBUTE command.) Roles can be defined with the VARIABLE ROLE command. Attributes and roles can also be defined in the Data Editor or the Define Variable Properties dialog. They all persist with the saved data, and they can be used in Modeler, too.
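If a data supplier or a preparation job needs to stamp roles and attributes onto many datasets, the syntax can be generated rather than typed. The following is only a sketch under stated assumptions: `variable_role_syntax` and `variable_attribute_syntax` are hypothetical helper names, the variable names are made up, and the functions just build command text as strings. Running that text inside SPSS Statistics is left to whatever mechanism you already use for submitting syntax.

```python
# Role keywords accepted as subcommands of VARIABLE ROLE.
VALID_ROLES = ("INPUT", "TARGET", "BOTH", "NONE", "PARTITION", "SPLIT")


def variable_role_syntax(roles):
    """Build a VARIABLE ROLE command.

    `roles` maps a role keyword (any case) to a list of variable names.
    """
    lines = ["VARIABLE ROLE"]
    for role, names in roles.items():
        keyword = role.upper()
        if keyword not in VALID_ROLES:
            raise ValueError("Unknown role: {0}".format(role))
        lines.append("  /{0} {1}".format(keyword, " ".join(names)))
    return "\n".join(lines) + "."


def variable_attribute_syntax(names, attributes):
    """Build a VARIABLE ATTRIBUTE command for the given variables.

    `attributes` maps a custom attribute name to its value. Single quotes
    inside a value are doubled so the quoted string stays valid.
    """
    parts = []
    for attr, value in attributes.items():
        quoted = str(value).replace("'", "''")
        parts.append("{0}('{1}')".format(attr, quoted))
    return "VARIABLE ATTRIBUTE VARIABLES={0} ATTRIBUTE={1}.".format(
        " ".join(names), " ".join(parts))


# Hypothetical variable names, for illustration only.
print(variable_role_syntax({"input": ["age", "income"], "target": ["purchase"]}))
print(variable_attribute_syntax(["age"], {"Units": "years"}))
```

Keeping the role and attribute assignments in one small table like this means the specification you hand to a data supplier and the syntax that applies it cannot drift apart.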
In summary, it's all about generalization and automation. Role is just one more attribute that can be used in this effort.
SPSSINC SELECT VARIABLES can be obtained from the SPSS Community and requires the Python Programmability plugin/essentials.