SPSS Statistics

  • 1.  What is the purpose of OUTPUT CREATE command?

    Posted Thu June 09, 2022 05:35 AM
    OUTPUT CREATE is a new command, available since SPSS 28. In my opinion, the documentation is rather terse and covers only the basics. However, I think it may be a very useful feature, which is why I'm asking for more details:

    1. What was the main reason for introducing this command? What is the point of using it?
    2. What are its advantages over syntax commands and the Python/R features?
    3. Is there a PDF explaining this command and its usage? It would be very useful, especially if it included the full StatJSON schema.
    4. Will more examples of creating JSON output be provided? The example in the online documentation shows only how to make a simple graph. How do I make a table?
    I know my question is a bit general, so any help would be great.

    ------------------------------
    Konrad Gałuszko
    ------------------------------

    #SPSSStatistics


  • 2.  RE: What is the purpose of OUTPUT CREATE command?

    Posted Thu June 09, 2022 10:24 AM
    Hi Konrad,

    We created this StatJSON format for internal use last year (for the Kernel Ridge Regression procedure), and we thought it would be great to expose it to users, so we added this command. Eventually we want to expand it and make it even easier to use. Right now it is still in its infancy.

    I agree we need to add to the documentation.  Currently all we have is the schema definition.

    We created a more detailed example syntax file that we hope will ship with Statistics version 29.0.0.0. I have included the sample file below for reference. Note that the table requires the sample file `Employee Data.sav` to be open in order to work properly.


    * Encoding: UTF-8.
    * Sample OUTPUT CREATE command.
    * Open sample file "Employee Data.sav" before running for best results.
    OUTPUT CREATE
        /SPEC SOURCE=INLINE.
    BEGIN STATJSON
    {
      "procedure": {
        "items": [
          {
            "table": {
              "caption": {
                "footnote_refs": [
                  1
                ], 
                "value": "This table is the first table created"
              }, 
              "cells": [
                [
                  1, 
                  0.5, 
                  1, 
                  {
                    "subscripts": "a,c,e", 
                    "value": 3.0
                  }, 
                  0.857
                ], 
                [
                  1.1, 
                  1.5, 
                  1, 
                  3, 
                  0.854
                ], 
                [
                  1.2, 
                  0.5, 
                  1, 
                  3, 
                  0.852
                ], 
                [
                  0.9, 
                  0.5, 
                  1, 
                  3, 
                  0.848
                ], 
                [
                  0.8, 
                  0.5, 
                  1, 
                  3, 
                  0.844
                ], 
                [
                  1, 
                  0.5, 
                  null, 
                  null, 
                  0.835
                ], 
                [
                  0.9, 
                  0.5, 
                  null, 
                  null, 
                  0.833
                ], 
                [
                  1.1, 
                  0.5, 
                  null, 
                  null, 
                  0.831
                ], 
                [
                  0.7, 
                  0.5, 
                  1, 
                  3, 
                  0.829
                ]
              ], 
              "corner": {
                "value": "Sample corner text"
              }, 
              "default_cell_format": {
                "decimals": 3, 
                "type": "F"
              }, 
              "dimensions": {
                "columns": [
                  {
                    "descendants": [
                      "Alpha", 
                      "Gamma", 
                      "Coef0", 
                      {
                        "default_cell_format": {
                          "decimals": 0
                        }, 
                        "value": "Degree"
                      }, 
                      "Mean Test Subset R Square"
                    ], 
                    "show_dimension_categories": true, 
                    "show_label": false, 
                    "value": "Statistics"
                  }
                ], 
                "rows": [
                  {
                    "descendants": [
                      "Polynomial", 
                      "Polynomial", 
                      "Polynomial", 
                      "Polynomial", 
                      "Polynomial", 
                      {
                        "descendants": [
                          "RBF", 
                          "RBF", 
                          "RBF"
                        ], 
                        "value": "RBF Group"
                      }, 
                      "Polynomial"
                    ], 
                    "value": "Kernel"
                  }
                ]
              }, 
              "footnotes": [
                "Dependent Variable: y", 
                "Model: x1, x2", 
                "Number of crossvalidation folds: 5"
              ], 
              "hide_title": false, 
              "max_data_column_width": 300, 
              "min_data_column_width": 200, 
              "name": "Model Comparisons", 
              "title": {
                "footnote_refs": [
                  0, 
                  1, 
                  2
                ], 
                "value": 2, 
                "variable": "jobcat"
              }
            }
          }, 
          {
            "text": {
              "content": "Some text content", 
              "name": "A Text Title for the Nav Pane"
            }
          }, 
          {
            "warning": {
              "text": "This is an example of a warning message."
            }
          }, 
          {
            "notes": [
              {
                "cell_value": "This is an example of a python error message.", 
                "row_header": "Python Errors"
              }
            ]
          }, 
          {
            "graph": {
              "X": {
                "data": [
                  0, 
                  1, 
                  2, 
                  3, 
                  4, 
                  5, 
                  6, 
                  7, 
                  8, 
                  9, 
                  10, 
                  11, 
                  12, 
                  13, 
                  14, 
                  15
                ], 
                "label": "y"
              }, 
              "Y": {
                "data": [
                  0, 
                  1, 
                  4, 
                  9, 
                  16, 
                  25, 
                  36, 
                  49, 
                  64, 
                  81, 
                  100, 
                  121, 
                  144, 
                  169, 
                  196, 
                  225
                ], 
                "label": "Predicted Value"
              }, 
              "name": "Scatterplot of y by Predicted Value", 
              "title": "Scatterplot of y by Predicted Value", 
              "type": "Scatterplot"
            }
          }, 
          {
            "graph": {
              "X": {
                "data": [
                  1, 
                  2, 
                  3, 
                  4
                ], 
                "label": "Some Values"
              }, 
              "Y": {
                "data": [
                  100, 
                  200, 
                  50, 
                  400
                ], 
                "label": "Other Values"
              }, 
              "name": "Area Test", 
              "templates": [
                "/Applications/IBM SPSS Statistics/Resources/Looks/Mellow.sgt"
              ], 
              "title": "Area of values", 
              "type": "Area"
            }
          }, 
          {
            "graph": {
              "X": {
                "data": [
                  "male", 
                  "female", 
                  "male", 
                  "female"
                ], 
                "label": "Gender"
              }, 
              "Y": {
                "data": [
                  100, 
                  200, 
                  150, 
                  250
                ], 
                "label": "Count"
              }, 
              "name": "Counts by Gender", 
              "position_modifier": "stack", 
              "split": {
                "data": [
                  "red", 
                  "red", 
                  "red", 
                  "green"
                ], 
                "label": "Color"
              }, 
              "title": "Counts by Gender", 
              "type": "Bar"
            }
          }, 
          {
            "graph": {
              "X": {
                "data": [
                  1, 
                  2, 
                  3, 
                  4, 
                  5
                ], 
                "label": "Some Values"
              }, 
              "Y": {
                "data": [
                  100, 
                  200, 
                  50, 
                  400, 
                  500
                ], 
                "label": "Other Values"
              }, 
              "name": "Line Test", 
              "title": "Line of values", 
              "type": "Line"
            }
          }, 
          {
            "graph": {
              "X": {
                "data": [
                  "male", 
                  "female"
                ]
              }, 
              "Y": {
                "data": [
                  100, 
                  200
                ]
              }, 
              "name": "Pie Test", 
              "title": "Pie of values", 
              "type": "Pie"
            }
          }, 
          {
            "graph": {
              "X": {
                "data": [
                  1, 
                  1, 
                  1, 
                  1, 
                  1, 
                  1, 
                  1, 
                  1, 
                  2, 
                  2, 
                  2, 
                  3, 
                  3, 
                  3, 
                  4, 
                  4, 
                  4, 
                  4, 
                  4, 
                  4, 
                  4, 
                  4, 
                  4
                ], 
                "label": "Salary"
              }, 
              "name": "Histogram Test", 
              "title": "Histogram", 
              "type": "Histogram"
            }
          }, 
          {
            "graph": {
              "X": {
                "data": [
                  "male", 
                  "female", 
                  "female", 
                  "female", 
                  "female", 
                  "male", 
                  "male", 
                  "female", 
                  "male", 
                  "male"
                ], 
                "label": "Gender"
              }, 
              "Y": {
                "data": [
                  100, 
                  200, 
                  150, 
                  20, 
                  220, 
                  111, 
                  156, 
                  76, 
                  120, 
                  210
                ], 
                "label": "Count"
              }, 
              "name": "Boxplot Test", 
              "title": "Boxplot", 
              "type": "Boxplot"
            }
          }, 
          {
            "gpl_graph": {
              "data": [
                {
                  "source_name": "graphdataset", 
                  "variable_data": {
                    "data": [
                      0, 
                      1, 
                      2, 
                      3, 
                      4, 
                      5, 
                      6, 
                      7, 
                      8, 
                      9, 
                      10, 
                      11, 
                      12, 
                      13, 
                      14, 
                      15
                    ]
                  }, 
                  "variable_name": "dotSource"
                }
              ], 
              "editable": false, 
              "gpl": [
                "SOURCE: s=userSource(id(\"graphdataset\"))", 
                "DATA: dotSource=col(source(s), name(\"dotSource\"))", 
                "COORD: rect(dim(1))", 
                "GUIDE: axis(dim(1), label(\"test data\"))", 
                "GUIDE: text.title(label(\"Simple Dot Plot of test data\"))", 
                "ELEMENT: point.dodge.asymmetric(position(bin.dot(dotSource)))"
              ], 
              "name": "Test of GPL generated chart"
            }
          }, 
          {
            "heading": {
              "items": [
                {
                  "image": {
                    "image_path": "/Applications/IBM SPSS Statistics/Resources/R/lib/R/doc/manual/images/QQ.png", 
                    "name": "Test Image", 
                    "type": "png"
                  }
                }
              ], 
              "label": "My Heading"
            }
          }
        ], 
        "name": "My Custom Output"
      }
    }
    END STATJSON.
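
For the "how to make a table?" question, a much smaller starting point than the full sample may help. Below is a minimal Python sketch that builds a single table item and wraps it in OUTPUT CREATE syntax text. It uses only keys that appear in the sample above (`name`, `title`, `default_cell_format`, `dimensions`, `descendants`, `cells`). The helper names `build_table_item` and `wrap_in_output_create` are hypothetical, a plain string for the title's `value` is an assumption (the sample only shows a string `value` for `caption` and `corner`), and nothing here has been validated against the StatJSON schema.

```python
import json


def build_table_item(name, title, column_dimension, column_labels,
                     row_dimension, row_labels, cells, decimals=3):
    """Return one "table" item, using only keys seen in the sample above."""
    if len(cells) != len(row_labels):
        raise ValueError("one row of cells is needed per row label")
    for row in cells:
        if len(row) != len(column_labels):
            raise ValueError("each row needs one cell per column label")
    return {
        "table": {
            "name": name,
            # Assumption: a plain string is accepted as the title value.
            "title": {"value": title},
            "default_cell_format": {"type": "F", "decimals": decimals},
            "dimensions": {
                "columns": [{"value": column_dimension,
                             "show_label": False,
                             "descendants": list(column_labels)}],
                "rows": [{"value": row_dimension,
                          "descendants": list(row_labels)}],
            },
            "cells": [list(row) for row in cells],
        }
    }


def wrap_in_output_create(items, procedure_name):
    """Wrap StatJSON items in the inline OUTPUT CREATE form shown above."""
    doc = {"procedure": {"name": procedure_name, "items": items}}
    body = json.dumps(doc, indent=2)
    return ("OUTPUT CREATE\n    /SPEC SOURCE=INLINE.\n"
            "BEGIN STATJSON\n" + body + "\nEND STATJSON.\n")


table = build_table_item(
    name="Model Comparisons",
    title="Model Comparisons",
    column_dimension="Statistics",
    column_labels=["Alpha", "Gamma"],
    row_dimension="Kernel",
    row_labels=["Polynomial", "RBF"],
    cells=[[1, 0.5], [0.9, 0.5]],
)
print(wrap_in_output_create([table], "My Custom Output"))
```

The printed text can be pasted into a syntax window. Generating the JSON from a dictionary rather than typing it by hand avoids mismatched brackets and makes the row/column counts easy to check before the syntax ever reaches the processor.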


    ------------------------------
    LOUIS Kittock
    ------------------------------



  • 3.  RE: What is the purpose of OUTPUT CREATE command?

    Posted Thu June 09, 2022 01:56 PM
    Oops, sorry: the example above doesn't work in 28.0.1.1.
    Attached is an example that should work.

    Note that this isn't much different from creating output with Python or R; it just gives you another option.

    ------------------------------
    LOUIS Kittock
    ------------------------------

    Attachment(s)

    zip
    OutputCreate28.sps.zip   2 KB 1 version


  • 4.  RE: What is the purpose of OUTPUT CREATE command?

    Posted Fri June 10, 2022 09:07 AM
    I will also note that StatJSON 2.0 was originally conceived to give extension authors a (much) easier and better-performing way to generate output from extensions. When we realized how useful StatJSON 2.0 is, we decided to expose it more generally through the OUTPUT CREATE command.

    We actually have a new 'XTension' architecture that makes product extensions easier to create and faster to run, and the three new regression procedures in version 29 use it. Each of those new XTensions (Linear Ridge Regression, Linear Lasso Regression, and Linear Elastic Net Regression) internally creates StatJSON 2.0 as its means of generating output.

    ------------------------------
    Curtis Browning
    SPSS Statistics Architect
    ------------------------------



  • 5.  RE: What is the purpose of OUTPUT CREATE command?

    Posted Tue June 14, 2022 06:59 AM
    Hi Louis,

    I tried your example, and it crashed the SPSS Processor every time. I have version 28.0.1.0 (142).

    ------------------------------
    Konrad Gałuszko
    ------------------------------



  • 6.  RE: What is the purpose of OUTPUT CREATE command?

    Posted Tue June 14, 2022 02:08 PM

    I installed 28.0.1.0 (142) and ran the OutputCreate28.sps file with no issues on my Mac. I'll see if I can get it to run on Windows.

    Perhaps skip the first syntax block, which generates a table, and try the rest. The other blocks are less likely to cause a crash.
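
If the table sits in the same STATJSON block as everything else, as it does in the inline sample earlier in this thread, skipping it by hand means editing the JSON. Here is a minimal Python sketch of that step, assuming a single `BEGIN STATJSON` ... `END STATJSON.` block per syntax text. The function name `drop_table_items` is hypothetical, and the layout of the attached OutputCreate28.sps file is not known here, so treat this as an illustration rather than a tested fix for the crash.

```python
import json

BEGIN = "BEGIN STATJSON"
END = "END STATJSON."


def drop_table_items(syntax_text):
    """Return the syntax text with top-level "table" items removed.

    Assumes exactly one BEGIN STATJSON / END STATJSON. block and the
    {"procedure": {"items": [...]}} layout used in the sample above.
    Tables nested inside "heading" items are left untouched.
    """
    start = syntax_text.index(BEGIN) + len(BEGIN)
    stop = syntax_text.index(END, start)
    doc = json.loads(syntax_text[start:stop])
    items = doc["procedure"]["items"]
    doc["procedure"]["items"] = [item for item in items if "table" not in item]
    return (syntax_text[:start] + "\n"
            + json.dumps(doc, indent=2) + "\n"
            + syntax_text[stop:])


sample = """OUTPUT CREATE
    /SPEC SOURCE=INLINE.
BEGIN STATJSON
{"procedure": {"name": "My Custom Output", "items": [
  {"table": {"name": "Model Comparisons"}},
  {"text": {"content": "Some text content", "name": "A Text Title"}}
]}}
END STATJSON.
"""
print(drop_table_items(sample))
```

Running the filtered syntax first, then adding the table back, is one way to narrow down whether the table item is what triggers the crash.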



    ------------------------------
    LOUIS Kittock
    ------------------------------