SPSS Statistics

Meaning / Definition / Description of Summary Statistics (Custom Tables)

  • 1.  Meaning / Definition / Description of Summary Statistics (Custom Tables)

    Posted Mon January 30, 2023 09:58 AM
    Hi all,

    Is there an explanation of the various summary statistics and how they are calculated? I checked the support documentation ("Summary Statistics for Categorical Variables" in IBM Documentation), but apart from the high-level groups such as Count, Row, and Column, it doesn't explain how each statistic is calculated or what it reports.

    For example, "Row Total N %" for a variable:

                                  Count   Row Total N %   Row N %
    Do you like apples?   No          4          100.0%    100.0%
                          Yes        15          100.0%    100.0%

    But if I change the Category Position for the custom table, without making any other changes, I get a different "Row Total N %". I'm not sure how it was calculated or how it differs from "Row N %":

                                    No                               Yes
                          Count   Row Total N %   Row N %   Count   Row Total N %   Row N %
    Do you like apples?       4           16.7%     21.1%      15           62.5%     78.9%

    Is anyone able to point me in the right direction?
    Thanks,
    Michael

    ------------------------------
    Michael
    ------------------------------


  • 2.  RE: Meaning / Definition / Description of Summary Statistics (Custom Tables)

    Posted Mon January 30, 2023 12:21 PM
    Hi. No, there is nothing in the algorithms document, but the manual does provide some direction.

    I tried to reproduce your example:
    * Weighted example data: 4 No, 15 Yes, and 5 cases coded 3 (declared user-missing below).
    data list free /like_apples n.
    begin data.
    1 4
    2 15
    3 5
    end data.
    weight by n.
    missing values like_apples (3).
    value labels like_apples 1 "No" 2 "Yes" 3 "Missing".

    * Table 1: categories displayed in rows.
    CTABLES
    /VLABELS VARIABLES=like_apples DISPLAY=LABEL
    /TABLE like_apples [COUNT F40.0, ROWPCT.COUNT PCT40.1, ROWPCT.TOTALN PCT40.1]
    /CATEGORIES VARIABLES=like_apples ORDER=A KEY=VALUE EMPTY=EXCLUDE
    /CRITERIA CILEVEL=95.

    * Table 2: same table, with the category labels moved to columns by CLABELS.
    CTABLES
    /VLABELS VARIABLES=like_apples DISPLAY=LABEL
    /TABLE like_apples [COUNT F40.0, ROWPCT.COUNT PCT40.1, ROWPCT.TOTALN PCT40.1]
    /CLABELS ROWLABELS=OPPOSITE
    /CATEGORIES VARIABLES=like_apples ORDER=A KEY=VALUE EMPTY=EXCLUDE
    /CRITERIA CILEVEL=95.

    The Command Syntax Reference tells you what the statistics are:
    ROWPCT.COUNT: Row percentage based on cell counts. Computed within subtable.
    ROWPCT.TOTALN: Row percentage based on total count, including user-missing and system-missing values.

    So, if you have missing data for this variable (which you do), statistics will be based on different N's.

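    The only difference between the two commands is /CLABELS ROWLABELS=OPPOSITE.
    You can find that subcommand in the CSR, too; it moves the row categories into separate columns, so both categories end up in a single row.
    With the categories in rows, each row contains only one cell, so that cell is 100% of its row. With the categories in columns, the row spans both categories: Row N % is 15/19 = 78.9% (4/19 = 21.1% for "No"), and Row Total N % is 15/24 = 62.5% (4/24 = 16.7% for "No"), because the 5 missing cases are included in the denominator.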
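    The arithmetic above can be checked outside SPSS. The following is a minimal Python sketch, not SPSS's own algorithm: the counts come from the example data, while the names `counts`, `missing`, and `pct` are made up for illustration, and the row bases are simply the denominators described in this post.

```python
# Weighted counts from the example data in this thread.
counts = {"No": 4, "Yes": 15}   # valid categories
missing = 5                     # cases coded 3, declared user-missing

valid_n = sum(counts.values())  # 19: base for Row N % (ROWPCT.COUNT)
total_n = valid_n + missing     # 24: base for Row Total N % (ROWPCT.TOTALN)


def pct(numerator, denominator):
    """Percentage rounded to one decimal, as displayed in the tables."""
    return round(100 * numerator / denominator, 1)


# Categories in columns: one row spans both categories, so the row base
# is the whole variable (valid cases only, or valid plus missing).
columns_layout = {
    label: {"row_n_pct": pct(n, valid_n), "row_total_n_pct": pct(n, total_n)}
    for label, n in counts.items()
}

# Categories in rows: each row holds a single cell, so the cell is its
# own row base (assumption made here to mirror the 100.0% in the output).
rows_layout = {label: {"row_n_pct": pct(n, n)} for label, n in counts.items()}

for label, stats in columns_layout.items():
    print(label, stats)
```

    Under these assumptions the columns layout gives 21.1% and 16.7% for "No" and 78.9% and 62.5% for "Yes", matching the second table, while the rows layout gives 100.0% for every category, matching the first.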

    ------------------------------
    Rick Marcantonio
    Quality Assurance
    IBM
    ------------------------------