SPSS Statistics

 View Only
Expand all | Collapse all

Extracting specific values from SPSS Tables

  • 1.  Extracting specific values from SPSS Tables

    Posted 17 days ago

    Is there a way to extract specific values from SPSS tables to use in other calculations?  For example, when a Descriptives Table is called for in syntax, is there a way to extract a specific value from that table for use in subsequent syntax calculations, such as using values pulled from the IQR, skewness, or kurtosis? 



    ------------------------------
    Mark Sloan
    ------------------------------


  • 2.  RE: Extracting specific values from SPSS Tables

    IBM Champion
    Posted 17 days ago
    There are scripting apis that can pull values from any pivot table and define macros with those values that can be used anywhere in syntax.

    --





  • 3.  RE: Extracting specific values from SPSS Tables

    Posted 17 days ago
    Edited by Mark Sloan 17 days ago

    Thank you for your reply.  Are these external to SPSS or can be accomplished within SPSS?  As I am using (and learning) different programming scripts/languages, if possible, I'd like to extract specific values within SPSS syntax.  If not possible, which scripting apis would you recommend?



    ------------------------------
    Mark Sloan
    ------------------------------



  • 4.  RE: Extracting specific values from SPSS Tables

    IBM Champion
    Posted 17 days ago
    SPSS does still provide scripting apis in SaxBasic VBA, although that technology is pretty much obsolete.  It is documented in the SaxBasic script editor.  Everything those can do and much more are provided in Python apis

    In general, though, there are apis that can read cells from pivot tables and run syntax crafted via Python or SaxBasic code.  There is a reference document available via Help > Docs in PDF Format for all the Python apis.  There is a downloadable book at that same location that teaches basic Python usage within SPSS.  The simplest approach would be to use OMS to capture the necessary table as an SPSS dataset and then use Python or VBA code to read those values and generate SPSS macros holding the values for use in SPSS syntax.  The macro language is documented in the Command Syntax Reference - see the DEFINE command.

    --





  • 5.  RE: Extracting specific values from SPSS Tables

    Posted 16 days ago

    Thank you for your advice.  I will check out Python per your guidance, and review the documentation.

    Thank you!

    Mark

    Georgia Tech



    ------------------------------
    Mark Sloan
    ------------------------------



  • 6.  RE: Extracting specific values from SPSS Tables

    Posted 16 days ago

    I reviewed Help > Docs in PDF Format and found one Python Reference Guide for IBM SPSS Statistics listed twice.  Is this enough to use Python in SPSS?

    To help clarify all of this and another thread, here is what I hoped to accomplish in SPSS syntax alone, without Excel:

    FREQUENCIES VARIABLES=    ses_RESID_M2                                    
      /FORMAT=NOTABLE                                        
      /NTILES=4                                        
      /ORDER=ANALYSIS.                

    ----> Get 25th quartile value (VAR_Q1) and 75th quartile value (VAR_Q3) and compute this:

    VAR_IQR = VAR_Q3 MINUS VAR_Q1

    FAROUT = VAR_IQR x 3.0

    UPPEROUTERFENCE = VAR_Q3 PLUS FAROUT

    LOWEROUTERFENCE = VAR_Q1 MINUS FAROUT

    ----> Now do this:

    COMPUTE ses_RESID_M2OUT = ses_RESID_M2.

    EXECUTE.

    RECODE ses_RESID_M2OUT (lo thru LOWEROUTERFENCE = 999999999).

    ADD VALUE LABELS ses_RESID_M2OUT 999999999 '(Recoded low extreme values)'.

    MISSING VALUES ses_RESID_M2OUT (UPPEROUTERFENCE thru hi).

    ----> I think this is about it!  Could there be a way to do this in SPSS alone?  I will continue to review the Python documentation, but if there are any SPSS syntax capabilities to do the above it would be greatly appreciated!

    Thank you,

    Mark Sloan

    PhD Candidate at Georgia Tech



    ------------------------------
    Mark Sloan
    ------------------------------



  • 7.  RE: Extracting specific values from SPSS Tables

    IBM Champion
    Posted 16 days ago
    I have an example syntax file that calculates quantiles and defines macros that are then used in regular SPSS code.  Send me an email (jkpeck@gmail.com), and I'll reply with the code.

    Usage notes:
    Your main dataset is assumed to have dataset name main.
    You run the first begin program block just once and then run the second one for each variable as needed.

    Following that is an example of a SELECT IF that uses the macros defined in step 2.

    The expository manual I was referring to is here.



    --





  • 8.  RE: Extracting specific values from SPSS Tables

    Posted 16 days ago

    I just sent you an email.  I'll check it out pronto!  Thank you for your help.



    ------------------------------
    Mark Sloan
    ------------------------------



  • 9.  RE: Extracting specific values from SPSS Tables

    Posted 16 days ago

    Thank you for all your help today.  I'm almost there.  I successfully created upper and lower IQR fence variables for my Level-1 and Level-2 residuals, and they appear to have calculated correctly.  My last hurdle is how to use their single fence values (the same value is listed 6,000 times for each new variable) to work in my missing value syntax.  Here is what I want:

    MISSING VALUES ses_RESID_M2 (L1_RES_UPOUT thru hi).

    ...where ses_RESID_M2 is the residual variable that I'm trying to remove outliers from, and L1_RES_UPOUT is the upper IQR fence for the Level-1 residuals. 

    I also tried the following in Python to see if I could create a macro for a new single value variable based on the new variable just created:

    begin program python3.

    spss.SetMacroValue("!L1resUPOUT", L1_RES_UPOUT)

    end program.

    MISSING VALUES ses_RESID_M2 (!L1resUPOUT thru hi).

    I'm still not sure how to get that single value out of Python that would make the MISSING VALUES syntax successful.  The dataset name was changed to MAIN previously, so that shouldn't be the problem.  Thank you for your help on what is probably a very simple fix.



    ------------------------------
    Mark Sloan
    ------------------------------



  • 10.  RE: Extracting specific values from SPSS Tables

    IBM Champion
    Posted 16 days ago
    The code I sent you included defining a macro named !q75 for the 75th percentile so you can just use that in the MISSING VALUES statement
    MISSING VALUES ses_RESID_M2 (!q75 thru hi).
    if I understand correctly what you want.

    --





  • 11.  RE: Extracting specific values from SPSS Tables

    Posted 16 days ago

    Thanks! While the !q75 value itself would not work, would it be possible to just substitute the formula for the upper fence in its place: (!q75+(!iqr*3.0) thru hi)?



    ------------------------------
    Mark Sloan
    ------------------------------



  • 12.  RE: Extracting specific values from SPSS Tables

    IBM Champion
    Posted 16 days ago
    No, but you could add a macro definition for that:
    spss.SetMacroValue("!xyz", qs[3][0] +qs[4][0]-qs[2][0]) * 3)


    --





  • 13.  RE: Extracting specific values from SPSS Tables

    Posted 16 days ago

    Success!  Doing the math within Python did the trick.  I am now able to successfully create IQR upper and lower fences in SPSS.

    One caveat is that there seems to be different precision between taking SPSS IQR values into Excel and calculating upper/lower fences, and using the Python-macro method.  For example, where the SPSS--->Excel method produced 5.58225, the Python--->SPSS method produced 5.58223.  Another example:  SPSS-->Excel = -5.39662, while Python--->SPSS = -5.39660.  Maybe not a big deal but could lead to capturing/not-capturing a certain case here or there.  Which method would you trust more?  

    It would be nice if SPSS syntax could do direct calculations of single-value variables that would eliminate reliance on Excel or Python. 

    Again, thank you for all your help.  For now it appears I'm back on track.



    ------------------------------
    Mark Sloan
    ------------------------------



  • 14.  RE: Extracting specific values from SPSS Tables

    IBM Champion
    Posted 15 days ago
    There are many slightly different ways to define percentiles.  For example, below are the options that EXAMINE offers, so it isn't surprising that SPSS and Excel might use different definitions.


    For each of the following methods of percentile calculation, w is the sum of the weights for all nonmissing cases, p is the specified percentile divided by 100, and X i is the value of the ith case (cases are assumed to be ranked in ascending order). For details on the specific formulas used, see the algorithms documentation included on the installation CD.

    HAVERAGE. Weighted average at X(w + 1)p. The percentile value is the weighted average of X i and X i + 1, where i is the integer part of (w + 1)p. This is the default if PERCENTILES is specified without a keyword.

    WAVERAGE. Weighted average at Xwp. The percentile value is the weighted average of X i and X (i + 1), where i is the integer portion of wp.

    ROUNDObservation closest to wp. The percentile value is X i or X i + 1, depending upon whether i or i + 1 is "closer" to wp.

    EMPIRICALEmpirical distribution function. The percentile value is X i, where i is equal to wp rounded up to the next integer.

    AEMPIRICALEmpirical distribution with averaging. This is equivalent to EMPIRICAL, except when i=wp, in which case the percentile value is the average of X i and X i + 1.


    --





  • 15.  RE: Extracting specific values from SPSS Tables

    Posted 15 days ago

    Thank you for the clarification.  I'll stick with what Python produced because the differences are so small.  I was just able to run all 4 sections together as one consolidated syntax in SPSS for the first time without any interruptions and without any errors. On my quite fast computer SPSS accomplished all tasks in 16 seconds.

    I do have one remaining question regarding datasets being named "MAIN" in Python.  SPSS default is to use "Dataset1".  I saw "MAIN" once in the Python code.  Could this be merely changed to "Dataset1" to negate the need to rename the dataset?

    Almost there!



    ------------------------------
    Mark Sloan
    ------------------------------



  • 16.  RE: Extracting specific values from SPSS Tables

    IBM Champion
    Posted 15 days ago
    The dataset name can be anything.  It has to have a name, but for simplicity I just made an assumption.  Just remember that if you open a second dataset in the same session, itcwill getcthe name Dataset2.  The code could be modified to discover and use the actual name.--





  • 17.  RE: Extracting specific values from SPSS Tables

    Posted 14 days ago

    Thank you for all your help!



    ------------------------------
    Mark Sloan
    ------------------------------



  • 18.  RE: Extracting specific values from SPSS Tables

    Posted 16 days ago

    This would be a trivial task if AGGREGATE included functions to return Q1 and Q3. 

    AGGREGATE
      /OUTFILE=* MODE=ADDVARIABLES
      /BREAK=
      /Var_Q1=Q1(Var)
      /Var_Q2=MEDIAN(Var) 
      /Var_Q3=Q3(Var).

    But given the current tendency to farm everything out to Python, I'm not optimistic about that happening any time soon.  :-( 

    PS- One could get a rough approximation by using AGGREGATE to get the median, and then using AGGREGATE twice more to get Q1 as the median of values less than the median and Q3 as the median of the values greater than the median.

    AGGREGATE
      /OUTFILE=* MODE=ADDVARIABLES
      /BREAK=
      /Var_Q2=MEDIAN(Var).

    * Q1 = median of values < median.
    TEMPORARY.
    SELECT IF Var LT Var_Q2.
    AGGREGATE
      /OUTFILE=* MODE=ADDVARIABLES
      /BREAK=
      /Var_Q1=MEDIAN(Var).

    * Q3 = median of values > median.
    TEMPORARY.
    SELECT IF Var GT Var_Q2.
    AGGREGATE
      /OUTFILE=* MODE=ADDVARIABLES
      /BREAK=
      /Var_Q3=MEDIAN(Var).

    DESCRIPTIVES Var_Q1 Var_Q2 Var_Q3.
    FREQUENCIES VARIABLES= Var                                    
      /FORMAT=NOTABLE                                        
      /NTILES=4                                        
      /ORDER=ANALYSIS. 



    ------------------------------
    Bruce Weaver
    ------------------------------



  • 19.  RE: Extracting specific values from SPSS Tables

    IBM Champion
    Posted 16 days ago
    That was my first thought, but it was pretty easy to accomplish this with a small bit of Python code.  It would be nice to have a general formula option for AGGREGATE, but that procedure is streamlined for the functions already provided, so I suspect it would be hard to add such functionality without seriously affecting performance.  AGGREGATE is very fast and runs without the need to sort the data.

    --