SPSS Statistics

  • 1.  ADP Prepare Data for Modelling Outlier Formula

    Posted Wed May 08, 2024 09:15 PM

    Does anyone know how the SPSS Automated Data Preparation (ADP) module recalculates the mean and standard deviation before it applies the standard deviation cutoff value indicated? The generated syntax uses a mean and standard deviation that differ from the variable's actual overall mean and standard deviation before it adds and subtracts the cutoff multiple of the standard deviation. We don't know how that mean and standard deviation are calculated, and we need to explain this to a client. Any help would be great, or please direct me to someone who may know.

    With its lower mean and standard deviation, the syntax looks as if some trimming or outlier detection is applied even before the standard deviation rule. See the code below, where the values used are not the original overall mean and standard deviation for the variable.

    *Interactive Data Preparation.
    COMPUTE #MBR_SAV_SUMMARY_TOTAL_TRXN_CREDIT_60DAYS_AMT_Outlier = $SYSMIS.
    DO IF (MBR_SAV_SUMMARY_TOTAL_TRXN_CREDIT_60DAYS_AMT < 1617.62946991648-3508.74610065027*3).
    COMPUTE #MBR_SAV_SUMMARY_TOTAL_TRXN_CREDIT_60DAYS_AMT_Outlier = 1617.62946991648-3508.74610065027*3.
    ELSE IF (MBR_SAV_SUMMARY_TOTAL_TRXN_CREDIT_60DAYS_AMT > 1617.62946991648+3508.74610065027*3).
    COMPUTE #MBR_SAV_SUMMARY_TOTAL_TRXN_CREDIT_60DAYS_AMT_Outlier = 1617.62946991648+3508.74610065027*3.
    ELSE.
    COMPUTE #MBR_SAV_SUMMARY_TOTAL_TRXN_CREDIT_60DAYS_AMT_Outlier = MBR_SAV_SUMMARY_TOTAL_TRXN_CREDIT_60DAYS_AMT.
    END IF.
    COMPUTE MBR_SAV_SUMMARY_TOTAL_TRXN_CREDIT_60DAYS_AMT_transformedTest = (((1/3176.06783869562)*(#MBR_SAV_SUMMARY_TOTAL_TRXN_CREDIT_60DAYS_AMT_Outlier-1669.24263701691))+0).
    VARIABLE ROLE
      /NONE MBR_SAV_SUMMARY_TOTAL_TRXN_CREDIT_60DAYS_AMT
      /INPUT MBR_SAV_SUMMARY_TOTAL_TRXN_CREDIT_60DAYS_AMT_transformedTest.
    EXECUTE.
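
    To make the arithmetic in the generated syntax easier to explain, here is a rough Python sketch of what it appears to do: clamp the value to mean ± 3 standard deviations using one mean/SD pair, then rescale the clamped value with a second, different mean/SD pair. The constants are copied from the syntax above; the names (CUTOFF_MEAN, SCALE_MEAN, adp_transform, and so on) are made up for illustration, and the sketch says nothing about how ADP arrives at either pair of statistics.

```python
# Constants copied from the generated syntax in the post above.
# All names here are hypothetical labels, not SPSS identifiers.
CUTOFF_MEAN = 1617.62946991648  # mean used to build the outlier cutoffs
CUTOFF_SD = 3508.74610065027    # SD used to build the outlier cutoffs
N_SD = 3                        # cutoff multiplier (the "*3" in the syntax)
SCALE_MEAN = 1669.24263701691   # mean subtracted in the final COMPUTE
SCALE_SD = 3176.06783869562     # SD divided by in the final COMPUTE

LOWER = CUTOFF_MEAN - CUTOFF_SD * N_SD
UPPER = CUTOFF_MEAN + CUTOFF_SD * N_SD


def adp_transform(x):
    """Mirror the DO IF / COMPUTE logic for a single value.

    None stands in for a system-missing value, which the syntax
    leaves as missing.
    """
    if x is None:
        return None
    # Values beyond a cutoff are replaced by that cutoff.
    clamped = min(max(x, LOWER), UPPER)
    # Rescale with the second mean/SD pair, as in the final COMPUTE.
    return (1 / SCALE_SD) * (clamped - SCALE_MEAN) + 0


if __name__ == "__main__":
    print("lower cutoff:", LOWER)
    print("upper cutoff:", UPPER)
    for value in (0.0, 1669.24263701691, 50000.0):
        print(value, "->", adp_transform(value))
```

    Written out this way, it is clear that two separate sets of statistics are involved: one pair sets the cutoffs and another pair standardizes the result after the extreme values have been pulled in.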



    ------------------------------
    MARTHA REA
    ------------------------------



  • 2.  RE: ADP Prepare Data for Modelling Outlier Formula

    Posted Wed May 08, 2024 09:43 PM

    The calculations would depend on the data settings.  If you are using ADP, you can find the details in the Algorithms Manual under Automated Data Preparation Algorithms.  The manual is available via Help > Doc in PDF Format.

    I presume that the interactive mode for data prep, which doesn't use ADP directly, would use equivalent code.

    I should point out that there is also an extension command, STATS PREPROCESS (Data > Preprocess Variables), which you can install via Extensions > Extension Hub, that provides a number of related capabilities for standardization and distribution adjustment.

    --