SPSS Statistics


How can I obtain summary statistics for a logistic regression run on multiply imputed data?

  • 1.  How can I obtain summary statistics for a logistic regression run on multiply imputed data?

    Posted Thu April 15, 2021 12:57 AM

    I have used the multiple imputation analysis in SPSS 27 to create an imputed dataset (5 imputations). I have run a binary logistic regression model on the imputed dataset. However, there are no summary model statistics (such as the -2 log likelihood or the Hosmer & Lemeshow test) or Wald statistics for regression coefficients for the pooled data. Instead, these statistics are presented only for the original data and the 5 separate imputed datasets. How can I calculate or obtain these statistics for the pooled data?








  • 2.  RE: How can I obtain summary statistics for a logistic regression run on multiply imputed data?

    Posted Thu April 29, 2021 03:48 PM

    There is no pooled data with multiple imputation. Pooling is done on statistics calculated from each of the imputed data sets, most commonly model parameters. The primary purpose is to obtain more accurate variance estimates for these estimated parameters. For Binary Logistic Regression (the LOGISTIC REGRESSION procedure), the B coefficients and associated single degree of freedom tests in the Variables in the Equation table are all that is currently pooled.

    In order to use Rubin's rules to pool estimates across analyses on imputed data sets, you have to have point estimates and variance estimates for those point estimates for each analysis. Since there is no variance estimate associated with the -2 log-likelihood statistic, these rules could not be applied. The same is true of the Hosmer-Lemeshow statistic, as far as I can tell. (If anyone reading this knows of a way to adapt Rubin's approach to something like a HL statistic, please enlighten me.)
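
    To make the requirement concrete, here is a minimal Python sketch of the textbook form of Rubin's rules for a single parameter. It is not SPSS's internal code; the function name `pool_rubin` and the example numbers are hypothetical, and the degrees-of-freedom adjustment is left out. Note that it needs both a point estimate and a variance from every analysis, which is exactly what -2 log-likelihood does not supply.

```python
import math


def pool_rubin(estimates, variances):
    """Pool one parameter across m imputed-data analyses (Rubin's rules).

    estimates: point estimates, one per imputed data set (e.g. B coefficients)
    variances: squared standard errors of those estimates, same order
    Returns (pooled_estimate, pooled_standard_error).
    """
    m = len(estimates)
    if m < 2 or m != len(variances):
        raise ValueError("need at least two analyses, each with a variance")

    q_bar = sum(estimates) / m                              # pooled point estimate
    within = sum(variances) / m                             # average within-imputation variance
    between = sum((q - q_bar) ** 2 for q in estimates) / (m - 1)  # between-imputation variance
    total = within + (1 + 1 / m) * between                  # total variance

    return q_bar, math.sqrt(total)


# Hypothetical B coefficients and standard errors from 5 imputed data sets.
b_values = [0.50, 0.55, 0.45, 0.52, 0.48]
se_values = [0.10, 0.10, 0.10, 0.10, 0.10]

pooled_b, pooled_se = pool_rubin(b_values, [se ** 2 for se in se_values])
print(f"pooled B = {pooled_b:.4f}, pooled SE = {pooled_se:.4f}")
```

    The pooled standard error is larger than any single-analysis standard error whenever the estimates differ across imputations, because the between-imputation variance is added to the average within-imputation variance.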

    You could manually apply what we refer to as naive pooling, simply averaging the values for the -2 log-likelihood and the HL statistic over the analyses on imputed data for descriptive purposes. The Output Management System (OMS) features in SPSS Statistics make capturing output values as data fairly simple.
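
    As an illustration of that naive pooling, the sketch below averages a statistic over the imputed-data analyses only, in Python. It assumes the values have already been captured (for example via OMS) into rows keyed by the `Imputation_` split variable, with 0 marking the original data; the column name `neg2LL`, the function `naive_pool`, and all numbers are hypothetical. The result is descriptive only and has no associated test.

```python
from statistics import mean


def naive_pool(rows, stat_key, imputation_key="Imputation_"):
    """Average one statistic over the imputed-data analyses.

    rows: list of dicts, one per analysis
    stat_key: name of the statistic to average (hypothetical column name)
    Rows where the imputation number is 0 (original data) are skipped.
    """
    values = [row[stat_key] for row in rows if row[imputation_key] != 0]
    if not values:
        raise ValueError("no imputed-data rows found")
    return mean(values)


# Hypothetical -2 log-likelihood values: original data plus 5 imputations.
captured = [
    {"Imputation_": 0, "neg2LL": 200.0},
    {"Imputation_": 1, "neg2LL": 210.0},
    {"Imputation_": 2, "neg2LL": 212.0},
    {"Imputation_": 3, "neg2LL": 208.0},
    {"Imputation_": 4, "neg2LL": 211.0},
    {"Imputation_": 5, "neg2LL": 209.0},
]

print(naive_pool(captured, "neg2LL"))
```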





