# SPSS Statistics


## Linear Regression and comparison of multiple dependent variables

• #### 1.  Linear Regression and comparison of multiple dependent variables

Posted Thu October 14, 2021 12:58 PM
How would I go about using SPSS to bulk-compare variables to demographics, with the goal of finding variables that are statistically significant (P-value and/or correlation) only for certain subsets of demographics?

For example, say I had variables A, B, C and demographics X, Y, Z. How would I prove that A is relevant to X and Y but not Z, that B is relevant to Y only, that C is relevant to X+Z together but not individually, and so on?

In theory this could be done by running a linear regression on each dependent variable and then comparing the results. That's fine with only three, but I have over 100 variables and ten sets of demographic questions, each with up to ten possible responses, so I really don't want to do this manually if there's any other way.

I suspect part of the method will involve recategorising some of my inputs from scale to ordinal, but what comes after that?

------------------------------
David Stosser
------------------------------

• #### 2.  RE: Linear Regression and comparison of multiple dependent variables

Posted Fri October 15, 2021 08:50 AM
Hello,

You can try calculating the correlation coefficients between the variables, with a computed variable f standing in for the X+Z combination:

compute f = x+z.

DEFINE !mylist () a b c x y z f
!ENDDEFINE.

CORRELATIONS
/VARIABLES=!mylist
/PRINT=TWOTAIL NOSIG FULL
/MISSING=PAIRWISE.

The Output window will then show a matrix of correlation coefficients.
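With many variables the full square matrix gets large. A possible variation, sketched here under the assumption that a, b, c are the outcome variables and x, y, z, f are the demographic variables (with f already computed as above), uses the WITH keyword so that only the outcome-by-demographic pairs are reported:

```spss
* Sketch only: a b c are assumed outcomes and x y z f assumed demographics.
* WITH requests a rectangular matrix: rows a b c, columns x y z f.
CORRELATIONS
  /VARIABLES=a b c WITH x y z f
  /PRINT=TWOTAIL NOSIG
  /MISSING=PAIRWISE.
```

If the outcome variables are adjacent in the data file, a range such as `a TO c` can stand in for the explicit list.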

Best regards,

------------------------------
xq
------------------------------

• #### 3.  RE: Linear Regression and comparison of multiple dependent variables

Posted Fri October 15, 2021 09:19 PM
I don't think a correlation matrix would help much.
I suggest that you try the NAIVE BAYES (Analyze > Classify > Naive Bayes) procedure. It is designed to pick out the best predictors from a large set of possibilities. You can specify candidate factors and candidate covariates, but it works on one dependent variable at a time.

It looks at each potential predictor in isolation, so it can be used with large sets of variables.  You can specify that a portion of the cases should be randomly assigned to a training set.  There are other options as well.
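Because the procedure handles one dependent variable at a time, a macro loop is one way to repeat it over a list of targets. The following is only a sketch: the macro name `!nbeach` and its `dvs` argument are hypothetical, a, b, c are assumed to be categorical targets, x, y, z are assumed to be categorical candidate predictors, and the 70 percent training share is arbitrary. Check the subcommand spellings against the Command Syntax Reference for your version before relying on it.

```spss
* Hypothetical macro: runs NAIVEBAYES once per target named in dvs.
DEFINE !nbeach (dvs = !CMDEND)
!DO !dv !IN (!dvs)
* Candidate factors x y z are assumptions taken from the example.
NAIVEBAYES !dv BY x y z
  /TRAININGSAMPLE PERCENT=70.
!DOEND
!ENDDEFINE.

* Example call with three assumed target variables.
!nbeach dvs = a b c.
```

Each pass produces its own output block, so the selected predictors can be compared across targets afterwards.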

------------------------------
Jon Peck
------------------------------