LASSO regression - is alpha = lambda? + other questions

  • 1.  LASSO regression - is alpha = lambda? + other questions

    Posted Mon July 08, 2024 06:56 PM

    Dear community,

    I am currently computing a LASSO regression in SPSS. As this is a new function in version 29, I cannot really find information on lambda. Is alpha in SPSS the same as lambda, i.e. the parameter you adjust to control how much shrinkage you want in the regression analysis?

    Or can I find lambda reported anywhere? Does anyone have a source on alpha in LASSO regression in SPSS?

    As one should (and can), I determined the right "lambda" using k-fold cross-validation. Here is my R code:

    LINEAR_LASSO d61_d_empfaenglichkeit WITH a1_alter_mutter a1_sprache_d___2 income_KOV 
        CB_IPV social_support_mean Barrier_Sub_healthbeliefs PMAD_knowledge_sum recognition_ges 
        previous_and_current_experiences_dichotomisiert 
      /MODE CROSSVALID
      /ALPHA VALUES=0.01 TO 1 BY 0.01 METRIC=LINEAR
      /CRITERIA INTERCEPT=TRUE STANDARDIZE=TRUE TIMER=5.0 NFOLDS=5 STATE=0
      /PARTITION TRAINING=80 HOLDOUT=20.0
      /PRINT COMPARE
      /PLOT MSE R2.

    I also have some other questions, in case someone knows a lot about LASSO: 

    • Can I use LASSO just to choose which predictors I then want to use for a multiple linear or logistic regression?
    • Are 10 predictors enough to calculate a LASSO regression? Or would I need more?

    If you have them, please include sources with your answers! Thank you so much.

    Anna



    ------------------------------
    Anna Büssow
    ------------------------------


  • 2.  RE: LASSO regression - is alpha = lambda? + other questions

    IBM Champion
    Posted Mon July 08, 2024 07:24 PM
    Alpha is the penalty term for the shrinkage, with 0 making the lasso equivalent to OLS.
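
    For what it's worth, what most textbooks (and glmnet) call lambda is what the SPSS lasso procedure labels alpha. A common way to write the objective is below; the scaling of the loss term varies between implementations, so the 1/2n factor is one convention, not necessarily what SPSS uses internally:

        \min_{\beta} \; \frac{1}{2n} \sum_{i=1}^{n} \left( y_i - x_i^\top \beta \right)^2 \;+\; \alpha \sum_{j=1}^{p} |\beta_j|

    At alpha = 0 the penalty vanishes and the fit is ordinary least squares; increasing alpha shrinks the coefficients and drives more of them exactly to zero.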

    That "R code" looks suspiciously like SPSS code :-)

    Using such a tight grid is probably overkill and will increase the runtime quite a lot.  You might start with a coarser grid and, if the results are really sensitive to the alpha choice, refine it once you get into the optimal neighborhood; see the sketch below.
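
    For example, reusing the syntax from your post with only the /ALPHA grid changed (the values here are just an illustration of a coarser first pass):

    LINEAR_LASSO d61_d_empfaenglichkeit WITH a1_alter_mutter a1_sprache_d___2 income_KOV 
        CB_IPV social_support_mean Barrier_Sub_healthbeliefs PMAD_knowledge_sum recognition_ges 
        previous_and_current_experiences_dichotomisiert 
      /MODE CROSSVALID
      /ALPHA VALUES=0.05 TO 1 BY 0.05 METRIC=LINEAR
      /CRITERIA INTERCEPT=TRUE STANDARDIZE=TRUE TIMER=5.0 NFOLDS=5 STATE=0
      /PARTITION TRAINING=80 HOLDOUT=20.0
      /PRINT COMPARE
      /PLOT MSE R2.

    If the best alpha landed near, say, 0.15, a second run with /ALPHA VALUES=0.10 TO 0.20 BY 0.01 would pin it down at a fraction of the cost of the original 100-point grid.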

    Lasso is a good way to do variable selection, but it, like OLS, will struggle with multicollinearity, so don't expect miracles in that situation.

    I use the tactic you suggested of lasso for variable selection followed by traditional regression with the chosen variables.  This is probably better than stepwise regression, but remember that the linear regression on the chosen variables will have some of the same bias as stepwise, because the model was selected based on the data.
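
    A minimal sketch of that two-step workflow, assuming purely for illustration that the lasso retained CB_IPV, social_support_mean, and recognition_ges (substitute whatever predictors actually survive your run):

    * Step 2: ordinary regression on the lasso-selected predictors.
    REGRESSION
      /DEPENDENT d61_d_empfaenglichkeit
      /METHOD=ENTER CB_IPV social_support_mean recognition_ges.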

    You might also want to look at elastic net, which combines the lasso and ridge penalties (a tentative syntax sketch is at the end of this post).

    Also, I like Shapley value regression for variable selection.  It is available in SPSS via the STATS RELIMP extension command, which can be installed via Extensions > Extension Hub.  One of its nice features is that it can show not only importance measures for all the regressors but also how the coefficients vary with model size (number of regressors), so you can see how sensitive the results are to that choice.  Of course, even there, sig values are dubious.  It would be interesting to compare its results with lasso.
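
    On the elastic net suggestion, a tentative sketch: the LINEAR_ELASTIC_NET command name is my assumption based on the LINEAR_LASSO naming pattern, and only subcommands from your lasso syntax are reused here, so check the SPSS 29 Command Syntax Reference for the exact command and its tuning subcommands before relying on this:

    LINEAR_ELASTIC_NET d61_d_empfaenglichkeit WITH a1_alter_mutter a1_sprache_d___2 income_KOV 
        CB_IPV social_support_mean Barrier_Sub_healthbeliefs PMAD_knowledge_sum recognition_ges 
        previous_and_current_experiences_dichotomisiert 
      /MODE CROSSVALID
      /PRINT COMPARE.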

    --