If your goal is to pick the best small subset of 1000 variables purely on statistical grounds, remember that checking each one individually will produce a lot of false positives. At a 5% significance level, for instance, about 50 of 1000 unrelated variables would pass by chance alone. You are likely to find a set by chance that will not generalize, so you need separate estimation and testing samples, or at least cross-validation. How much data do you have?
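To make the risk concrete, here is a small simulation sketch in plain Python (not SPSS syntax). Everything in it is an assumption for illustration: the function name `screen_noise`, the sample sizes, and the use of absolute Pearson correlation as the screening score. All predictors are pure noise, so any variable that looks good in the estimation half is a false positive.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation of two equal-length numeric lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)


def screen_noise(n_rows=200, n_vars=1000, top_k=10, seed=0):
    """Pick the top_k noise variables on an estimation half, then
    re-score the same variables on a held-out testing half.

    Returns (mean |r| on estimation half, mean |r| on testing half).
    """
    rng = random.Random(seed)
    target = [rng.gauss(0, 1) for _ in range(n_rows)]
    predictors = [[rng.gauss(0, 1) for _ in range(n_rows)]
                  for _ in range(n_vars)]

    half = n_rows // 2
    est, test = slice(0, half), slice(half, n_rows)

    # Score every variable on the estimation half only.
    est_r = [abs(pearson(p[est], target[est])) for p in predictors]
    chosen = sorted(range(n_vars), key=lambda j: est_r[j], reverse=True)[:top_k]

    # Re-score the chosen variables on data they were not selected on.
    test_r = [abs(pearson(predictors[j][test], target[test])) for j in chosen]

    est_mean = sum(est_r[j] for j in chosen) / top_k
    test_mean = sum(test_r) / top_k
    return est_mean, test_mean


if __name__ == "__main__":
    est_mean, test_mean = screen_noise()
    print(f"mean |r| of selected variables, estimation half: {est_mean:.3f}")
    print(f"mean |r| of the same variables, testing half:    {test_mean:.3f}")
```

The selected variables look clearly related to the target on the half they were chosen from, and the apparent relationship shrinks toward zero on the held-out half. The same check applies whatever screening score you use.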
To start, with so many variables, I suggest using the Naive Bayes procedure to pick out a small number of the best individual variables. From there, many statistical procedures are available, depending on the nature of the variables and the amount of data you have.
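On the entropy and mutual information calculation itself: I don't have SPSS syntax to share, but the arithmetic is short enough to sketch in plain Python. This assumes categorical variables (continuous ones would have to be binned first), and the function names and the `columns` dictionary layout are made up for the example. It uses the identity I(X;Y) = H(X) + H(Y) - H(X,Y), with entropies in bits.

```python
import math
from collections import Counter


def entropy(values):
    """Shannon entropy in bits of a list of categorical values."""
    n = len(values)
    return -sum((c / n) * math.log2(c / n) for c in Counter(values).values())


def mutual_information(x, y):
    """Mutual information in bits: H(X) + H(Y) - H(X, Y)."""
    joint = list(zip(x, y))
    return entropy(x) + entropy(y) - entropy(joint)


def rank_by_mutual_information(columns, target):
    """Rank variables by mutual information with the target.

    columns: dict mapping a variable name to its list of values
             (hypothetical layout, chosen for this sketch).
    Returns a list of (name, score) pairs, highest score first.
    """
    scores = {name: mutual_information(col, target)
              for name, col in columns.items()}
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


if __name__ == "__main__":
    target = [0, 0, 1, 1, 0, 0, 1, 1]
    columns = {
        "copy_of_target": [0, 0, 1, 1, 0, 0, 1, 1],
        "constant": ["a"] * 8,
        "partly_related": [0, 0, 1, 0, 0, 0, 1, 1],
    }
    for name, score in rank_by_mutual_information(columns, target):
        print(f"{name}: {score:.3f} bits")
```

Keep in mind that mutual information estimated this way is biased upward in small samples, so a ranking over 1000 variables still needs to be checked on a testing sample.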
Original Message:
Sent: 11/13/2024 5:06:00 PM
From: Kameron Yiu
Subject: Mutual Information or Entropy value calculation
Hi,
Does anyone know if there is any SPSS code that you can share to calculate either "Mutual Information" value or "Entropy" value?
I have a dataset with a target variable and 1000+ independent variables. I would like to find out which independent variables are predictive of the target variable (or rank the independent variables by how predictive they are).
Thanks,
Kameron