Global AI and Data Science

Is low correlation (with a target) enough to dismiss a feature from a baseline model?

1. Is low correlation (with a target) enough to dismiss a feature from a baseline model?

0 Like
Marco Aurelio Sánchez Sorondo
Posted Mon November 02, 2020 07:20 AM

I'm wondering if there is any reason to keep a feature from a dataset in order to perform prediction even if it doesn't have a significant correlation to the target.
Has anyone been able to take advantage of such a feature?

Thanks

------------------------------
[Marco] [Sánchez Sorondo]
[UBA]
------------------------------

#GlobalAIandDataScience
#GlobalDataScience
2. RE: Is low correlation (with a target) enough to dismiss a feature from a baseline model?

1 Like
Rick Marcantonio
Posted Tue November 03, 2020 09:11 AM
Edited by System Fri January 20, 2023 04:40 PM

"I'm wondering if there is any reason to keep a feature from a dataset in order to perform prediction even if it doesn't have a significant correlation to the target?"

There might be; it's hard to say, given this level of information. Personally (and assuming that all assumptions for such analyses are met and/or dealt with), if I am looking at some kind of regression model, I am looking not at the Pearson correlation but at the part and partial correlations. I encourage you to take a look at some sources for those if you are unfamiliar with them. For example: https://www.statisticssolutions.com/what-are-zero-order-partial-and-part-correlations

You can obtain these in the REGRESSION procedure of SPSS Statistics by requesting /STATISTICS ZPP (zero-order, part, and partial correlations).
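For readers working outside SPSS, here is a minimal sketch of the three quantities in Python with NumPy. The data is synthetic and the names (`signal`, `nuisance`, `residualize`) are hypothetical, chosen only to illustrate a feature whose zero-order correlation with the target is near zero while its part and partial correlations are large.

```python
import numpy as np

def residualize(values, control):
    """Return what is left of `values` after a least-squares fit on `control` plus an intercept."""
    design = np.column_stack([np.ones(len(control)), control])
    coef, *_ = np.linalg.lstsq(design, values, rcond=None)
    return values - design @ coef

def corr(a, b):
    """Pearson correlation between two 1-D arrays."""
    return np.corrcoef(a, b)[0, 1]

# Synthetic data (an assumption for illustration only).
rng = np.random.default_rng(0)
n = 5000
signal = rng.normal(size=n)
nuisance = rng.normal(size=n)

y = signal + 0.1 * rng.normal(size=n)  # target depends on the signal only
x1 = signal + nuisance                 # feature contaminated by the nuisance
x2 = nuisance                          # feature unrelated to y on its own

zero_order = corr(x2, y)                                 # plain Pearson correlation
part = corr(y, residualize(x2, x1))                      # x1 removed from x2 only
partial = corr(residualize(y, x1), residualize(x2, x1))  # x1 removed from both

print(f"zero-order: {zero_order:.3f}")
print(f"part:       {part:.3f}")
print(f"partial:    {partial:.3f}")
```

In this setup `x2` looks useless by its Pearson correlation alone, yet it helps a model that also contains `x1`, because it lets the model subtract the nuisance out of `x1`.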

Also, consider the role of the variable in your model, for example whether it is a mediating or moderating variable. See sources like https://www.statisticshowto.com/mediator-variable/

Apart from these, sometimes it's interesting to know which variables are NOT related in the way I expected them to be, at least in my data sample.

Rick M

------------------------------
Rick Marcantonio
Quality Assurance
IBM
IL
------------------------------

3. RE: Is low correlation (with a target) enough to dismiss a feature from a baseline model?

0 Like
Marco Aurelio Sánchez Sorondo
Posted Thu November 05, 2020 09:41 PM

Interesting... I didn't know those correlations existed. I'm not much of an SPSS guy, but R seems to have functions that compute them. Good tools for making the analysis more complete.

------------------------------
[Marco] [Sánchez Sorondo]
[UBA]
------------------------------

4. RE: Is low correlation (with a target) enough to dismiss a feature from a baseline model?

1 Like
Bahareh Atoufi
Posted Tue November 03, 2020 05:14 PM

Hi Marco,
It depends on the nature of the data and the feature you're looking at. I've worked with time series where a feature was a good predictor because of some of its time-domain or frequency-domain attributes. Correlation only shows the linear dependency between two variables, so a non-linear dependency will be missed if you look only at correlation. Looking at entropy, mutual information, etc. can reveal such dependencies.

Also, there might be components inside that feature that carry useful information about the target. Doing a PCA, looking at each component separately, and getting rid of the less important components might help. In general, it depends on what kind of data you're looking at.
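As a rough illustration of the non-linear case, the sketch below compares Pearson correlation with a mutual information estimate from scikit-learn (`mutual_info_regression`). The quadratic relationship and the noise level are assumptions made up for this example, not something taken from a real dataset.

```python
import numpy as np
from sklearn.feature_selection import mutual_info_regression

# Synthetic data (assumed): y depends on x only through x squared.
rng = np.random.default_rng(0)
x = rng.uniform(-1, 1, size=2000)
y = x**2 + rng.normal(scale=0.05, size=2000)

# Pearson correlation measures linear dependency only.
pearson = np.corrcoef(x, y)[0, 1]

# Mutual information is estimated with a nearest-neighbour method
# and can pick up non-linear dependency as well.
mi = mutual_info_regression(x.reshape(-1, 1), y, random_state=0)[0]

print(f"Pearson correlation: {pearson:.3f}")
print(f"Mutual information:  {mi:.3f}")
```

Here the Pearson correlation comes out close to zero even though `y` is almost fully determined by `x`, while the mutual information estimate is clearly positive.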

------------------------------
Bahareh Atoufi
------------------------------

5. RE: Is low correlation (with a target) enough to dismiss a feature from a baseline model?

1 Like
Paweł Niklewicz
Posted Wed November 04, 2020 07:48 AM

Hi Marco,

Sometimes the relationship can be non-linear. So, before you reject this feature, just check a transformation such as log(x) or something similar.
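A quick way to try this is to compare the Pearson correlation before and after the transformation. The sketch below uses synthetic, heavily skewed data (an assumption for illustration) where the target depends on log(x).

```python
import numpy as np

# Synthetic data (assumed): x is strongly right-skewed, y is linear in log(x).
rng = np.random.default_rng(0)
x = rng.lognormal(mean=0.0, sigma=2.0, size=2000)
y = np.log(x) + rng.normal(scale=0.5, size=2000)

raw_r = np.corrcoef(x, y)[0, 1]          # correlation with the raw feature
log_r = np.corrcoef(np.log(x), y)[0, 1]  # correlation after the log transform

print(f"Pearson r with x:      {raw_r:.3f}")
print(f"Pearson r with log(x): {log_r:.3f}")
```

The log transform only applies to strictly positive values; for data with zeros, something like `np.log1p` is a common alternative.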

best regards

Pawel

------------------------------
Paweł Niklewicz
------------------------------

6. RE: Is low correlation (with a target) enough to dismiss a feature from a baseline model?

0 Like
Marco Aurelio Sánchez Sorondo
Posted Fri November 06, 2020 07:24 AM

Do you mean the entire feature set or just one feature?

------------------------------
[Marco] [Sánchez Sorondo]
[UBA]
------------------------------

7. RE: Is low correlation (with a target) enough to dismiss a feature from a baseline model?

0 Like
Matthias Jungbauer
Posted Mon November 09, 2020 05:45 AM

What do you mean by low correlation?
If you are thinking about dismissing a feature, what made you add the feature to the model in the first place?

------------------------------
Matthias Jungbauer
------------------------------

8. RE: Is low correlation (with a target) enough to dismiss a feature from a baseline model?

0 Like
Marco Aurelio Sánchez Sorondo
Posted Mon November 09, 2020 12:04 PM

That's the question: Should I add the feature or not?
By low correlation I mean a low Pearson correlation between the feature and the target.

------------------------------
[Marco] [Sánchez Sorondo]
[UBA]
------------------------------

9. RE: Is low correlation (with a target) enough to dismiss a feature from a baseline model?

0 Like
Matthias Jungbauer
Posted Tue November 10, 2020 05:53 AM

In my humble view, low correlation alone is not enough to dismiss a feature.

------------------------------
Matthias Jungbauer
------------------------------

10. RE: Is low correlation (with a target) enough to dismiss a feature from a baseline model?

1 Like
Franco Yair Benko
Posted Tue November 10, 2020 12:51 PM

Hello Marco,

I hope you are well. Regarding your question: a low linear correlation coefficient (Pearson coefficient) is not enough information to drop a feature in the modeling phase.

(Working with Python.)

If you are facing a regression problem, one good approach is to fit an OLS model from the statsmodels package with all the continuous features and review the summary() report of the fitted model.

link:
model - https://www.statsmodels.org/dev/generated/statsmodels.regression.linear_model.OLS.html
scores - https://www.statsmodels.org/dev/generated/statsmodels.regression.linear_model.OLS.score.html#statsmodels.regression.linear_model.OLS.score

On the other hand, if you are facing a classification problem with categorical features, Random Forest has a `feature_importances_` attribute that can help.

links:
model - https://scikit-learn.org/stable/modules/generated/sklearn.ensemble.RandomForestClassifier.html#sklearn.ensemble.RandomForestClassifier
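A small sketch of reading `feature_importances_` from a fitted `RandomForestClassifier`. The data is synthetic and, as a simplifying assumption, uses continuous features with an XOR-style target, so that each informative feature has almost no Pearson correlation with the label on its own.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

# Synthetic data (assumed): the label depends on the signs of features 0 and 1
# jointly; feature 2 is pure noise.
rng = np.random.default_rng(0)
n = 2000
X = rng.normal(size=(n, 3))
y = ((X[:, 0] > 0) ^ (X[:, 1] > 0)).astype(int)

# Individually, feature 0 is nearly uncorrelated with the label.
r0 = np.corrcoef(X[:, 0], y)[0, 1]
print(f"Pearson r between feature 0 and label: {r0:.3f}")

forest = RandomForestClassifier(n_estimators=200, random_state=0).fit(X, y)

# Impurity-based importances, normalised to sum to 1.
importances = forest.feature_importances_
for i, value in enumerate(importances):
    print(f"feature {i}: importance {value:.3f}")
```

Impurity-based importances can be biased toward high-cardinality features, so `sklearn.inspection.permutation_importance` on held-out data is worth considering as a cross-check.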

A good place to find more information about correlation:
http://campus.murraystate.edu/academic/faculty/cmecklin/STA565/_book/correlations-multiple-and-partial.html

------------------------------
Franco Yair Benko
------------------------------

11. RE: Is low correlation (with a target) enough to dismiss a feature from a baseline model?

0 Like
Marco Aurelio Sánchez Sorondo
Posted Tue November 10, 2020 08:39 PM

Thanks! Good opportunity to try statsmodels as well... Never used it.

------------------------------
[Marco] [Sánchez Sorondo]
[UBA]
------------------------------
