Global AI & Data Science
R Squared, the Coefficient of Determination
By Moloy De, posted Thu February 18, 2021 08:19 PM
In statistics, the coefficient of determination, denoted R² or r², is the proportion of the variance in the dependent variable that is predictable from the independent variables.
It is a statistic used in the context of statistical models whose main purpose is either the prediction of future outcomes or the testing of hypotheses, on the basis of other related information. It provides a measure of how well observed outcomes are replicated by the model, based on the proportion of total variation of outcomes explained by the model.
There are several definitions of R² that are only sometimes equivalent. One class of such cases includes that of simple linear regression, where r² is used instead of R². When an intercept is included, r² is simply the square of the sample correlation coefficient between the observed outcomes and the observed predictor values. If additional regressors are included, R² is the square of the coefficient of multiple correlation. In both such cases, the coefficient of determination ranges from 0 to 1.
In all instances where R² is used, the predictors are calculated by ordinary least-squares regression: that is, by minimizing the sum of squared residuals. In this case, R² increases as the number of variables in the model is increased: R² is monotone increasing in the number of variables included and will never decrease. This illustrates a drawback to one possible use of R², where one might keep adding variables to inflate the R² value. For example, if one is trying to predict the sales of a model of car from the car's gas mileage, price, and engine power, one can include such irrelevant factors as the first letter of the model's name or the height of the lead engineer designing the car, because R² will never decrease as variables are added and will probably increase due to chance alone.
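This monotonicity is easy to demonstrate. The sketch below (a toy setup with simulated data, not from the original post) fits OLS with one genuine regressor, then adds a column of pure noise and refits: the R² of the larger model is never smaller.

```python
import numpy as np

def r_squared(X, y):
    """R^2 for an OLS fit of y on the columns of X (intercept included)."""
    X1 = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(X1, y, rcond=None)
    resid = y - X1 @ beta
    return 1 - (resid @ resid) / ((y - y.mean()) @ (y - y.mean()))

rng = np.random.default_rng(1)
n = 40
x_real = rng.normal(size=n)
y = 3 * x_real + rng.normal(size=n)

r2_one = r_squared(x_real[:, None], y)

# Add a completely irrelevant regressor: pure noise.
junk = rng.normal(size=n)
r2_two = r_squared(np.column_stack([x_real, junk]), y)

print(r2_one, r2_two)  # r2_two >= r2_one always holds for nested OLS fits
```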
This leads to the alternative approach of looking at the adjusted R². The interpretation of this statistic is almost the same as that of R², but it penalizes the statistic as extra variables are included in the model. For cases other than fitting by ordinary least squares, the R² statistic can be calculated as above and may still be a useful measure. If fitting is by weighted least squares or generalized least squares, alternative versions of R² can be calculated appropriate to those statistical frameworks, while the "raw" R² may still be useful if it is more easily interpreted. Values for R² can be calculated for any type of predictive model, which need not have a statistical basis.
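One common form of the adjusted statistic (the standard textbook formula, not spelled out in the original post) is adjusted R² = 1 − (1 − R²)(n − 1)/(n − p − 1), where n is the number of observations and p the number of regressors excluding the intercept. A small sketch of the penalty at work:

```python
def adjusted_r2(r2, n, p):
    """Adjusted R^2 for n observations and p regressors (excluding intercept)."""
    return 1 - (1 - r2) * (n - 1) / (n - p - 1)

# With the same raw R^2, the adjustment penalizes the larger model more.
print(adjusted_r2(0.90, n=50, p=2))   # mild penalty
print(adjusted_r2(0.90, n=50, p=10))  # heavier penalty
```

Unlike raw R², this quantity can fall when a variable that adds little explanatory power is included.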
QUESTION I: Could R Squared be negative?
QUESTION II: When analyzing time series data, is it justified to calculate R Squared as the square of the correlation coefficient between Actuals and Predicted?
REFERENCE: Wikipedia
#Featured-area-3
#Featured-area-3-home
#GlobalAIandDataScience
#GlobalDataScience