Evaluation metrics & Model Selection in Linear Regression

In this article, we go over the most common evaluation metrics for linear regression, along with model selection strategies.

Residual plots — Before evaluation of a model

We know that linear regression tries to fit the line that produces the smallest difference between predicted and actual values, where these differences are also unbiased. This difference, or error, is known as the residual. (Unbiased means there is no systematic pattern in how the residuals are distributed.)

Residual = actual value - predicted value

e = y - ŷ

It is important to note that, before evaluating our model with metrics like R-squared, we must make use of residual plots.

**Residual plots expose a biased model better than any other evaluation metric. If your residual plots look normal, go ahead and evaluate your model with various metrics.**

Residual plots show the residual values on the y-axis and the predicted values on the x-axis. If your model is biased, you cannot trust the results.

In a residual plot, the errors corresponding to the predicted values must be randomly distributed. If there are any signs of a systematic pattern, then your model is biased.

But what does it mean for errors to be randomly distributed?

One of the assumptions of a linear regression model is that the errors are normally distributed with a mean of zero. In a residual plot, this means the residuals should be scattered around zero across the entire range of predicted values, with no trend and a roughly constant spread. If the residuals are evenly scattered in this way, the model shows no obvious bias, and it makes sense to move on to evaluation metrics.
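To make the idea concrete, here is a minimal sketch in plain Python. It is only an illustration under stated assumptions: the helper names (`fit_line`, `predictions_and_residuals`, `binned_residual_means`, `plot_residuals`) and the toy data are made up for this example, and the plotting helper assumes matplotlib is installed. It fits a straight line to two datasets, computes e = y - ŷ, and then looks at the average residual in the low, middle, and high parts of the predicted range.

```python
from statistics import mean


def fit_line(x, y):
    """Ordinary least squares with one feature; returns (slope, intercept)."""
    x_bar, y_bar = mean(x), mean(y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar


def predictions_and_residuals(x, y):
    """Fit a line, then return predicted values and residuals e = y - y_hat."""
    slope, intercept = fit_line(x, y)
    predicted = [slope * xi + intercept for xi in x]
    residuals = [yi - pi for yi, pi in zip(y, predicted)]
    return predicted, residuals


def binned_residual_means(predicted, residuals, n_bins=3):
    """Mean residual in each slice of the predicted range (low to high)."""
    pairs = sorted(zip(predicted, residuals))
    size = len(pairs) // n_bins
    bins = [pairs[i * size:(i + 1) * size] for i in range(n_bins - 1)]
    bins.append(pairs[(n_bins - 1) * size:])  # last bin takes any remainder
    return [mean(r for _, r in b) for b in bins]


def plot_residuals(predicted, residuals):
    """Residuals on the y-axis against predicted values on the x-axis."""
    import matplotlib.pyplot as plt  # imported here so the rest runs without it
    plt.scatter(predicted, residuals)
    plt.axhline(0, linestyle="--")  # reference line at zero error
    plt.xlabel("Predicted value")
    plt.ylabel("Residual (actual - predicted)")
    plt.show()


x = list(range(9))

# Case 1: roughly linear data (made-up noise), so a line is a reasonable model.
noise = [0.3, -0.2, 0.1, -0.3, 0.2, -0.1, 0.3, -0.2, -0.1]
y_linear = [2 * xi + 1 + n for xi, n in zip(x, noise)]
pred_lin, res_lin = predictions_and_residuals(x, y_linear)

# Case 2: curved data (y = x^2), where a straight line is systematically wrong.
y_curved = [xi ** 2 for xi in x]
pred_cur, res_cur = predictions_and_residuals(x, y_curved)

print("linear data:", binned_residual_means(pred_lin, res_lin))
print("curved data:", binned_residual_means(pred_cur, res_cur))
```

For the curved data, the slice means come out at about 3, -6, and 3: the line under-predicts at both ends and over-predicts in the middle, which is exactly the kind of systematic pattern a residual plot reveals. For the linear data, the slice means stay close to zero. Note that the overall mean of the residuals is zero in both cases, so it is the behaviour across the range of predicted values that matters. Calling `plot_residuals(pred_cur, res_cur)` would show the same U-shape visually.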


towardsdatascience.com
