Linear Regression is a machine learning algorithm used to predict a quantitative target from independent variables that are modeled linearly, by fitting a line, a plane, or a hyperplane through the data. For simplicity, let’s call this the best-fit line. Points from the training data don’t usually lie exactly on the best-fit line, and that makes perfect sense: no real data is perfect. That is why we make predictions in the first place, instead of just plotting a random line.
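To make this concrete, here is a minimal sketch of that idea. The data is synthetic and the use of scikit-learn is my choice, not something from the original text; the point is only that the fitted line has nonzero residuals on its own training points:

```python
# A fitted regression line does not pass exactly through noisy training points.
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)
X = rng.uniform(0, 10, size=(50, 1))                    # one independent variable
y = 3.0 * X.ravel() + 5.0 + rng.normal(0, 2, size=50)   # true line plus noise

model = LinearRegression().fit(X, y)
residuals = y - model.predict(X)

print(f"intercept={model.intercept_:.2f}, slope={model.coef_[0]:.2f}")
print(f"max |residual| = {np.abs(residuals).max():.2f}")  # > 0: points sit off the line
```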

Understanding Bias


The linear regression line cannot curve to pass through every training point, and so at times it fails to capture the true relationship accurately. This error is called bias. In mathematical terms, the intercept in the linear regression equation acts as the bias term.

Why do I say that?

Let me explain. Here’s a generic linear regression equation:

y = Intercept + Slope1·x1 + Slope2·x2

The target (y) has actual values in the dataset, and the equation above computes the predicted values for it. If the Intercept itself is very large and comes close to the predicted y values on its own, then the contribution of the other two parts of our equation, the independent variables x1 and x2, must be small. This means that the amount of variance explained by x1 and x2 would be low, which eventually produces an underfitting model. An underfitting model has a low R-squared (the proportion of variance in the target that is explained by the independent variables).
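A hedged sketch of that R-squared idea follows. The data is synthetic (my assumption, not from the article): the target depends on x in a way a straight line cannot capture, so the fitted line leans on its intercept and explains little variance:

```python
# Fit a straight line to clearly non-linear data and observe the low R-squared.
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(1)
x = rng.uniform(-3, 3, size=(200, 1))
y = x.ravel() ** 2 + rng.normal(0, 0.5, size=200)   # quadratic target

line = LinearRegression().fit(x, y)
print(f"R^2 on training data: {line.score(x, y):.3f}")  # near 0: little variance explained
print(f"intercept: {line.intercept_:.2f}")              # intercept does most of the work
```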

Underfitting can also be understood by **thinking about how the best-fit line/plane is found in the first place.** The best-fit line/plane captures the relationship between the target and the independent variables. If this relationship is captured to a very high extent, the bias is low, and vice versa.

Now that we understand what bias is, and how high bias causes an underfitting model, it becomes clear that for a robust model we need to reduce this underfitting.

In a scenario where we create a curve that passes through every data point and fully captures the relationship between the independent variables and the dependent variable in the training set, there would be no bias in the model.
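One way to see such a zero-bias curve in practice: a polynomial of degree n−1 passes exactly through n training points. This is a sketch on synthetic data (the NumPy approach and the numbers are my assumptions):

```python
# A curve with zero bias on the training set: a degree (n-1) polynomial
# passes exactly through all n training points.
import numpy as np

rng = np.random.default_rng(2)
x = np.sort(rng.uniform(0, 3, size=6))
y = 2 * x + rng.normal(0, 1, size=6)       # noisy linear data

coeffs = np.polyfit(x, y, deg=len(x) - 1)  # degree 5 through 6 points
fitted = np.polyval(coeffs, x)
print(f"max training error: {np.abs(y - fitted).max():.1e}")  # effectively zero
```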

Understanding Variance


**A model that has overfit the training data gives rise to a second source of error, called “variance.”** Time to consider two models:

Model 1: _High bias (unable to capture the relationship properly)_

Model 2: _Low bias (captures the relationship to a very high extent)_

Error measurement while validating a model:

Error = Actual Value − Predicted Value

Calculating the errors on the training data (test data is not in the picture yet), we observe the following:

Model 1: _Validation on the training data shows that the errors are high._

Model 2: _Validation on the training data shows that the errors are low._

Model 2 looks like the clear winner so far, but the picture changes once test data enters. A low-bias model that has also fit the noise in the training set can produce much larger errors on new data, and this sensitivity to the particular training set is what we call variance; the sketch below contrasts the two models.
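In this sketch, the data is synthetic quadratic data and the model choices are mine, not the author’s: Model 1 is a straight line (high bias) and Model 2 is a high-degree polynomial (low bias). Model 2 typically shows a much larger gap between training and test error, which is the variance at work:

```python
# Contrast Model 1 (straight line, high bias) with Model 2 (high-degree
# polynomial, low bias) on training versus test errors.
import numpy as np

rng = np.random.default_rng(3)

def make_data(n):
    x = rng.uniform(-3, 3, size=n)
    return x, x ** 2 + rng.normal(0, 0.5, size=n)

x_train, y_train = make_data(15)
x_test, y_test = make_data(200)

for name, degree in [("Model 1 (high bias)", 1), ("Model 2 (low bias)", 9)]:
    coeffs = np.polyfit(x_train, y_train, deg=degree)
    train_mse = np.mean((y_train - np.polyval(coeffs, x_train)) ** 2)
    test_mse = np.mean((y_test - np.polyval(coeffs, x_test)) ** 2)
    # Model 2 usually shows a large train/test gap: that gap is the variance.
    print(f"{name}: train MSE={train_mse:.3f}, test MSE={test_mse:.3f}")
```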

