Weighted Least Squares

Table of Contents

Weighted Least Square is an estimate used in regression situations where the error terms are heteroscedastic or has non constant variance.

To get a better understanding about Weighted Least Squares, lets first see what Ordinary Least Square is and how it differs from Weighted Least Square.

What is Ordinary Least Square(OLS)?

In a simple linear regression model of the form,

where

$y\nocache$ is the independent variable

$x_{i} \nocache$ is the independent variable

$_{\beta_{0}} \nocache$ and $_{\beta_{1}} \nocache$ are the regression coefficients

$\epsilon\nocache$ is the random error or the residual.

The goal is to find a line that best fits the relationship between the outcome variable $y\nocache$ and the input variable $x_{i} \nocache$ . With OLS, the linear regression model finds the line through these points such that the sum of the squares of the difference between the actual and predicted values is minimum.

i.e., to find $_{\beta_{0}} \nocache$ and $_{\beta_{1}} \nocache$ such that

$\sum_{i=1}^{n}[y_{i} - (\beta_{0} + \beta_{1}x_{i})]^2\nocache$

is minimum.

In such linear regression models, the OLS assumes that the error terms or the residuals (the difference between actual and predicted values) are normally distributed with mean zero and constant variance. This constant variance condition is called homoscedasticity.

If this assumption of homoscedasticity does not hold, the various inferences made with this model might not be true.

To check for constant variance across all $y\nocache$ values along the regression line, a simple plot of the residuals and the fitted outcome values and the histogram of residuals such as below can be used.

In an ideal case with normally distributed error terms with mean zero and constant variance , the plots should look like this.

Residuals vs Fitted Values Plot

histogram of residuals

From the above plots its clearly seen that the error terms are evenly distributed on both sides of the reference zero line proving that they are normally distributed with mean=0 and has constant variance.

The histogram of the residuals also seems to have datapoints symmetric on both sides proving the normality assumption.

In some cases, the variance of the error terms might be heteroscedastic, i.e., there might be changes in the variance of the error terms with increase/decrease in predictor variable.

In those cases of non-constant variance Weighted Least Squares (WLS) can be used as a measure to estimate the outcomes of a linear regression model.

Now let’s see in detail about WLS and how it differs from OLS.

#python

What is Ordinary Least Square(OLS)?

honingds.com

Weighted Least Squares