Kernel Regression from Scratch in Python

Every beginner in machine learning starts by studying what regression means and how the linear regression algorithm works. Its ease of understanding, its explainability and its many effective real-world use cases are what make the algorithm so popular. However, there are some situations for which linear regression is not suited. In this article, we will see what these situations are, what the kernel regression algorithm is and how it fits into the picture. Finally, we will code the kernel regression algorithm with a Gaussian kernel from scratch. Basic knowledge of Python and numpy is required to follow the article.

Brief Recap on Linear Regression

Given data in the form of _N_ feature vectors _x_ = [_x_₁, _x_₂, …, _x_ₙ], each consisting of _n_ features, and the corresponding label vector _y_, linear regression tries to fit a line (a hyperplane when _n_ > 1) that best describes the data. To do so, it looks for the optimal coefficients _c_ᵢ, _i_ ∈ {0, …, _n_}, of the line equation _y_ = _c_₀ + _c_₁_x_₁ + _c_₂_x_₂ + … + _c_ₙ_x_ₙ, usually by gradient descent, with the model accuracy measured by the RMSE metric. The resulting equation is then used to predict the target _y_ₜ for a new, unseen input vector _x_ₜ.
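To make the recap concrete, here is a minimal sketch of the procedure described above: batch gradient descent on the mean squared error, with RMSE as the accuracy metric. The function names (`fit_linear_gd`, `predict_linear`, `rmse`), the learning rate, the iteration count and the toy data are all assumptions made for illustration, not something prescribed by the algorithm.

```python
import numpy as np

def fit_linear_gd(X, y, lr=0.1, n_iters=2000):
    """Fit y ≈ c0 + c1*x1 + ... + cn*xn by batch gradient descent on the MSE."""
    N = X.shape[0]
    Xb = np.hstack([np.ones((N, 1)), X])  # column of ones carries the intercept c0
    c = np.zeros(Xb.shape[1])             # start from all-zero coefficients
    for _ in range(n_iters):
        residual = Xb @ c - y                 # prediction error per sample
        grad = (2.0 / N) * (Xb.T @ residual)  # gradient of the MSE w.r.t. c
        c -= lr * grad
    return c

def predict_linear(c, X):
    """Evaluate the fitted line equation for each row of X."""
    return c[0] + X @ c[1:]

def rmse(y_true, y_pred):
    """Root mean squared error."""
    return np.sqrt(np.mean((y_true - y_pred) ** 2))

# Hypothetical toy data: y = 1 + 2*x1 - 3*x2, without noise
rng = np.random.default_rng(0)
X = rng.uniform(-1, 1, size=(200, 2))
y = 1.0 + 2.0 * X[:, 0] - 3.0 * X[:, 1]

c = fit_linear_gd(X, y)
print("coefficients:", c)
print("RMSE:", rmse(y, predict_linear(c, X)))
```

Because the toy data really is linear, the recovered coefficients should end up very close to the ones used to generate it.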

Linear regression is a simple algorithm that cannot model very complex relationships between the features and the target. Mathematically, this is because the model is linear: the equation has degree 1, which means that linear regression will always model a straight line. Indeed, this linearity is the weakness of the linear regression algorithm. Why?

Well, let’s consider a situation where our data doesn’t have the form of a straight line: let’s take data generated using the function _f_(_x_) = _x_³. If we use linear regression to fit a model to this data, we will never get anywhere close to the true cubic function, because the equation whose coefficients we are finding does not have a cubic term! So, for any data not generated using a linear function, linear regression is very likely to underfit. So, what do we do?
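The underfitting can be seen in a few lines of numpy. This is only a sketch: the input range and the number of points are arbitrary choices, and `np.polyfit` with `deg=1` is used here as a convenient least-squares line fit rather than gradient descent.

```python
import numpy as np

# Data generated from the cubic function f(x) = x^3 (range chosen arbitrarily)
x = np.linspace(-3, 3, 61)
y = x ** 3

# Best straight line in the least-squares sense
slope, intercept = np.polyfit(x, y, deg=1)
y_line = slope * x + intercept

# RMSE of the straight line against the true cubic values
rmse_line = np.sqrt(np.mean((y - y_line) ** 2))
print(f"line: y = {slope:.3f} * x + {intercept:.3f}")
print(f"RMSE of the straight-line fit: {rmse_line:.3f}")
```

No choice of slope and intercept brings this error to zero, since the model simply has no term that can bend with the data.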

We can use another type of regression called polynomial regression, which (as the name suggests) tries to find the optimal coefficients of a polynomial equation of degree _n_, with _n_ > 1. However, with polynomial regression another problem arises: as a data analyst, you cannot know in advance what the degree of the equation should be so that the resulting equation fits the data best. This can only be determined by trial and error, which is made more difficult by the fact that, above degree 3, the model built using polynomial regression is difficult to visualize.
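The trial-and-error search for the degree might look like the sketch below. The noisy cubic data, the 60/20 train/validation split and the range of candidate degrees are all assumptions picked for the example; `np.polyfit` and `np.polyval` do the fitting and evaluation.

```python
import numpy as np

# Hypothetical noisy data from a cubic function
rng = np.random.default_rng(42)
x = rng.uniform(-3, 3, size=80)
y = x ** 3 + rng.normal(scale=1.0, size=x.shape)

# Hold out part of the data to compare candidate degrees
x_train, y_train = x[:60], y[:60]
x_val, y_val = x[60:], y[60:]

val_rmse = {}
for degree in range(1, 7):
    coeffs = np.polyfit(x_train, y_train, deg=degree)  # fit on training data
    pred = np.polyval(coeffs, x_val)                   # evaluate on held-out data
    val_rmse[degree] = np.sqrt(np.mean((y_val - pred) ** 2))

# Pick the degree with the lowest validation RMSE
best_degree = min(val_rmse, key=val_rmse.get)
for degree, err in val_rmse.items():
    print(f"degree {degree}: validation RMSE = {err:.3f}")
print("best degree:", best_degree)
```

Every candidate degree needs its own fit and its own evaluation, and the winner depends on the data and the split, which is exactly the guesswork described above.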

This is where kernel regression can come to the rescue!



towardsdatascience.com
