Linear regression is a statistical technique that lets you predict a quantity for new data by first training a model on a dataset you already have.

Let’s decompose this by taking an example.

Suppose I have the data for the price of a house, given its area.

Now, if I wish to estimate the price of a house that has an area of about 400 units, how can I predict it? This is where linear regression comes to the rescue. We will build a model that learns from the existing dataset and predicts the quantity for a new test point that we enter.

Here, we are mainly considering one feature of the house, i.e. the area, which makes this single-feature regression. If there are more features that determine the quantity to be predicted (the price here), we will have to change our strategy a bit. For this article, we will stick to a single feature.
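To make this concrete, here is a minimal sketch of that workflow using scikit-learn. The area and price numbers below are made-up toy values, purely for illustration, not from any real dataset.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Hypothetical toy data: house areas (single feature) and their prices.
areas = np.array([[100], [150], [200], [250], [300], [350]])   # shape (n_samples, 1)
prices = np.array([120, 170, 210, 265, 310, 355])              # target values

# Train a single-feature linear regression model on the existing data.
model = LinearRegression()
model.fit(areas, prices)

# Predict the price of a new house with an area of about 400 units.
predicted_price = model.predict(np.array([[400]]))
print(predicted_price)
```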

Now that we have developed a sense of the aim, let’s pick another example. We have a dataset that consists of the time spent on learning calculus and the resulting performance score. The data looks like this -

[Table: hours spent on calculus vs. performance score, for demo purposes]

Let’s visualise the data by plotting a graph.
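A quick sketch of such a plot, again using made-up hours/score values in place of the table above, could look like this:

```python
import matplotlib.pyplot as plt

# Made-up toy values standing in for the hours-vs-score table above.
hours = [1, 2, 3, 4, 5, 6, 7, 8]
scores = [20, 27, 35, 41, 50, 58, 63, 72]

# Scatter plot of study time against performance score.
plt.scatter(hours, scores)
plt.xlabel("Hours spent on calculus")
plt.ylabel("Performance score")
plt.title("Study time vs. performance")
plt.show()
```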

Now, we wish to estimate the performance score of a student who devoted 9 hours to calculus. Can you come up with a way to do so by looking at the graph?

We can do it by drawing a line that best fits these points. Then, for any new point, our model simply plugs its coordinate into the equation of that line and returns the required prediction. That is exactly what we want! But we need to analyse what makes a line the best one for our model. For that, let’s dig deeper into the parameters that characterise a line. We know that the equation of a line is -

y=mx+c
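Here, m is the slope of the line and c is the intercept. As a quick illustration of how such a line gives us a prediction, here is a sketch that fits a straight line to the made-up hours/score values from the plot above and reads off an estimate for 9 hours of study. It uses numpy's polyfit purely as a convenient stand-in for the fitting procedure we are discussing.

```python
import numpy as np

# Same made-up hours/score toy values as in the plot above.
hours = np.array([1, 2, 3, 4, 5, 6, 7, 8])
scores = np.array([20, 27, 35, 41, 50, 58, 63, 72])

# Fit a degree-1 polynomial, i.e. a straight line y = m*x + c, to the points.
m, c = np.polyfit(hours, scores, deg=1)

# Plug x = 9 into the line to estimate the score after 9 hours of study.
predicted_score = m * 9 + c
print(m, c, predicted_score)
```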

We can also put the equation y = mx + c as -
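y = θ₀ + θ₁x

(assuming the standard regression parametrisation), where the intercept c is renamed θ₀ and the slope m is renamed θ₁. Finding the best-fitting line then means finding the values of θ₀ and θ₁ that bring the line as close to our data points as possible.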

#regression #data #linear-regression #machine-learning #data-science #data-analysis
