4 Common Mistakes Everybody Makes With Regressions

Linear regressions are among the most common and most powerful tools for data analysis. While other, more advanced forms of statistics have been developed over the years, linear regressions remain incredibly popular, because they’re easy to understand, interpret, and perform.

You can find regression implementations in nearly any programming language, in most analytical software, and even on the standard TI-84 calculator. This ubiquity allows math teachers to introduce the technique as early as middle school, meaning most people are at least familiar with it.

With the linear regression’s success, however, comes its misuse. Because people may not completely understand its underlying assumptions, they’re more likely to make basic mistakes when applying it.

Luckily, some of those mistakes are easy to fix.

Fitting to Non-Linear Data

A Line of Best Fit on non-linear data. Figure produced by author.

Despite “linear” being in the name, one of the most common mistakes in linear regressions is fitting to non-linear data. The illustration above shows why this is a bad idea.

The straight line, the linear regression, doesn’t follow the curve of the data that it’s designed to mimic. As a result, the model behaves poorly and makes terrible predictions.

Nearly everybody does this at least once because they don’t take the time to do proper data exploration. Fitting each independent variable against Y on its own to check for a linear relationship, calculating correlation coefficients, or performing a principal component analysis can help prevent this mistake in the first place.
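As a rough illustration of the correlation check, the sketch below computes Pearson correlation coefficients with NumPy. The data is synthetic and parabolic (an assumption made for illustration, not the dataset behind the figure), and the variable names are hypothetical.

```python
import numpy as np

# Synthetic parabolic data: an assumption for illustration only.
rng = np.random.default_rng(0)
x = np.linspace(-3, 3, 61)
y = x**2 + rng.normal(scale=0.5, size=x.size)

# Pearson correlation only measures *linear* association,
# so a strong curved relationship can still score near zero.
r_linear = np.corrcoef(x, y)[0, 1]

# The same check against a candidate transformation of X.
r_squared_term = np.corrcoef(x**2, y)[0, 1]

print(f"corr(x, y)   = {r_linear:.3f}")
print(f"corr(x^2, y) = {r_squared_term:.3f}")
```

A correlation near zero between X and Y, next to a strong correlation between a transformed X and Y, hints that a straight line on the raw variable is the wrong model.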

The best solution, however, is to check what type of relationship X has with Y and transform X so that the transformed variable relates linearly to Y. For example, if the data forms a parabolic relationship, like in the example above, use X² as the independent variable instead of X.
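The X² transformation can be sketched as follows. This is a minimal example on synthetic data, assuming a true relationship of Y = 2X² + 1 plus noise. The coefficients, the noise level, and the `r_squared` helper are all hypothetical choices made for the illustration.

```python
import numpy as np

# Synthetic data with an assumed parabolic relationship: y = 2x^2 + 1 + noise.
rng = np.random.default_rng(1)
x = np.linspace(-3, 3, 61)
y = 2.0 * x**2 + 1.0 + rng.normal(scale=0.5, size=x.size)

def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return 1.0 - ss_res / ss_tot

# Straight line fitted on the raw X (np.polyfit returns [slope, intercept]).
slope_raw, intercept_raw = np.polyfit(x, y, 1)
r2_raw = r_squared(y, slope_raw * x + intercept_raw)

# The same linear model, with X^2 as the independent variable.
x_sq = x**2
slope_sq, intercept_sq = np.polyfit(x_sq, y, 1)
r2_sq = r_squared(y, slope_sq * x_sq + intercept_sq)

print(f"R^2 on X:   {r2_raw:.3f}")
print(f"R^2 on X^2: {r2_sq:.3f}")
print(f"fitted model: y = {slope_sq:.2f} * x^2 + {intercept_sq:.2f}")
```

The model is still a linear regression in both cases. Only the input changes, so the usual interpretation of slope and intercept carries over to the transformed variable.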
