Which Evaluation Metric Should You Use in ML Regression Problems?

If you’re like me, you might have used the R-Squared (R²), Root Mean Squared Error (RMSE), and Mean Absolute Error (MAE) evaluation metrics in your regression problems without giving them a lot of thought. 🤔

Although all of them are common metrics, it’s not obvious which one to use when. After writing this article I have a new favorite and a new plan for reporting them going forward. 😀

I’ll share those conclusions with you in a bit. First, we’ll dig into each metric. You’ll learn the pros and cons of each for model selection and reporting. Let’s get to it! 🚀

Evaluation metrics are like bridges to understanding. 😀 source: pixabay.com

R-Squared (R²)

R² represents the proportion of variance explained by your model.

R² is a relative metric, so you can use it to compare models trained on the same data. You can also use it to get a rough feel for how well a model performs in general.
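To make "proportion of variance explained" concrete, here is a minimal sketch that computes R² by hand as one minus the ratio of the residual sum of squares to the total sum of squares. The helper name `r_squared` and the sample numbers are made up for illustration.

```python
def r_squared(y_true, y_pred):
    """R² = 1 - SS_res / SS_tot, using plain Python lists."""
    mean_y = sum(y_true) / len(y_true)
    # Residual sum of squares: what the model failed to explain.
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    # Total sum of squares: the variation around the mean of the actual values.
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1 - ss_res / ss_tot


# Illustrative values only (not from a real dataset).
y_true = [3.0, 5.0, 7.0, 9.0]
y_pred = [2.5, 5.5, 7.0, 9.5]

r2 = r_squared(y_true, y_pred)
# A model that always predicts the mean of y_true scores exactly 0.
baseline_r2 = r_squared(y_true, [6.0] * len(y_true))

print(f"model R²: {r2:.4f}, mean-baseline R²: {baseline_r2:.4f}")
```

The baseline line shows why R² is relative: it measures how much better the model does than always predicting the mean.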

Disclaimer: This article isn’t a review of machine learning methods, but make sure you use different data for training, validation, and testing. You always want to hold out some data that your model has not seen to evaluate its performance. Also, it’s a good idea to look at a plot of your model’s predictions vs. the actual values to see how well your model fits the data.
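As a rough sketch of that hold-out idea, the snippet below splits synthetic data into training, validation, and test sets with Scikit-Learn’s `train_test_split` and scores a linear model on the unseen parts. The 60/20/20 split, the synthetic data, and the choice of `LinearRegression` are assumptions made for the example, not a recommendation.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import r2_score
from sklearn.model_selection import train_test_split

# Synthetic data (an assumption for illustration): y = 3x + 2 plus noise.
rng = np.random.default_rng(0)
X = rng.uniform(0, 10, size=(100, 1))
y = 3.0 * X[:, 0] + 2.0 + rng.normal(0, 1.0, size=100)

# First carve off 20% as the test set, then split the rest 75/25,
# which gives 60% train, 20% validation, 20% test overall.
X_rest, X_test, y_rest, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)
X_train, X_val, y_train, y_val = train_test_split(
    X_rest, y_rest, test_size=0.25, random_state=0
)

model = LinearRegression().fit(X_train, y_train)

# Use the validation score while choosing between models,
# and look at the test score only once, at the end.
val_r2 = r2_score(y_val, model.predict(X_val))
test_r2 = r2_score(y_test, model.predict(X_test))
print(f"validation R²: {val_r2:.3f}, test R²: {test_r2:.3f}")
```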

Evaluation Metrics for Regression Problems

Hi, today we are going to study the evaluation metrics for regression problems. Evaluation metrics are very important because they tell us how accurate our model is.

Before we proceed to the evaluation techniques, it is important to gain some intuition.

In the image above, we have plotted a fitted line, but the fit is not perfect: some points lie above the line and some lie below it.

So, how accurate is our model?

Evaluation metrics aim to answer this question. Now, without wasting time, let’s jump to the evaluation metrics and see how each one works.

There are 6 evaluation techniques:

1. M.A.E (Mean Absolute Error)

2. M.S.E (Mean Squared Error)

3. R.M.S.E (Root Mean Squared Error)

4. R.M.S.L.E (Root Mean Squared Log Error)

5. R-Squared

6. Adjusted R-Squared
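As a quick preview, here is a sketch that computes all six metrics with NumPy on a tiny made-up example. The formulas follow the common textbook definitions; the `log1p` form of RMSLE and the predictor count `n_features = 1` used for Adjusted R-Squared are assumptions for this illustration.

```python
import numpy as np

# Made-up values for illustration; RMSLE requires non-negative targets.
y_true = np.array([3.0, 5.0, 7.0, 9.0])
y_pred = np.array([2.5, 5.5, 7.0, 9.5])
n_features = 1  # hypothetical number of predictors in the model

errors = y_true - y_pred

mae = np.mean(np.abs(errors))   # 1. Mean Absolute Error
mse = np.mean(errors ** 2)      # 2. Mean Squared Error
rmse = np.sqrt(mse)             # 3. Root Mean Squared Error

# 4. Root Mean Squared Log Error, using log(1 + y) so zeros are allowed.
rmsle = np.sqrt(np.mean((np.log1p(y_pred) - np.log1p(y_true)) ** 2))

# 5. R-Squared: 1 - residual sum of squares / total sum of squares.
ss_res = np.sum(errors ** 2)
ss_tot = np.sum((y_true - y_true.mean()) ** 2)
r2 = 1 - ss_res / ss_tot

# 6. Adjusted R-Squared: penalizes R² for the number of predictors.
n = len(y_true)
adj_r2 = 1 - (1 - r2) * (n - 1) / (n - n_features - 1)

for name, value in [("MAE", mae), ("MSE", mse), ("RMSE", rmse),
                    ("RMSLE", rmsle), ("R2", r2), ("Adjusted R2", adj_r2)]:
    print(f"{name}: {value:.4f}")
```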

Now, let’s discuss these techniques one by one.

M.A.E (Mean Absolute Error)

It is the simplest and one of the most widely used evaluation techniques. It is simply the mean of the absolute differences between the actual and predicted values.

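Below is the mathematical formula for the Mean Absolute Error, where n is the number of samples, y_i is the actual value, and ŷ_i is the predicted value:

$$\mathrm{MAE} = \frac{1}{n} \sum_{i=1}^{n} \lvert y_i - \hat{y}_i \rvert$$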

Scikit-Learn is a great library, as it has built-in functions for almost everything we need in our Data Science journey.

Below is the code to compute the Mean Absolute Error:

from sklearn.metrics import mean_absolute_error

mean_absolute_error(y_true, y_pred)

Here, ‘y_true’ holds the true target values and ‘y_pred’ holds the predicted target values.
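For a runnable version, here is a small sketch with made-up numbers. It compares the `mean_absolute_error` result with a manual NumPy calculation and also shows why the absolute value matters: without it, positive and negative errors cancel out.

```python
import numpy as np
from sklearn.metrics import mean_absolute_error

# Made-up values for illustration.
y_true = np.array([100.0, 150.0, 200.0])
y_pred = np.array([110.0, 140.0, 200.0])

# Scikit-Learn's built-in function.
mae_sklearn = mean_absolute_error(y_true, y_pred)

# The same thing by hand: mean of the absolute differences.
mae_manual = np.mean(np.abs(y_true - y_pred))

# Without the absolute value, the -10 and +10 errors cancel to zero,
# which would wrongly suggest a perfect model.
mean_signed_error = np.mean(y_true - y_pred)

print(f"MAE (sklearn): {mae_sklearn:.4f}")
print(f"MAE (manual):  {mae_manual:.4f}")
print(f"mean signed error: {mean_signed_error:.4f}")
```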

5 Regression algorithms: Explanation & Implementation in Python

Take your current understanding of machine learning algorithms to the next level with this article. What is regression analysis in simple words? How is it applied in practice to real-world problems? And what Python code snippets can you use to implement regression algorithms for various objectives? Let’s forget about boring learning stuff and talk about science and the way it works.

Regression: Linear Regression

Machine learning algorithms are not the regular algorithms we may be used to, because they are often described by a combination of complex statistics and mathematics. Since it is very important to understand the background of any algorithm you want to implement, this can pose a challenge to people with a non-mathematical background, as the maths can sap your motivation by slowing you down.

In this article, we will discuss linear and logistic regression and some other regression techniques, assuming we have all heard of or even learnt about the linear model in high school mathematics. Hopefully, by the end of the article, the concept will be clearer.

**Regression Analysis** is a statistical process for estimating the relationship between a dependent variable (say Y) and one or more independent variables or predictors (X). It explains the changes in the dependent variable with respect to changes in select predictors. Some major uses of regression analysis are determining the strength of predictors, forecasting an effect, and trend forecasting. It finds the significant relationships between variables and the impact of predictors on the dependent variable. In regression, we fit a curve/line (the regression or best-fit line) to the data points such that the distances of the data points from the curve/line are minimized.
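To illustrate the best-fit idea, here is a minimal sketch that fits a straight line with NumPy’s `polyfit`. It assumes ordinary least squares, one common way of making the distances small: it minimizes the sum of squared vertical distances between the points and the line. The data points are made up for the example.

```python
import numpy as np

# Made-up data that roughly follows y = 2x + 1.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([3.1, 4.9, 7.2, 8.8, 11.0])

# Degree-1 polynomial fit = ordinary least squares straight line.
slope, intercept = np.polyfit(x, y, 1)


def sum_squared_errors(m, b):
    """Sum of squared vertical distances from the points to y = m*x + b."""
    return np.sum((y - (m * x + b)) ** 2)


best_sse = sum_squared_errors(slope, intercept)
# Any other line, such as a slightly steeper one, has a larger error.
other_sse = sum_squared_errors(slope + 0.1, intercept)

print(f"best-fit line: y = {slope:.2f}x + {intercept:.2f}")
print(f"SSE of best fit: {best_sse:.4f}, SSE of steeper line: {other_sse:.4f}")
```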

