Rusty Shanahan


Performance Metrics: Regression Model

Today we are going to discuss Performance Metrics, and this time it will be Regression metrics. In my previous blog we discussed Classification Metrics; this time it’s Regression.

We are going to talk about the 5 most widely used Regression metrics:

Let’s understand one thing first: the difference between Classification and Regression metrics, and why we need two different kinds of metrics to measure our models.

The first key difference is that classification, as the name suggests, gives classes as output. Say our data falls into a few categories, classes 1–10; then the output will be one of the numbers from 1 to 10. If the model output matches the actual output, the prediction is correct; otherwise it is incorrect. There is no other condition: you can either be correct or incorrect. This is not the case in regression. In regression the model outputs a continuous number; there are no discrete values. For example, say our model tries to predict the height of people. We cannot classify the height variable as 160 cm, 170 cm, and so on; it is continuous. Hence in this case we consider how close our model is to the actual value. The concept of “how close” gives rise to the term loss or, to put it in proper statistical terms: what is the loss incurred by our model in predicting the value of a data point?

Let’s say that for a certain data point the height is predicted to be 167 cm, whereas the actual value is 163 cm. Our model has then made a mistake of +4 cm in this case. Now, this is just for 1 data point; imagine how to measure it for the whole dataset.
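The arithmetic for a single data point, and one possible way of extending it to a whole dataset, can be sketched in Python. The extra heights in the lists are made-up values for illustration, and averaging the absolute errors (mean absolute error) is only one common choice of summary; it is an assumption here, not necessarily the first metric the author has in mind.

```python
# Signed error for the single data point from the text.
predicted = 167.0  # cm
actual = 163.0     # cm
error = predicted - actual  # 4.0: the model overshoots by 4 cm

# Hypothetical dataset (values invented for illustration).
actual_heights = [163.0, 170.0, 158.0, 181.0]
predicted_heights = [167.0, 168.0, 158.0, 184.0]

# One signed error per data point.
errors = [p - a for p, a in zip(predicted_heights, actual_heights)]

# Averaging signed errors would let positive and negative errors cancel,
# so we average their absolute values instead (mean absolute error).
mean_absolute_error = sum(abs(e) for e in errors) / len(errors)

print(errors)
print(mean_absolute_error)
```

Taking absolute values before averaging is a design choice: it keeps an overshoot on one person from hiding an undershoot on another.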

Let’s keep one thing in mind, what is an Error?

Any deviation from the actual value is an error,

#data-science #statistical-analysis #statistics #data analysis

Macey Kling


Hierarchical Performance Metrics and Where to Find Them

Hierarchical machine learning models are one top-notch trick. As discussed in previous posts, considering the natural taxonomy of the data when designing our models can be well worth our while. Instead of flattening out and ignoring those inner hierarchies, we’re able to use them, making our models smarter and more accurate.

“More accurate”, I say. Are they, though? How can we tell? We are people of science, after all, and we expect bold claims to be supported by the data. This is why we have performance metrics. Whether it’s precision, f1-score, or any other lovely metric we’ve got our eye on, if using hierarchy in our models improves their performance, the metrics should show it.

Problem is, if we use regular performance metrics — the ones designed for flat, one-level classification — we go back to ignoring that natural taxonomy of the data.

If we do hierarchy, let’s do it all the way. If we’ve decided to celebrate our data’s taxonomy and build our model in its image, this needs to also be a part of measuring its performance.

How do we do this? The answer lies below.

Before We Dive In

This post is about measuring the performance of machine learning models designed for hierarchical classification. It kind of assumes you know what all those words mean. If you don’t, check out my previous posts on the topic. Especially the one introducing the subject. Really. You’re gonna want to know what hierarchical classification is before learning how to measure it. That’s kind of an obvious one.

Throughout this post, I’ll be giving examples based on this taxonomy of common house pets:

The taxonomy of common house pets. My neighbor just adopted the cutest baby Pegasus.

Oh So Many Metrics

So we’ve got a whole ensemble of hierarchically-structured local classifiers, ready to do our bidding. How do we evaluate them?

That is not a trivial problem, and the solution is not obvious. As we’ve seen in previous posts in this series, different projects require different treatment. The best metric could differ depending on the specific requirements and limitations of your project.

All in all, there are three main options to choose from. Let’s introduce them, shall we?

The contestants, in all their grace and glory:

#machine-learning #hierarchical #performance-metrics #ensemble-learning #metrics

5 Regression algorithms: Explanation & Implementation in Python

Take your current understanding of and skills in machine learning algorithms to the next level with this article. What is regression analysis in simple words? How is it applied in practice to real-world problems? And what Python code snippets can you use to implement regression algorithms for various objectives? Let’s forget about boring learning stuff and talk about science and the way it works.

#linear-regression-python #linear-regression #multivariate-regression #regression #python-programming

Angela Dickens


Regression: Linear Regression

Machine learning algorithms are not your regular algorithms that we may be used to because they are often described by a combination of some complex statistics and mathematics. Since it is very important to understand the background of any algorithm you want to implement, this could pose a challenge to people with a non-mathematical background as the maths can sap your motivation by slowing you down.

In this article, we will discuss linear and logistic regression and some regression techniques, assuming we have all heard of or even learnt about the linear model in high school Mathematics class. Hopefully, by the end of the article, the concept will be clearer.

**Regression Analysis** is a statistical process for estimating the relationship between a dependent variable (say Y) and one or more independent variables or predictors (X). It explains the changes in the dependent variable with respect to changes in select predictors. Some major uses of regression analysis are determining the strength of predictors, forecasting an effect, and trend forecasting. It finds the significant relationships between variables and the impact of predictors on the dependent variable. In regression, we fit a curve/line (the regression or best-fit line) to the data points, such that the distances of the data points from the curve/line are minimized (typically the sum of their squares).
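As a minimal sketch of fitting a best-fit line, the following uses the closed-form ordinary least squares solution for a single predictor. The function name `fit_line` and the data points are hypothetical, chosen only for illustration; least squares is assumed as the fitting criterion.

```python
def fit_line(xs, ys):
    """Ordinary least squares for the line y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Co-variation of x and y, and variation of x, around their means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Made-up data that roughly follows a straight line.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [3.1, 4.9, 7.2, 8.8, 11.0]

intercept, slope = fit_line(xs, ys)

# Residuals: vertical distances of the data points from the fitted line.
residuals = [y - (intercept + slope * x) for x, y in zip(xs, ys)]
sum_squared_residuals = sum(r ** 2 for r in residuals)

print(intercept, slope, sum_squared_residuals)
```

Any other line through the same points gives a larger sum of squared residuals, which is what "best fit" means under this criterion.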

#regression #machine-learning #beginner #logistic-regression #linear-regression #deep learning

Elton Bogan


Polynomial Regression — The “curves” of a linear model

The most glamorous part of a data analytics project/report is, as many would agree, the one where the Machine Learning algorithms do their magic using the data. However, one of the most overlooked parts of the process is the preprocessing of data.

Significantly more effort goes into preparing the data for a model to be fit on than into tuning the model to fit the data better. One such preprocessing technique that we intend to disentangle is Polynomial Regression.
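To illustrate why polynomial regression can be viewed as preprocessing, here is a small sketch in plain Python: the input is first expanded into polynomial features, and an ordinary linear least-squares model is then fit on those features. The helper names and the data are hypothetical, and solving the normal equations by Gaussian elimination is just one simple way to do the fit, not necessarily what the article uses.

```python
def polynomial_features(x, degree):
    """Expand a scalar x into [1, x, x**2, ..., x**degree]."""
    return [x ** d for d in range(degree + 1)]


def solve_linear_system(a, b):
    """Solve a @ w = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Augmented matrix, copied so the inputs are not modified.
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back substitution.
    w = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * w[c] for c in range(r + 1, n))
        w[r] = (m[r][n] - s) / m[r][r]
    return w


def fit_polynomial(xs, ys, degree):
    """Least-squares fit via the normal equations (X^T X) w = X^T y."""
    rows = [polynomial_features(x, degree) for x in xs]  # preprocessing step
    k = degree + 1
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(rows, ys)) for i in range(k)]
    return solve_linear_system(xtx, xty)


# Made-up data generated from y = 1 + 2x + 0.5x^2 (no noise).
xs = [-2.0, -1.0, 0.0, 1.0, 2.0, 3.0]
ys = [1.0 + 2.0 * x + 0.5 * x ** 2 for x in xs]

weights = fit_polynomial(xs, ys, degree=2)
print(weights)  # approximately the coefficients used to generate the data
```

The model stays linear in its weights; only the features are curved, which is why the fitting step is the same as for ordinary linear regression.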

#data-science #machine-learning #polynomial-regression #regression #linear-regression

Arne Denesik


Diagnose the Generalized Linear Models

Generalized Linear Model (GLM) is popular because it can deal with a wide range of data with different response variable types (such as binomial, Poisson, or multinomial). Compared to non-linear models, such as neural networks or tree-based models, linear models may not be as powerful in terms of prediction. But their ease of interpretation keeps them attractive, especially when we need to understand how each predictor influences the outcome.

The shortcomings of GLM are as obvious as its advantages. The linear relationship may not always hold, and the model is really sensitive to outliers. Therefore, it’s not wise to fit a GLM without diagnosing it.

In this post, I am going to briefly talk about how to diagnose a generalized linear model. The implementation will be shown in R code. There are mainly two types of diagnostic methods: one is outlier detection, and the other is checking the model assumptions.


Before diving into the diagnoses, we need to be familiar with several types of residuals because we will use them throughout the post. In the Gaussian linear model, the concept of a residual is very straightforward: it is the difference between the data and the value predicted by the fitted model.

Response residuals

In the GLM, it is called the “response” residual, a name that simply distinguishes it from other types of residuals. The variance of the response is no longer constant in a GLM, which leads us to make some modifications to the residuals. If we rescale the response residual by the estimated standard deviation of the response, it becomes the Pearson residual.
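The two residual types can be sketched as follows. The post itself uses R; this is a plain Python illustration under stated assumptions: a Poisson response (so the variance of each observation equals its mean), and hand-picked observed counts and fitted means that in practice would come from a fitted GLM. The function names are hypothetical.

```python
import math


def response_residuals(y, mu):
    """Response residual: observed value minus fitted mean."""
    return [yi - mi for yi, mi in zip(y, mu)]


def pearson_residuals_poisson(y, mu):
    """Pearson residual for a Poisson response.

    Assumes Var(y_i) = mu_i, so each response residual is divided
    by sqrt(mu_i), the estimated standard deviation of y_i.
    """
    return [(yi - mi) / math.sqrt(mi) for yi, mi in zip(y, mu)]


# Made-up counts and fitted means (illustration only).
observed = [2, 0, 5, 9]
fitted_means = [1.0, 1.0, 4.0, 16.0]

print(response_residuals(observed, fitted_means))
print(pearson_residuals_poisson(observed, fitted_means))
```

Note how the last observation has by far the largest response residual, while its Pearson residual is much closer in size to the others, because a larger fitted mean implies a larger expected spread.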

#data-science #linear-models #model #regression #r