In machine learning, we often come across datasets that contain outliers. Outliers are extreme values: data points that diverge from the rest of the values and do not follow the overall pattern in the data. They can arise for different reasons, such as human error while preparing the data or outliers being introduced intentionally to test a model. But are they useful when building predictive models? It depends: sometimes we have to drop them, and sometimes we retain them because they carry meaningful information.

In this article, we will discuss how to detect outliers in a dataset and remove them in different ways. We will use a weight-height dataset that is publicly available on Kaggle. The dataset contains weight and height values; we will search for outliers in the weight column.

What will you learn from this article?


  • What are Outliers? How to find them?
  • What are Z-score and Standard deviation?
  • How to remove Outliers using Z-score and Standard deviation?
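
As a rough preview of the idea, here is a minimal sketch of z-score based outlier removal. It assumes synthetic weight values instead of the Kaggle file, a helper name (`remove_outliers_zscore`) that is purely illustrative, and the commonly used cut-off of 3 standard deviations, which is a convention rather than a fixed rule. It uses only the Python standard library so that it runs without any dataset.

```python
import random
import statistics

# Synthetic stand-in for the weight column (made-up values, not the Kaggle data).
random.seed(0)
weights = [random.gauss(160, 30) for _ in range(1000)]
weights += [400.0, 5.0, 390.0]  # three deliberately extreme values


def remove_outliers_zscore(values, threshold=3.0):
    """Keep values whose z-score lies within +/- threshold.

    z = (value - mean) / standard deviation
    The threshold of 3.0 is an assumed, commonly used default.
    """
    mean = statistics.mean(values)
    std = statistics.stdev(values)  # sample standard deviation
    return [v for v in values if abs((v - mean) / std) <= threshold]


cleaned = remove_outliers_zscore(weights)
print(f"rows before: {len(weights)}, rows after: {len(cleaned)}")
```

The same logic carries over to a DataFrame column: compute the mean and standard deviation of the weight column, derive a z-score per row, and keep only the rows inside the chosen threshold. Note that the extreme values themselves inflate the standard deviation, so the threshold may need adjusting for small or heavily contaminated datasets.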


Outlier Detection Using z-Score - A Complete Guide With Python Codes