Anyone working in data science understands the importance of data: it is the resource that fuels a machine learning model. But raw, real-world data cannot be used until it has been pre-processed into a usable format. One of the most common problems with real-world data is missing values, where entries in some rows and columns simply do not exist. To train a good model, we need the data to be as clean as possible.

Missing values are generally represented as NaN, which stands for Not a Number. Although the Pandas library provides methods to impute these missing values, we first need to understand how many NaN values there are and where and how they are distributed in the dataset. Missingno, a third-party Python library, was built for this purpose.
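As a rough illustration of the idea, the sketch below counts NaN values per column with Pandas and then hands the same DataFrame to Missingno for plotting. The toy dataset, its column names, and the helper names `summarize_missing` and `plot_missing` are assumptions made up for this example; only `isna()`, `msno.matrix()` and `msno.bar()` come from the libraries themselves.

```python
import numpy as np
import pandas as pd

# Hypothetical toy dataset; the column names and values are illustrative only.
df = pd.DataFrame({
    "age": [25, np.nan, 31, 47, np.nan],
    "city": ["Pune", "Delhi", None, "Mumbai", "Chennai"],
    "income": [52000.0, 61000.0, np.nan, np.nan, 45000.0],
})


def summarize_missing(frame):
    """Return the count and percentage of missing values per column."""
    is_missing = frame.isna()  # True wherever a value is NaN or None
    counts = is_missing.sum()
    percent = (is_missing.mean() * 100).round(1)
    return pd.DataFrame({"missing": counts, "percent": percent})


def plot_missing(frame):
    """Draw Missingno's nullity matrix and bar chart for the frame."""
    # Imported lazily so the summary above works without the plotting stack.
    import missingno as msno  # third-party: pip install missingno
    import matplotlib.pyplot as plt

    msno.matrix(frame)  # one row per record; gaps mark missing values
    msno.bar(frame)     # non-missing count per column
    plt.show()


summary = summarize_missing(df)
print(summary)
# plot_missing(df)  # uncomment once missingno and matplotlib are installed
```

The numeric summary answers "how many", while the matrix plot is what shows "where": gaps that line up across columns suggest values that go missing together.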

The purpose of this article is to build a better understanding of missing data by visualizing it with Missingno.


Tutorial On Missingno - Python Tool To Visualize Missing Values