Python: How to Handle Missing Data in Pandas DataFrame

Pandas is a Python library for data analysis and manipulation. Almost all operations in pandas revolve around DataFrames, an abstract data structure tailor-made for handling a metric ton of data.

In that metric ton of data, some values are bound to be missing for various reasons, which leaves missing (null/None/NaN) values in our DataFrame.

That is why, in this article, we'll discuss how to handle missing data in a Pandas DataFrame.
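As a quick illustration of what "missing" means in practice, here is a minimal sketch. The toy DataFrame and its column names are hypothetical and not part of the dataset used in this article. It shows that both None and NaN are treated as missing and can be detected with isnull():

```python
import numpy as np
import pandas as pd

# Hypothetical toy data: None and np.nan both stand in for "missing"
toy = pd.DataFrame({
    "name": ["Ann", None, "Cy"],
    "score": [1.5, np.nan, None],
})

# isnull() returns a boolean mask that is True wherever a value is missing
missing_mask = toy.isnull()

# Summing the mask counts the missing values in each column
missing_per_column = missing_mask.sum()
print(missing_per_column)
```

Here the name column has one missing value and the score column has two, regardless of whether the gap was written as None or NaN.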

Data Inspection

Real-world datasets are rarely perfect. They may contain missing values, wrong data types, unreadable characters, erroneous lines, etc.

The first step of any proper data analysis is cleaning and organizing the data we'll be using later. We will discuss a few common data problems that might occur in a dataset.

We will be working with a small employees dataset for this. The .csv file looks like this:

First Name,Gender,Salary,Bonus %,Senior Management,Team
Douglas,Male,97308,6.945,TRUE,Marketing
Thomas,Male,61933,NaN,TRUE,
Jerry,Male,NA,9.34,TRUE,Finance
Dennis,n.a.,115163,10.125,FALSE,Legal
,Female,0,11.598,,Finance
Angela,,,18.523,TRUE,Engineering
Shawn,Male,111737,6.414,FALSE,na
Rachel,Female,142032,12.599,FALSE,Business Development
Linda,Female,57427,9.557,TRUE,Client Services
Stephanie,Female,36844,5.574,TRUE,Business Development
,,,,,

Let's import it into a DataFrame:

import pandas as pd

df = pd.read_csv('out.csv')
df
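To see what pandas makes of this file, the following sketch counts the missing values per column. It inlines the same rows through io.StringIO so it runs without out.csv on disk. The second read, which passes 'n.a.' and 'na' through the na_values parameter, is an assumption about how those two markers are meant to be interpreted, not something the dataset itself states.

```python
from io import StringIO

import pandas as pd

# The same rows as the .csv shown above, inlined so the sketch runs without a file
CSV_TEXT = """First Name,Gender,Salary,Bonus %,Senior Management,Team
Douglas,Male,97308,6.945,TRUE,Marketing
Thomas,Male,61933,NaN,TRUE,
Jerry,Male,NA,9.34,TRUE,Finance
Dennis,n.a.,115163,10.125,FALSE,Legal
,Female,0,11.598,,Finance
Angela,,,18.523,TRUE,Engineering
Shawn,Male,111737,6.414,FALSE,na
Rachel,Female,142032,12.599,FALSE,Business Development
Linda,Female,57427,9.557,TRUE,Client Services
Stephanie,Female,36844,5.574,TRUE,Business Development
,,,,,
"""

# Default parsing: empty fields, 'NaN' and 'NA' are read as missing
df = pd.read_csv(StringIO(CSV_TEXT))
default_missing = df.isnull().sum()
print(default_missing)

# Assumption: 'n.a.' and 'na' are also meant as "missing" in this file,
# so we add them to the markers pandas already recognizes
df_extended = pd.read_csv(StringIO(CSV_TEXT), na_values=["n.a.", "na"])
extended_missing = df_extended.isnull().sum()
print(extended_missing)
```

With the default settings, 'n.a.' in the Gender column and 'na' in the Team column are kept as ordinary strings, so they are not counted as missing until they are listed explicitly. Note also that the 0 in the Salary column is a regular number as far as pandas is concerned, even if it looks suspicious.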
