Exploratory Data Analysis (EDA) — Don’t ask how, ask what

This is part 1 in a series of articles guiding the reader through an entire data science project.

What is EDA anyway?

EDA, or Exploratory Data Analysis, is the process of understanding what data we have in our dataset before we start looking for solutions to our problem. In other words, it is the act of analyzing the data without biased assumptions so that we can preprocess the dataset effectively for modeling.
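To make "understanding what data we have" a little more concrete, here is a minimal sketch of a first look at a dataset using only the Python standard library. The function name `first_look`, the choice to treat `None` and empty strings as missing, and the toy records are all assumptions made for illustration; they are not taken from the project in this series.

```python
from collections import Counter


def first_look(rows):
    """Summarize each column of a list of dict records.

    For every column, report how many values are missing, which Python
    types the present values have, and how many distinct values occur.
    Assumption: None and the empty string both count as "missing".
    """
    columns = sorted({key for row in rows for key in row})
    summary = {}
    for col in columns:
        values = [row.get(col) for row in rows]
        present = [v for v in values if v not in (None, "")]
        summary[col] = {
            "missing": len(values) - len(present),
            "types": Counter(type(v).__name__ for v in present),
            "distinct": len(set(present)),
        }
    return summary


# Hypothetical toy records, invented for illustration only.
rows = [
    {"age": 34, "city": "Oslo"},
    {"age": None, "city": "Oslo"},
    {"age": 29, "city": ""},
]

for col, stats in first_look(rows).items():
    print(col, stats["missing"], dict(stats["types"]), stats["distinct"])
```

A summary like this only asks what is in the data (gaps, types, variety) and assumes nothing about how the data will be modeled. In practice a dataframe library such as pandas provides equivalent checks out of the box.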

Why do we do EDA?

The main reasons we do EDA are to verify the data in the dataset, to check whether the data makes sense in the context of the problem, and sometimes simply to learn about the problem we are exploring. Remember:

