“In statistics, exploratory data analysis (EDA) is an approach of analyzing datasets to summarize their main characteristics, often with visual methods.” — Wikipedia

Graphical techniques

Many tools are useful for EDA, but EDA is characterized more by the attitude taken than by any particular technique. Typical graphical techniques used in EDA are:

Dimensionality reduction:

Dimensionality reduction, or dimension reduction, is the transformation of data from a high-dimensional space into a low-dimensional space so that the low-dimensional representation retains some meaningful properties of the original data, ideally with a dimension close to the data's intrinsic dimension. Working in high-dimensional spaces can be undesirable for many reasons: raw data are often sparse as a consequence of the curse of dimensionality, and analyzing the data can become computationally intractable.
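
As a rough illustration of the idea, here is a minimal sketch of one common dimensionality reduction technique, principal component analysis (PCA), written with NumPy's SVD. The helper name pca_project and the synthetic data are assumptions made for this example; the article does not say which technique it uses.

```python
import numpy as np


def pca_project(X, n_components=2):
    """Project the rows of X onto the top principal components (hypothetical helper)."""
    X = np.asarray(X, dtype=float)
    # PCA works on centered data, so subtract the per-column mean.
    X_centered = X - X.mean(axis=0)
    # Rows of Vt are the principal directions, ordered by decreasing variance.
    _, S, Vt = np.linalg.svd(X_centered, full_matrices=False)
    components = Vt[:n_components]
    # Fraction of the total variance captured by each kept component.
    explained = (S ** 2) / np.sum(S ** 2)
    return X_centered @ components.T, explained[:n_components]


# Synthetic data (an assumption): 200 points in 5-D that lie close to a 2-D plane.
rng = np.random.default_rng(0)
latent = rng.normal(size=(200, 2))
mixing = rng.normal(size=(2, 5))
X = latent @ mixing + 0.01 * rng.normal(size=(200, 5))

Z, ratio = pca_project(X, n_components=2)
print(Z.shape)      # shape of the low-dimensional representation
print(ratio.sum())  # share of variance retained by the two components
```

Because the synthetic points were built from two latent variables, two components retain almost all of the variance. On real data the retained share is usually lower and is worth checking before relying on a 2-D plot.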

Typical quantitative techniques are:

In this data analysis, we will examine the following:

  1. Dataset’s shape and overview
  2. Missing values
  3. All numerical variables
  4. Distribution of the numerical variables
  5. Outliers
  6. Categorical variables
  7. Cardinality of categorical variables
  8. Relationship between the independent features and the dependent feature (we will plot and check distributions in each section).
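
The checklist above can be sketched with pandas roughly as follows. The toy DataFrame, its column names (area, rooms, district, price), the choice of price as the dependent feature, and the 1.5 × IQR outlier rule are all assumptions made for illustration; the article's actual dataset is not shown here.

```python
import numpy as np
import pandas as pd

# Hypothetical toy dataset; column names and values are illustrative only.
df = pd.DataFrame({
    "area": [50.0, 62.0, 58.0, np.nan, 71.0, 65.0, 400.0, 55.0],
    "rooms": [2, 3, 2, 3, 4, 3, 3, 2],
    "district": ["north", "south", "north", "east", "south", None, "north", "east"],
    "price": [100, 130, 115, 120, 160, 140, 150, 105],
})
target = "price"  # assumed dependent feature

# 1. Dataset's shape and overview
print(df.shape)
print(df.dtypes)

# 2. Missing values per column
missing = df.isna().sum()

# 3. Numerical variables (excluding the target)
numeric_cols = df.select_dtypes(include="number").columns.drop(target).tolist()

# 4. Distribution of the numerical variables (summary statistics)
summary = df[numeric_cols].describe()


# 5. Outliers, using the 1.5 * IQR rule (one common convention, assumed here)
def iqr_outliers(s, k=1.5):
    q1, q3 = s.quantile(0.25), s.quantile(0.75)
    iqr = q3 - q1
    return s[(s < q1 - k * iqr) | (s > q3 + k * iqr)]


area_outliers = iqr_outliers(df["area"])

# 6. Categorical variables
categorical_cols = df.select_dtypes(exclude="number").columns.tolist()

# 7. Cardinality of categorical variables (missing values are not counted)
cardinality = df[categorical_cols].nunique()

# 8. Relationship between an independent feature and the dependent feature
median_price_by_district = df.groupby("district")[target].median()

print(missing, summary, area_outliers, cardinality, median_price_by_district, sep="\n\n")
```

For the plotting side, pandas exposes DataFrame.hist and DataFrame.boxplot, which draw with Matplotlib; calling them on df[numeric_cols] is one way to inspect the distributions and outliers visually.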

#python #data-science #matplotlib #data-analysis #data-visualization

Exploratory Data Analysis (EDA) with Python & Matplotlib