This article covers the basics of exploratory data analysis (EDA) in Python using NumPy, Matplotlib, and pandas. EDA relies on data visualization to surface meaningful patterns and insights in a dataset.

“In statistics, exploratory data analysis (EDA) is an approach of analyzing datasets to summarize their main characteristics, often with visual methods.” (Wikipedia)

There are a number of tools that are useful for EDA, but EDA is characterized more by the attitude taken than by particular techniques. Typical graphical techniques used in EDA are:

- Box plot
- Histogram
- Multi-vari chart
- Run chart
- Pareto chart
- Scatter plot
- Stem-and-leaf plot
- Parallel coordinates
- Odds ratio
- Targeted projection pursuit
- Glyph-based visualization methods such as PhenoPlot and Chernoff faces
- Projection methods such as grand tour, guided tour and manual tour
- Interactive versions of these plots
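
As a minimal sketch of three of the techniques listed above (box plot, histogram, scatter plot), the following uses Matplotlib on synthetic data. The variables `x` and `y` are invented for illustration and do not come from any real dataset; the `Agg` backend is an assumption made so the script runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend; drop this line in a notebook
import matplotlib.pyplot as plt
import numpy as np

# Synthetic data, invented purely for illustration
rng = np.random.default_rng(seed=0)
x = rng.normal(loc=50, scale=10, size=200)
y = 2 * x + rng.normal(scale=15, size=200)

fig, axes = plt.subplots(1, 3, figsize=(12, 4))

# Box plot: median, quartiles, and points beyond the whiskers
axes[0].boxplot(x)
axes[0].set_title("Box plot")

# Histogram: shape of the distribution of a single variable
axes[1].hist(x, bins=20)
axes[1].set_title("Histogram")

# Scatter plot: relationship between two variables
axes[2].scatter(x, y, s=10)
axes[2].set_title("Scatter plot")
axes[2].set_xlabel("x")
axes[2].set_ylabel("y")

fig.tight_layout()
```

To view the result, call `fig.savefig("eda_plots.png")` with a file name of your choice, or remove the backend line and call `plt.show()` in an interactive session.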

Dimensionality reduction (or dimension reduction) transforms data from a high-dimensional space into a low-dimensional one so that the low-dimensional representation retains meaningful properties of the original data, ideally with a dimension close to the data's intrinsic dimension. Working in high-dimensional spaces can be undesirable for several reasons: raw data are often sparse as a consequence of the curse of dimensionality, and analyzing them is often computationally intractable. Common dimensionality reduction techniques include:

- Multidimensional scaling
- Principal component analysis (PCA)
- Multilinear PCA
- Nonlinear dimensionality reduction (NLDR)
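
To make one of these techniques concrete, here is a minimal sketch of PCA computed with NumPy's singular value decomposition. The helper name `pca` and the synthetic matrix `X` are assumptions made for this example; in practice a library implementation would usually be preferred.

```python
import numpy as np

def pca(X, n_components):
    """Project X (rows = samples, columns = features) onto its top principal components."""
    # Center each feature so the components describe variance around the mean
    X_centered = X - X.mean(axis=0)
    # Rows of Vt are the principal directions, ordered by decreasing singular value
    U, S, Vt = np.linalg.svd(X_centered, full_matrices=False)
    components = Vt[:n_components]
    # Variance explained by each direction, as a fraction of the total variance
    explained_variance = S**2 / (X.shape[0] - 1)
    ratio = explained_variance[:n_components] / explained_variance.sum()
    projected = X_centered @ components.T
    return projected, components, ratio

# Synthetic data: 5 observed features driven by 2 hidden factors plus small noise
rng = np.random.default_rng(seed=0)
latent = rng.normal(size=(100, 2))
mixing = rng.normal(size=(2, 5))
X = latent @ mixing + 0.01 * rng.normal(size=(100, 5))

projected, components, ratio = pca(X, n_components=2)
print(projected.shape)  # two columns instead of five
print(ratio)            # fraction of variance captured by each component
```

Because the synthetic data are generated from two underlying factors, two components capture nearly all of the variance; real datasets rarely reduce this cleanly.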

Typical quantitative techniques are:

- Median polish
- Trimean
- Ordination
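
As one illustration of a quantitative summary used in EDA, Tukey's trimean combines the median with the two quartiles as (Q1 + 2·median + Q3) / 4. The sketch below is a hypothetical helper written for this example, with made-up sample values; it uses NumPy's default linear interpolation for the percentiles.

```python
import numpy as np

def trimean(values):
    """Tukey's trimean: a weighted average of the quartiles and the median."""
    q1, median, q3 = np.percentile(values, [25, 50, 75])
    return (q1 + 2 * median + q3) / 4

# Made-up sample with one extreme value
data = np.array([1, 2, 3, 4, 5, 6, 7, 8, 100])

print("mean:   ", data.mean())
print("trimean:", trimean(data))
```

The single extreme value pulls the mean well above the bulk of the data, while the trimean stays near the center, which is why robust summaries like this are useful when exploring data with outliers.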

**In this data analysis, we will examine the following:**

- Dataset’s shape and overview
- Missing values
- All numerical variables
- Distribution of the numerical variables
- Outliers
- Categorical variables
- Cardinality of categorical variables
- Relationship between the independent and dependent features (we will plot and check distributions in each section)
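
The checklist above can be sketched with pandas roughly as follows. The DataFrame, its column names, and the choice of `survived` as the dependent feature are all invented for illustration; the 1.5 × IQR rule used for outliers is one common convention, not the only option.

```python
import numpy as np
import pandas as pd

# Small made-up dataset; values are invented for illustration only
df = pd.DataFrame({
    "age": [22, 38, 26, 35, np.nan, 54, 2, 27, 14, 80],
    "fare": [7.25, 71.28, 7.92, 53.10, 8.05, 51.86, 21.07, 11.13, 30.07, 512.33],
    "sex": ["male", "female", "female", "female", "male",
            "male", "male", "female", "female", "male"],
    "embarked": ["S", "C", "S", "S", "S", "S", "S", "S", "C", None],
    "survived": [0, 1, 1, 1, 0, 1, 0, 0, 1, 0],
})
target = "survived"  # assumed dependent feature

# 1. Shape and overview
shape = df.shape
overview = df.head()

# 2. Missing values per column
missing = df.isnull().sum()

# 3. Numerical variables (excluding the target) and their distribution summary
numeric_cols = df.select_dtypes(include="number").columns.drop(target).tolist()
summary = df[numeric_cols].describe()

# 4. Outliers, using the 1.5 * IQR rule
def iqr_outliers(series):
    q1, q3 = series.quantile([0.25, 0.75])
    iqr = q3 - q1
    mask = (series < q1 - 1.5 * iqr) | (series > q3 + 1.5 * iqr)
    return series[mask]

outliers = {col: iqr_outliers(df[col]) for col in numeric_cols}

# 5. Categorical variables and their cardinality (number of distinct values)
categorical_cols = df.select_dtypes(exclude="number").columns.tolist()
cardinality = df[categorical_cols].nunique()

# 6. Relationship between an independent feature and the dependent feature
survival_by_sex = df.groupby("sex")[target].mean()

print(shape)
print(missing)
print(cardinality)
print(survival_by_sex)
```

Each intermediate result (`summary`, `outliers`, `survival_by_sex`) is a natural input to a plot, for example a histogram per numeric column or a bar chart of the grouped means.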

