Exploratory data analysis (EDA) is a way to understand what a dataset contains. It is important because graphs and plots reveal outliers and the relationships between features in the data.
EDA is time-consuming, because we need to build visualizations of different features using libraries such as Matplotlib and seaborn.
The Pandas Visual Analysis library automates much of this process with a single line of code.
About Pandas Visual Analysis
- It is an open-source Python library for exploratory data analysis.
- It creates an interactive user interface for visualizing datasets in Jupyter Notebook.
- The visualizations it creates can be downloaded as images from the interface itself.
- It has a selection type option that helps visualize patterns with and without outliers.
2. Importing Dataset
3. EDA using Pandas Visual Analysis
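The importing and analysis steps can be sketched as follows. This is a minimal sketch, assuming the library is installed with `pip install pandas-visual-analysis` and exposes the `VisualAnalysis` class. The `records` data and the `launch_visual_analysis` helper are hypothetical and exist only for illustration. The imports are guarded so that the snippet still runs where the libraries are missing.

```python
# Hypothetical toy data; replace it with your own dataset.
records = {
    "height_cm": [150, 160, 170, 180, 190],
    "weight_kg": [50, 58, 69, 80, 95],
}


def launch_visual_analysis(data):
    """Build a DataFrame from `data` and wrap it in the interactive interface.

    Hypothetical helper; it assumes pandas and pandas-visual-analysis
    are both installed.
    """
    import pandas as pd
    from pandas_visual_analysis import VisualAnalysis

    df = pd.DataFrame(data)
    # The "single line" mentioned in the article: pass the DataFrame in.
    return VisualAnalysis(df)


try:
    app = launch_visual_analysis(records)
except ImportError:
    # One of the libraries is not installed, so there is nothing to display.
    app = None
```

In a Jupyter Notebook, evaluating `app` as the last expression of a cell should render the interface. With your own data, you would typically build the DataFrame with `pd.read_csv` instead of the toy dictionary.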
Let’s look at the different sections of the user interface:
- Statistical Analysis: This section shows statistical properties such as the mean, median, mode, and quantiles of all numerical features.
- Scatter Plot: It shows the relationship between two features. You can choose the features plotted on the X and Y axes from the dropdowns.
- Histogram: It shows the distribution of a feature as a histogram.
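To make the Statistical Analysis section concrete, the same kinds of summary values can be computed by hand. The following is a minimal sketch using Python's standard `statistics` module on hypothetical values. It is not the library's own implementation, and the `summarize` helper is an illustrative name.

```python
import statistics


def summarize(values):
    """Return the mean, median, mode, and quartiles of a numeric column."""
    return {
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "mode": statistics.mode(values),
        # n=4 splits the data into quartiles (three cut points).
        "quartiles": statistics.quantiles(values, n=4),
    }


# Hypothetical feature values; 30 is a deliberate outlier.
feature = [2, 4, 4, 5, 7, 9, 30]
summary = summarize(feature)
print(summary)
```

Note how the outlier pulls the mean well above the median, which is the kind of pattern the interface helps you spot visually.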