Supermarket Data Analysis with Pandas. A comprehensive practical guide for pandas.

Pandas is the most widely used data analysis and manipulation library for Python. Its intuitive and versatile functions make the data analysis process efficient, simple, and easy to understand.

In this article, we will practice pandas on a supermarket sales dataset available on Kaggle. It contains sales data from different branches of a supermarket chain over a 3-month period.

Let's start by importing the libraries and reading the dataset.

```
import numpy as np
import pandas as pd

# Read the dataset and parse the Date column as datetime while loading.
df = pd.read_csv("/content/supermarket.xls",
                 parse_dates=['Date'])
```

We use the parse_dates parameter to store the date column with datetime data type so that we do not have to convert the data type later on. The datetime data type allows for using the functions under the dt accessor that are specific to dates and times.
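To make the effect of parse_dates concrete, here is a minimal sketch that does not need the Kaggle file. The three-row CSV text, the `sample` DataFrame, and its values are made up for illustration; only the Date column name comes from the dataset.

```python
import io

import pandas as pd

# Hypothetical three-row sample standing in for the real file (values are made up).
csv_text = """Invoice ID,Date,Total
A-001,1/5/2019,548.97
A-002,3/8/2019,80.22
A-003,2/17/2019,340.53
"""

# parse_dates converts the Date column to datetime while reading.
sample = pd.read_csv(io.StringIO(csv_text), parse_dates=["Date"])
print(sample["Date"].dtype)

# With a datetime column, the dt accessor exposes date-specific attributes and methods.
months = sample["Date"].dt.month
day_names = sample["Date"].dt.day_name()
print(months.tolist())
print(day_names.tolist())
```

Without parse_dates, the Date column would typically be read as plain strings, and the dt accessor would raise an error until the column is converted with pd.to_datetime.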

Some columns are either redundant or irrelevant to our analysis.

- Invoice ID: A unique invoice identification number. Does not carry any information useful for analysis.
- Cogs: Product of the unit price and quantity columns.
- Gross margin percentage: Consists of a single value which is 4.761905.
- Gross income: Can be obtained by multiplying the total column by 0.04761905 (i.e. the gross margin percentage expressed as a fraction).
- City: Perfectly correlated with the branch column. There is one branch in each city.
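The claims in the list above can be checked before removing anything. The following is a rough sketch on a small made-up DataFrame called `toy`. The column names are assumptions modeled on the descriptions above and may differ in spelling or capitalization from the actual file, so adjust them to match df.columns.

```python
import numpy as np
import pandas as pd

# Toy data built to mimic the relationships described above (all values are made up).
toy = pd.DataFrame({
    "Invoice ID": ["A-001", "A-002", "A-003", "A-004"],
    "Branch": ["A", "B", "C", "A"],
    "City": ["City 1", "City 2", "City 3", "City 1"],
    "Unit price": [10.0, 20.0, 5.0, 10.0],
    "Quantity": [2, 3, 4, 2],
    "cogs": [20.0, 60.0, 20.0, 20.0],
    "Total": [21.0, 63.0, 21.0, 21.0],
    "gross margin percentage": [4.761905] * 4,
    "gross income": [1.0, 3.0, 1.0, 1.0],
})

# cogs should equal unit price times quantity.
cogs_ok = np.allclose(toy["cogs"], toy["Unit price"] * toy["Quantity"])

# Gross margin percentage should hold a single value.
margin_constant = toy["gross margin percentage"].nunique() == 1

# Gross income should equal total times the margin fraction.
income_ok = np.allclose(toy["gross income"], toy["Total"] * 0.04761905)

# Each branch should map to exactly one city.
one_city_per_branch = bool((toy.groupby("Branch")["City"].nunique() == 1).all())

print(cogs_ok, margin_constant, income_ok, one_city_per_branch)

# If all checks pass, the redundant columns can be dropped.
redundant = ["Invoice ID", "cogs", "gross margin percentage", "gross income", "City"]
reduced = toy.drop(columns=redundant)
print(reduced.columns.tolist())
```

np.allclose is used instead of exact equality because the margin fraction is a rounded decimal, so the products match only up to floating-point tolerance.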

