It has been 3 awesome weeks since the Istanbul Data Science Bootcamp started, and the time for our first project has finally arrived.

We were asked to find a dataset that suits our goal and to perform Exploratory Data Analysis (EDA) on it, extracting insights that help describe the business we intend to focus on.

About The Dataset:

This is a Brazilian e-commerce public dataset of orders made at the Olist Store. The dataset has information about 100k orders from 2016 to 2018 made at multiple marketplaces in Brazil. Its features allow viewing an order from multiple dimensions: from order status, price, payment and freight performance to customer location, product attributes and finally reviews written by customers. The dataset's publishers also released a geolocation dataset that relates Brazilian zip codes to lat/lng coordinates.

This is real commercial data that has been anonymized, and references to the companies and partners in the review text have been replaced with the names of Game of Thrones great houses.

Exploring Before Starting:

Before I get my hands dirty with the analysis 😋, I should look into the data: examine it, identify the feature types, find missing values, and do some cleaning.

[Figure: output of pandas info()]

As we can see from the info() method provided by the pandas library, the number of missing values in the dataset is very low (Lucky Us😁). This is something we don’t encounter every day.
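Here is a minimal sketch of that check. It uses a tiny made-up table instead of the real files, and the column names are my assumption of what an orders table might look like, so treat it as an illustration rather than the exact code behind the screenshot.

```python
import pandas as pd

# Toy stand-in for an orders table. The column names and values are
# assumptions made for illustration, not taken from the real dataset.
orders = pd.DataFrame({
    "order_id": ["a1", "b2", "c3", "d4"],
    "order_status": ["delivered", "delivered", "shipped", "delivered"],
    "order_purchase_timestamp": [
        "2017-10-02 10:56:33", "2018-07-24 20:41:37",
        "2018-08-08 08:38:49", "2017-11-18 19:28:06",
    ],
    "order_delivered_customer_date": [
        "2017-10-10 21:25:13", "2018-08-07 15:27:45",
        None, "2017-12-02 00:28:42",
    ],
})

# info() prints the dtype and non-null count of every column.
orders.info()

# Count missing values per column, and express them as a percentage of rows.
missing_counts = orders.isna().sum()
missing_share = orders.isna().mean().mul(100).round(2)

print(missing_counts)
print(missing_share)
```

Looking at the share of missing values next to the raw counts makes it easier to judge whether dropping or filling those rows is reasonable.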

Some features contain date and time values but were read in as the generic object type by pandas read_csv(). This needs to be fixed before we can use these features for further analysis.
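One possible way to fix this is sketched below, assuming a small inline CSV in place of the real file. The column names are hypothetical. The sketch shows two options: asking read_csv() to parse the date columns while loading, or converting them afterwards with pd.to_datetime().

```python
import io

import pandas as pd

# Small inline CSV standing in for the real file; the column names and
# values are assumptions made for illustration.
csv_text = """order_id,order_purchase_timestamp,order_delivered_customer_date
a1,2017-10-02 10:56:33,2017-10-10 21:25:13
b2,2018-07-24 20:41:37,
"""

date_cols = ["order_purchase_timestamp", "order_delivered_customer_date"]

# Option 1: let read_csv() parse the date columns while loading.
parsed = pd.read_csv(io.StringIO(csv_text), parse_dates=date_cols)

# Option 2: load first, then convert each column explicitly.
# errors="coerce" turns unparseable or missing entries into NaT.
converted = pd.read_csv(io.StringIO(csv_text))
for col in date_cols:
    converted[col] = pd.to_datetime(converted[col], errors="coerce")

print(parsed.dtypes)
print(converted.dtypes)

# With real datetime columns, date arithmetic becomes straightforward.
delivery_days = (
    converted["order_delivered_customer_date"]
    - converted["order_purchase_timestamp"]
).dt.days
print(delivery_days)
```

Parsing at load time keeps the code shorter, while the explicit conversion gives more control over how malformed values are handled.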

Brazilian E-Commerce Public(EDA)