Business Understanding

For Airbnb users and hosts in Seattle, I will analyze and answer business questions in the following areas:

  • Price Analysis
  • Listing Count Analysis
  • Busiest Time Analysis
  • Occupancy Rate and Reviews Analysis
  • Modeling for Price Prediction

Questions and answers are covered below.


Data Understanding

Here I will perform exploratory data analysis (EDA) on the Seattle data provided by Inside Airbnb on Kaggle. The data can be downloaded as a zip file, which contains three CSV files: listing.csv, calendar.csv, and reviews.csv.

Overview of listing.csv

Read the csv file using pandas as given below:

import pandas as pd

# read listing.csv and print its shape
listing_seattle = pd.read_csv('listings_seattle.csv')
print('Shape of listing csv is', listing_seattle.shape)
listing_seattle.sample(5)    # display 5 rows at random

Basic checks and high-level data analysis

Look over the data and run some sanity checks: the percentage of missing values per column, whether the listing ids are unique throughout the dataset, a summary of the numerical columns, and so on.

  • Percentage of missing values in each column

Figure: Percentage of missing values per column

From the above bar chart, we can identify the important columns with the fewest missing values. Columns like license and square_feet have more than 95% of their data missing, so we will drop these columns.
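The code behind the missing-value chart is not shown above, so here is a minimal sketch of how the percentages could be computed and how the nearly empty columns could be dropped. The helper names missing_percentage and drop_sparse_columns, the 95% threshold argument, and the toy DataFrame are my own assumptions for illustration; in the real analysis you would pass listing_seattle instead of the toy frame.

```python
import pandas as pd


def missing_percentage(df: pd.DataFrame) -> pd.Series:
    """Percentage of missing values per column, highest first."""
    # isnull() gives a boolean frame; the column mean is the missing fraction
    return (df.isnull().mean() * 100).sort_values(ascending=False)


def drop_sparse_columns(df: pd.DataFrame, threshold: float = 95.0) -> pd.DataFrame:
    """Drop columns whose missing percentage is above `threshold` (assumed cutoff)."""
    pct = missing_percentage(df)
    sparse_cols = pct[pct > threshold].index
    return df.drop(columns=sparse_cols)


# Toy stand-in for listing_seattle (hypothetical values, not the real data)
toy = pd.DataFrame({
    "id": [1, 2, 3, 4],
    "price": ["$85.00", None, "$150.00", "$99.00"],
    "license": [None, None, None, None],
})

missing_pct = missing_percentage(toy)
cleaned = drop_sparse_columns(toy)

print(missing_pct)
print(cleaned.columns.tolist())
```

If matplotlib is installed, calling missing_pct.plot(kind='bar') is one way to draw a bar chart like the one described above.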

  • Are the ids unique for each row?
len(listing_seattle['id'].unique()) == len(listing_seattle)

  • Description of all numeric features
listing_seattle.describe()


Analysis, Price Modeling and Prediction: AirBnB Data for Seattle.