Introduction

The Internet has revolutionized the way we buy products. In online retail marketplaces, customers cannot experience a product before buying it, and new products emerge every day. Customers therefore rely largely on product reviews to make purchase decisions. However, searching through and comparing text reviews can be frustrating. A better numerical rating system based on the reviews would make purchase decisions easier.

During their decision-making process, consumers want to find useful reviews as quickly as possible using the rating system. Models that can predict the user rating from the review text are therefore critically important. Getting an overall sense of a textual review can in turn improve the consumer experience. It can also help businesses increase sales and improve their products by understanding customers' needs.

This project uses the Amazon review dataset for electronics products. It considers the ratings and reviews that users gave to different products, including reviews describing users' experience with the product(s).


Problem Statement

The goal is to develop models that predict the user rating and the usefulness of a review, and that recommend the most similar items to users based on collaborative filtering.
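To make the recommendation part of the goal concrete, here is a minimal sketch of item-based collaborative filtering. It is an illustration under assumptions, not the model built in this project: the cosine similarity measure, the function names, and the toy ratings are all hypothetical.

```python
import math
from collections import defaultdict


def item_vectors(ratings):
    """Group (user, item, rating) triples into {item: {user: rating}}."""
    vectors = defaultdict(dict)
    for user, item, rating in ratings:
        vectors[item][user] = rating
    return vectors


def cosine(a, b):
    """Cosine similarity between two sparse rating vectors (dicts)."""
    shared = set(a) & set(b)
    dot = sum(a[u] * b[u] for u in shared)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def most_similar(item, vectors, top_n=2):
    """Rank all other items by similarity of their rating vectors to `item`."""
    scores = [
        (other, cosine(vectors[item], vec))
        for other, vec in vectors.items()
        if other != item
    ]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)[:top_n]


# Hypothetical toy ratings (user, item, rating); not taken from the dataset.
ratings = [
    ("u1", "camera", 5), ("u1", "tripod", 5),
    ("u2", "camera", 4), ("u2", "tripod", 4), ("u2", "router", 1),
    ("u3", "router", 5),
]
vectors = item_vectors(ratings)
neighbours = most_similar("camera", vectors)
print(neighbours)
```

In this toy data the tripod is rated by the same users in the same way as the camera, so it ranks as the closest item, while the router shares only one low rating and ranks far behind.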

Data Collection

The electronics dataset, consisting of reviews and product information from Amazon, was collected. It includes reviews (ratings, text, helpfulness votes) and product metadata (descriptions, category information, price, brand, and image features).

Product Complete Reviews data

This dataset includes electronics product reviews: ratings, text, and helpfulness votes. It was obtained from http://jmcauley.ucsd.edu/data/amazon/. The original data was in JSON format; it was imported, decoded, and converted to CSV format. A sample of the dataset is shown below:


Sample product reviews dataset

Each row corresponds to a customer review and includes variables such as the rating, review text, and helpfulness votes.
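The JSON-to-CSV conversion described above could look roughly like the following sketch. It assumes the source file holds one record per line, and that lines which are not strict JSON are Python dict literals; the helper names, the sample records, and the column names are illustrative assumptions rather than the exact code used for this project.

```python
import ast
import csv
import io
import json


def parse_record(line):
    """Decode one line into a dict."""
    try:
        return json.loads(line)
    except json.JSONDecodeError:
        # Assumption: lines that are not strict JSON are Python dict literals.
        return ast.literal_eval(line)


def json_lines_to_csv(lines, out_file, fieldnames):
    """Write one CSV row per decoded record, keeping only `fieldnames`."""
    writer = csv.DictWriter(out_file, fieldnames=fieldnames, extrasaction="ignore")
    writer.writeheader()
    count = 0
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        writer.writerow(parse_record(line))
        count += 1
    return count


# Hypothetical sample records; the field names are assumed for illustration.
sample_lines = [
    '{"asin": "B000001", "overall": 5.0, "reviewText": "Works great", "helpful": [2, 3]}',
    "{'asin': 'B000002', 'overall': 2.0, 'reviewText': 'Stopped working', 'helpful': [0, 1]}",
]
columns = ["asin", "overall", "reviewText", "helpful"]

buffer = io.StringIO()
rows_written = json_lines_to_csv(sample_lines, buffer, columns)
csv_text = buffer.getvalue()
print(csv_text)
```

For a real file, an open text handle can be passed as `lines` and a file opened with `newline=""` as `out_file`. Fields missing from a record are written as empty cells, and extra fields are ignored.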

Product Metadata

This dataset includes electronics product metadata: descriptions, category information, price, brand, and image features. It was obtained from http://jmcauley.ucsd.edu/data/amazon/. The JSON was imported, decoded, and converted to CSV format. A sample of the product metadata dataset is shown below:


Sample product metadata dataset

Each row corresponds to a product and includes variables such as the description, category information, price, brand, and image features.

