In this tutorial, I will explain how to calculate the sentiment of a book through a Supervised Learning technique, based on Support Vector Machines (SVM).

This tutorial calculates the sentiment analysis of the Saint Augustine Confessions, which can be downloaded from the Gutenberg Project Page. The masterpiece is split in 13 books (chapters). We have stored each book into a different file, named number.text (e.g. 1.txt and 2.txt). Each line of every file contains just one sentence.

Getting Started

Supervised Learning needs some annotated text to train the model. Thus, the first step consists in reading the annotations file and store it into a dataframe. The annotation file contains for each sentence, the associated score, which is a positive, negative or null number.

import pandas as pd

df = pd.read_csv('sources/annotations.csv')
df

Image for post

#data-science #support-vector-machine #supervised-learning #sentiment-analysis #machine-learning

Sentiment Analysis of a book through Supervised Learning
1.20 GEEK