We are two Singaporean undergraduate students passionate about Data Science.

With the Singapore General Elections approaching, we took the opportunity to apply our Data Science skills to an analysis of this area, current as of 28th June 2020. **This is in no way meant to influence the elections, nor does it imply anything about anyone or any party involved.** This is just some good, clean, data science fun.


Introduction

The problem statements we attempted to investigate were:

  1. What was the public sentiment distribution across the various parties?
  2. Was there a statistically significant difference in sentiment expressed in newspapers compared to the public sentiment?

Stacked Bar Graph showing differing sentiments towards each party on Twitter

We used a sentiment analysis of Twitter to quantify public sentiment, and a separate sentiment analysis of The Straits Times to quantify newspaper sentiment. This also let us compare sentiment between parties on both platforms.

Our work showed distinct differences in positive and negative sentiment towards each of the parties, though none of these differences were statistically significant at an alpha value of 0.05.

At the alpha value of 0.05, however, we found a statistically significant difference in Subjectivity between the two platforms overall, and in Subjectivity towards the PAP.

Methodology

Using Python, we:

  1. Scraped data from Twitter and The Straits Times.
  2. Conducted a sentiment analysis on each platform using the TextBlob library.
  3. Visualized the findings through matplotlib and word clouds.
  4. Conducted a simple hypothesis test to evaluate differences in subjectivity and polarity.
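
To make the last step concrete, here is a minimal sketch of one way such a hypothesis test could look. It is an illustration under assumptions, not necessarily the test we ran: it uses a two-sided permutation test on the difference of means, written with the standard library only. The function name `permutation_test` and both lists of scores are made up for the example. In practice, each score would come from something like `TextBlob(text).sentiment.subjectivity`, which ranges from 0 to 1.

```python
import random
from statistics import mean


def permutation_test(a, b, n_resamples=10_000, seed=0):
    """Two-sided permutation test on the difference of means.

    Hypothetical helper for illustration. Returns an approximate p-value.
    """
    rng = random.Random(seed)
    observed = abs(mean(a) - mean(b))
    pooled = list(a) + list(b)
    extreme = 0
    for _ in range(n_resamples):
        # Under the null hypothesis the platform labels are exchangeable,
        # so shuffle the pooled scores and split them again.
        rng.shuffle(pooled)
        diff = abs(mean(pooled[:len(a)]) - mean(pooled[len(a):]))
        if diff >= observed:
            extreme += 1
    # The add-one correction keeps the estimate strictly above zero.
    return (extreme + 1) / (n_resamples + 1)


# Made-up subjectivity scores, NOT our real data.
twitter_subjectivity = [0.62, 0.71, 0.55, 0.80, 0.67, 0.74, 0.59, 0.69]
news_subjectivity = [0.31, 0.42, 0.38, 0.29, 0.45, 0.36, 0.40, 0.33]

ALPHA = 0.05
p_value = permutation_test(twitter_subjectivity, news_subjectivity)
print(f"p = {p_value:.4f}, significant at {ALPHA}: {p_value < ALPHA}")
```

A permutation test makes no assumption about the shape of the score distributions, which is convenient for bounded values like polarity and subjectivity. A t-test would be a reasonable alternative for larger samples.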

Assumptions & Limitations

