Using VADER and BERT, I analyse the sentiments of Tweets pertaining to Singapore’s ruling party in the run-up to the 2020 General Elections.
A little over a month ago, on 10 July, Singapore held a general election to choose the members of its 14th Parliament. What do you do when you’re a really excited first-time voter with a lot of spare time on your hands? You run a quick study to analyse the sentiments of Tweets and see whether they reflect the actual election results. Okay, I might be the only one who thinks this way (nerd alert), but anyway, let’s dive straight in!
Using the Tweepy API, and with the help of the code used by Griffin in this article, I downloaded tweets using ‘PAP #GE2020’ as the search term.
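For readers who want to try something similar, here is a minimal sketch of the download step. It is not the code from the article mentioned above: the helper names `build_query` and `fetch_tweets` are made up for illustration, and it assumes you already have an authenticated `tweepy.API` object and access to Twitter's standard search endpoint.

```python
def build_query(terms, include_retweets=True):
    """Join search terms into one query string (hypothetical helper)."""
    query = " ".join(terms)
    if not include_retweets:
        # Standard search operator that drops retweets from the results.
        query += " -filter:retweets"
    return query


def fetch_tweets(api, query, limit=50000):
    """Collect up to `limit` matching tweets as dictionaries.

    `api` is assumed to be an authenticated tweepy.API instance.
    """
    import tweepy  # third-party; imported here so build_query works without it

    # Tweepy 4.x names the search method `search_tweets`; Tweepy 3.x called it `search`.
    search = getattr(api, "search_tweets", None) or api.search
    cursor = tweepy.Cursor(search, q=query, tweet_mode="extended")
    return [status._json for status in cursor.items(limit)]


query = build_query(["PAP", "#GE2020"])
print(query)
```

Leaving `include_retweets` at its default keeps retweets in the results, which matters for the dataset described further down.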
PAP stands for the People’s Action Party, which is Singapore’s ruling party. The hashtag GE2020 is used by most people who tweeted about the 2020 Singapore general elections.
I deliberated quite a bit over the appropriate search term. Simply using #GE2020 wouldn’t be quite right, as the tweets collected would also include those reflecting public sentiment towards opposition parties. The search term I used does exclude tweets that were about the ruling party but did not mention PAP or use the hashtag GE2020. Still, I felt it was the closest I could get to isolating the tweets that reflect sentiment towards the ruling party.
I chose to include retweets as well, as I figured that Twitter users tend to retweet tweets that resonate with them. My dataset included tweets and retweets posted between 6 July and 8 July, when online political discourse was likely to be at its most active because polling day (10 July) was coming up. You might be wondering why 9 July was excluded. That was Cooling-off Day, when campaigning is prohibited so that voters can take a step back and reflect on the issues before heading to the polls the following day.
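The date window can be sketched as a simple filter. This assumes each tweet carries a `created_at` string in the format used by Twitter's v1.1 API, and that the 6 to 8 July window is meant in Singapore time (UTC+8); both are assumptions, and `in_window` is a hypothetical helper name.

```python
from datetime import date, datetime, timedelta, timezone

SGT = timezone(timedelta(hours=8))  # assumption: dates are in Singapore time
TWITTER_FORMAT = "%a %b %d %H:%M:%S %z %Y"  # e.g. "Mon Jul 06 04:00:00 +0000 2020"


def in_window(created_at, start=date(2020, 7, 6), end=date(2020, 7, 8)):
    """Return True if the tweet was posted between start and end (inclusive)."""
    posted = datetime.strptime(created_at, TWITTER_FORMAT)
    local_day = posted.astimezone(SGT).date()
    return start <= local_day <= end


print(in_window("Mon Jul 06 04:00:00 +0000 2020"))  # midday on 6 July in Singapore
print(in_window("Wed Jul 08 17:00:00 +0000 2020"))  # already 9 July in Singapore
```

Converting to local time before comparing matters at the edges: a tweet stamped 17:00 UTC on 8 July was actually posted at 01:00 on Cooling-off Day in Singapore.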
I originally hoped to collect 50,000 tweets and retweets but ended up with a lot of duplicated data, probably because not many tweets matched my search term over a short span of 3 days (I also forgot that Singapore’s citizen population of around 3.5 million isn’t that large to begin with). My final dataset consisted of 2,504 tweets and retweets, of which 406 were unique tweets.
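One way to count unique tweets is to strip the `RT @user:` prefix that marks a retweet and then keep each distinct text once. The sketch below makes that assumption about what "unique" means, uses a hypothetical helper name, and runs on invented sample texts rather than the real dataset.

```python
import re

# Retweets in the v1.1 API start with "RT @username: " followed by the original text.
RT_PREFIX = re.compile(r"^RT @\w+:\s*")


def unique_texts(texts):
    """Return distinct tweet texts in first-seen order, treating retweets as copies."""
    seen = set()
    unique = []
    for text in texts:
        key = RT_PREFIX.sub("", text).strip()
        if key not in seen:
            seen.add(key)
            unique.append(key)
    return unique


# Invented sample texts, for illustration only.
sample = [
    "PAP rally tonight #GE2020",
    "RT @someone: PAP rally tonight #GE2020",
    "Long queue at the polling station #GE2020 PAP",
]
print(len(sample), len(unique_texts(sample)))
```

Deduplicating on tweet IDs would be stricter, but it would not fold a retweet into the tweet it copies, since each retweet has its own ID.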
Now, you may be curious about which words are commonly used in these tweets. Let’s create word clouds to find out!
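A word cloud is essentially a picture of word frequencies, so the counting step can be sketched with the standard library alone. The stopword list, the decision to drop the search terms themselves, and the helper name `word_frequencies` are all assumptions made for this example, and the sample texts are invented.

```python
import re
from collections import Counter

# Small illustrative stopword list; the search terms are dropped too, since
# they would otherwise dominate the cloud.
STOPWORDS = {"the", "a", "an", "and", "or", "to", "of", "in", "is", "for", "on",
             "rt", "pap", "ge2020"}


def word_frequencies(texts, stopwords=STOPWORDS):
    """Count words across tweets, ignoring links, @mentions and stopwords."""
    counts = Counter()
    for text in texts:
        cleaned = re.sub(r"https?://\S+|@\w+", " ", text.lower())
        tokens = re.findall(r"[a-z0-9']+", cleaned)
        counts.update(t for t in tokens if t not in stopwords and len(t) > 1)
    return counts


# Invented sample texts, for illustration only.
sample_texts = [
    "Vote wisely #GE2020 PAP",
    "RT @user: vote early, vote wisely https://t.co/abc",
]
freqs = word_frequencies(sample_texts)
print(freqs.most_common(3))
```

A dictionary of counts like this can then be passed to a plotting library; the third-party `wordcloud` package, for instance, accepts one through `WordCloud().generate_from_frequencies(freqs)`.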