In this tutorial, we’ll explore what sentiment analysis is and why it’s useful, then build a simple Node.js program that analyzes the sentiment of Reddit comments.
Sentiment analysis is the process of extracting key phrases and words from text to understand the author’s attitude and emotions. So, why is it useful? Companies can use it to make more informed marketing decisions. For example, they can analyze product reviews, feedback, and social media to track their reputation. Additionally, social networks can use sentiment analysis to weed out poor-quality content.
There are two main approaches to sentiment detection: knowledge-based and statistical.
Knowledge-based approaches usually compare words in text to a defined list of negative and positive words. Finn Årup Nielsen from the Technical University of Denmark published AFINN, a list of positive and negative words, each with a magnitude score between -5 and 5. For example, “gloom” has a score of -1, while “awful” has a score of -3. The scores of all known words are added up to determine the overall sentiment of the text.
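To make the idea concrete, here is a minimal sketch of lexicon-based scoring. It is not AFINN itself: the lexicon object and the lexiconScore function are made up for illustration, and only the scores for “gloom” and “awful” come from the text above.

```javascript
// A tiny hand-made lexicon for illustration only. The scores for "gloom" and
// "awful" come from the text above; the other entries are hypothetical.
const lexicon = {
  gloom: -1,
  awful: -3,
  good: 3,  // hypothetical score
  happy: 3, // hypothetical score
};

// Sum the scores of every known word; unknown words contribute 0.
function lexiconScore(text) {
  const words = text.toLowerCase().match(/[a-z']+/g) || [];
  return words.reduce(function (total, word) {
    const known = Object.prototype.hasOwnProperty.call(lexicon, word);
    return total + (known ? lexicon[word] : 0);
  }, 0);
}

console.log(lexiconScore("An awful day full of gloom")); // -4
console.log(lexiconScore("A good day"));                 // 3
```

A real word list is much larger, but the scoring step stays the same: look up each word and add up the scores.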
Statistical approaches make use of machine learning by analyzing known sentiments, and determining the unknown based on the knowns. For example, Amazon could create a machine learning model that analyzes the text and the 1 through 5 star rating of each product review. Then, they would be able to make an assumption about the star rating of a new review that doesn’t have a star rating yet.
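As a rough illustration of learning from known ratings, the sketch below “trains” on a few made-up reviews by recording the average star rating of the reviews each word appears in, then averages those values for a new review. This is a toy model, not how a production system would do it; the data and the train and predictStars functions are hypothetical.

```javascript
// Hypothetical labeled reviews: text plus a 1-5 star rating.
const labeledReviews = [
  { text: "great product works great", stars: 5 },
  { text: "great value", stars: 4 },
  { text: "terrible quality", stars: 1 },
  { text: "terrible and broken", stars: 2 },
];

// "Training": record the average star rating of the reviews each word appears in.
function train(reviews) {
  const totals = new Map();
  reviews.forEach(function (review) {
    const uniqueWords = new Set(review.text.toLowerCase().split(/\s+/));
    uniqueWords.forEach(function (word) {
      const entry = totals.get(word) || { sum: 0, count: 0 };
      entry.sum += review.stars;
      entry.count += 1;
      totals.set(word, entry);
    });
  });
  const model = new Map();
  totals.forEach(function (entry, word) {
    model.set(word, entry.sum / entry.count);
  });
  return model;
}

// Prediction: average the learned ratings of the known words in a new review.
// Falls back to a neutral 3 stars when no word has been seen before.
function predictStars(model, text) {
  const known = text.toLowerCase().split(/\s+/).filter(function (word) {
    return model.has(word);
  });
  if (known.length === 0) return 3;
  const sum = known.reduce(function (total, word) {
    return total + model.get(word);
  }, 0);
  return sum / known.length;
}

const model = train(labeledReviews);
console.log(predictStars(model, "great quality")); // 2.75
```

Here “great” learned an average of 4.5 stars and “quality” an average of 1, so the unseen review “great quality” lands in between.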
With any approach, a score is typically given to each body of text that is analyzed. A negative score implies the text has a mostly negative attitude, and a positive score implies the text has a mostly positive attitude.
There can be some challenges in analyzing text. Because of this, sentiment analysis will never be completely accurate. Here’s a brief list of potential scenarios that can be tricky to analyze:
We’ll be making a Node.js app that calculates the sentiment of comments from a Reddit post asking how people’s days are going, and then displays the results in a webpage.
We’re going to be creating a Node.js app, so make sure you have Node.js installed. Then:
1. Create a folder for the project and navigate to it in your terminal (cd ~/Desktop/folder, for example)
2. Run npm init to go through the creation wizard
3. Run npm install express ml-sentiment
Now that our dependencies are installed, let’s create and open a server.js file in the folder you created.
var express = require("express");
var app = express();

var ml = require("ml-sentiment")();
var redditComments = require("./comments.json");

const listener = app.listen(3000, function() {
  console.log("Your app is listening on port " + listener.address().port);
});
What does this file do right now? The first block sets up Express, a web server library. The second block tells the program to import our sentiment analysis library, and the JSON data file of the Reddit comments. The last block starts our server and tells us which port it is listening on. There is nothing for the server to show though, because we haven’t defined any “routes” for Express to use yet.
The Node library we’re using for sentiment analysis, ml-sentiment, has documentation that tells us how we can use it:
var ml = require("ml-sentiment")();
ml.classify("Rainy day but still in a good mood");
// returns 2 … (overall positive sentiment)
This library uses AFINN-111, which has the ratings of 2477 words and phrases. The library simply looks at the words in the parameter of the .classify function, and compares each to AFINN-111. If a word like “not” or “don’t” precedes the word, it uses the absolute value of the score. For example, “anxious” has a score of -2, while “not anxious” has a score of 2.
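The negation rule described above can be sketched in a few lines. This is not the library’s actual source code: the scores map, the negators set, and the classifyWithNegation function are hypothetical, and only the “anxious” score is taken from the text.

```javascript
// Hypothetical mini-lexicon; "anxious" (-2) is the example from the text.
const scores = new Map([
  ["anxious", -2],
  ["bad", -3], // hypothetical score
]);

// Negation words mentioned in the text; a real list would likely be longer.
const negators = new Set(["not", "don't"]);

function classifyWithNegation(text) {
  const words = text.toLowerCase().match(/[a-z']+/g) || [];
  let total = 0;
  words.forEach(function (word, index) {
    if (!scores.has(word)) return; // unknown words contribute nothing
    const score = scores.get(word);
    const negated = index > 0 && negators.has(words[index - 1]);
    // As described above: a negated word contributes the absolute value of its score.
    total += negated ? Math.abs(score) : score;
  });
  return total;
}

console.log(classifyWithNegation("I am anxious"));     // -2
console.log(classifyWithNegation("I am not anxious")); // 2
```

Note that under the rule as described, a negated positive word would stay positive, which is one reason this kind of negation handling is only basic.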
This is by no means a comprehensive library, but it’s quick to implement, runs fast and works reliably on simple examples.
Let’s create a function that loops through all of the Reddit comments, uses the ml.classify function to get a sentiment score, and saves that value into the redditComments array.
redditComments.forEach(function(comment) {
  comment.sentiment = ml.classify(comment.body);
  if (comment.sentiment >= 5) {
    comment.emoji = "😃";
  } else if (comment.sentiment > 0) {
    comment.emoji = "🙂";
  } else if (comment.sentiment === 0) {
    comment.emoji = "😐";
  } else {
    comment.emoji = "😕";
  }
});
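If you want to check the thresholds in isolation, one option is to pull the emoji mapping into its own function. The emojiForScore name is hypothetical; the cutoffs are the same as in the loop above.

```javascript
// Same thresholds as the loop above, pulled into a helper (the name is hypothetical).
function emojiForScore(score) {
  if (score >= 5) return "😃"; // strongly positive
  if (score > 0) return "🙂";  // mildly positive
  if (score === 0) return "😐"; // neutral
  return "😕";                  // negative
}

console.log([7, 3, 0, -2].map(emojiForScore).join(" ")); // 😃 🙂 😐 😕
```

With a helper like this, the loop body could shrink to two assignments: one for comment.sentiment and one for comment.emoji.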
Now, our redditComments variable is an array of objects with the link, body, author, emoji, and sentiment keys. For example, here’s how one object in the array looks:
{
  "link": "https://reddit.com/r/AskReddit/comments/6szu5h/reddit_how_was_your_day/dlgtei6/",
  "body": "It was so nice day. it was my memorable day. ",
  "author": "Gemma_Youl",
  "sentiment": 3,
  "emoji": "🙂"
} …
Next, we’ll define two routes in Express that send our redditComments data to a webpage. Routes have to be defined after app is defined, but before app.listen is called.
app.get("/", function(req, res) {
  res.sendFile(__dirname + "/index.html");
});

app.get("/data", function(req, res) {
  res.json(redditComments);
});
The first route says that when the / path receives a GET request, Express should send the index.html file. The second route says that when the /data path receives a GET request, Express should send the redditComments variable as a JSON response.
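To see roughly what these two routes amount to, here is a sketch of the same behavior using only Node’s built-in http module. It is an illustration, not how Express is implemented: the resolveRoute function and the stand-in comments array are hypothetical, and the server is deliberately left unstarted.

```javascript
const http = require("http");
const fs = require("fs");

// Stand-in data; in the tutorial this would be the redditComments array.
const comments = [
  { author: "example_user", body: "Nice day", sentiment: 3, emoji: "🙂" },
];

// Decide what to send for a given method and URL, mirroring the two routes.
function resolveRoute(method, url) {
  if (method === "GET" && url === "/") {
    return { status: 200, type: "text/html", file: "index.html" };
  }
  if (method === "GET" && url === "/data") {
    return { status: 200, type: "application/json", body: JSON.stringify(comments) };
  }
  return { status: 404, type: "text/plain", body: "Not found" };
}

const server = http.createServer(function (req, res) {
  const route = resolveRoute(req.method, req.url);
  if (!route.file) {
    res.writeHead(route.status, { "Content-Type": route.type });
    res.end(route.body);
    return;
  }
  // Read the HTML file from the current working directory and send it.
  fs.readFile(route.file, function (err, html) {
    if (err) {
      res.writeHead(500, { "Content-Type": "text/plain" });
      res.end("Could not read file");
      return;
    }
    res.writeHead(route.status, { "Content-Type": route.type });
    res.end(html);
  });
});

// To try it, uncomment the next line and open localhost:3000 in a browser.
// server.listen(3000);
```

Express handles this matching, the content types, and the file reading for us, which is why the route definitions in server.js are so short.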
Here’s how the server.js file looks now:
var express = require("express");
var app = express();

var ml = require("ml-sentiment")();
var redditComments = require("./comments.json");

// Score every comment and attach a matching emoji
redditComments.forEach(function(comment) {
  comment.sentiment = ml.classify(comment.body);
  if (comment.sentiment >= 5) {
    comment.emoji = "😃";
  } else if (comment.sentiment > 0) {
    comment.emoji = "🙂";
  } else if (comment.sentiment === 0) {
    comment.emoji = "😐";
  } else {
    comment.emoji = "😕";
  }
});

app.get("/", function(req, res) {
  res.sendFile(__dirname + "/index.html");
});

app.get("/data", function(req, res) {
  res.json(redditComments);
});

// Use the PORT environment variable if it is set, otherwise port 3000
const listener = app.listen(process.env.PORT || 3000, function() {
  console.log("Your app is listening on port " + listener.address().port);
});
It doesn’t work just yet, because we haven’t created the index.html file. Make a new file called index.html in the same folder and add the following code to it:
<head>
  <link href="https://cdnjs.cloudflare.com/ajax/libs/bulma/0.7.4/css/bulma.min.css" rel="stylesheet" />
  <style>
    #main {
      margin: 2rem;
    }
    .big {
      font-size: 1.2rem;
    }
  </style>
</head>
<body>
  <section class="hero is-success">
    <div class="hero-body">
      <div class="container">
        <h1 class="title">How was your day?</h1>
        <h2 class="subtitle">Sentiment analysis demo</h2>
      </div>
    </div>
  </section>
  <div id="main">
    <table class="table is-fullwidth">
      <thead>
        <tr>
          <th>Feeling</th>
          <th>Score</th>
          <th>Author</th>
          <th>Comment</th>
        </tr>
      </thead>
      <tbody id="sentimentTable">
      </tbody>
    </table>
  </div>
  <script>
    var request = new XMLHttpRequest();
    request.open('GET', '/data', true);

    request.onload = function() {
      if (request.status >= 200 && request.status < 400) {
        var table = document.getElementById("sentimentTable");
        var data = JSON.parse(request.responseText);
        data.forEach(function(comment) {
          var newRow = table.insertRow(table.rows.length);
          newRow.insertCell(0).innerHTML = comment.emoji;
          newRow.insertCell(1).innerHTML = comment.sentiment;
          var rowLink = document.createElement('a');
          rowLink.innerHTML = comment.author;
          rowLink.href = comment.link;
          newRow.insertCell(2).appendChild(rowLink);
          newRow.insertCell(3).innerHTML = comment.body;
        });
      } else {
        alert("Could not retrieve data");
      }
    };

    request.onerror = function() {
      alert("Could not retrieve data");
    };

    request.send();
  </script>
</body>
How does this work? In the HTML page, a script is defined that sends a web request to /data, and creates a new row in the table for each comment we analyzed.
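The same request can also be written with the browser’s fetch function instead of XMLHttpRequest. The sketch below is an alternative, not the tutorial’s code: the toRowCells and renderComments names are hypothetical, and for brevity it shows the author as plain text rather than a link.

```javascript
// Turn one comment object into the four cell values, in column order.
function toRowCells(comment) {
  return [comment.emoji, String(comment.sentiment), comment.author, comment.body];
}

// Browser-only part: request /data and append one table row per comment.
function renderComments() {
  fetch("/data")
    .then(function (response) {
      if (!response.ok) throw new Error("Could not retrieve data");
      return response.json();
    })
    .then(function (comments) {
      var table = document.getElementById("sentimentTable");
      comments.forEach(function (comment) {
        var row = table.insertRow(-1); // -1 appends the row at the end
        toRowCells(comment).forEach(function (value) {
          row.insertCell(-1).textContent = value;
        });
      });
    })
    .catch(function (error) {
      alert(error.message);
    });
}

// Only run in a browser, where `document` exists.
if (typeof document !== "undefined") {
  renderComments();
}
```

This version sets textContent, so comment text is inserted as plain text rather than parsed as HTML.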
Everything is good to go! To run your program, go back to the terminal and run node server.js. Make sure you are still in your project’s directory. Now, go to your browser and open localhost:3000. You should see our new webpage with the sentiment of each Reddit comment!
Notice how some comments have negations, like “not bad”, and the sentiment has a positive value. This is because the sentiment library we used has basic support for negation.
Originally published on https://enlight.nyc