Live Summarization on Twitter Data

Why a Summary?

All that glitters is not gold; sometimes it is concise too! Be it a book review, a movie review, or the stats presented in an office meeting, short and to-the-point information holds the audience's interest.

A summary of the trending topics on the micro-blogging site Twitter helps people learn what is going on in about a minute of reading. That is how the idea for this project was conceived.

What is Automatic Text Summarization?

Automatic text summarization uses Natural Language Processing (NLP) techniques to generate a summary of the given data. The techniques fall into two categories: extractive summarization and abstractive summarization.

Extractive summarization approaches rank the sentences in a text by how similar each one is to the text as a whole, and return the top-ranked sentences as the summary. They are fast in terms of retrieval time, but precision may drop in some cases of multi-domain summary generation.
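The ranking idea described above can be sketched in a few lines of Python. This is a minimal illustration, not the implementation used in this project: it assumes a simple bag-of-words representation and cosine similarity as the similarity measure, and the names `tokenize`, `cosine` and `extractive_summary` are hypothetical helpers introduced only for this example.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase a string and split it into word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())


def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def extractive_summary(sentences, top_k=2):
    """Return the top_k sentences most similar to the whole text, in original order."""
    # Assumption: the "text" is represented by the word counts of all sentences together.
    doc_vector = Counter(tokenize(" ".join(sentences)))
    scores = [cosine(Counter(tokenize(s)), doc_vector) for s in sentences]
    # Rank sentence indices by score, highest first.
    ranked = sorted(range(len(sentences)), key=lambda i: scores[i], reverse=True)
    # Keep the best top_k, then restore their original order for readability.
    keep = sorted(ranked[:top_k])
    return [sentences[i] for i in keep]


# Made-up example tweets, for illustration only.
tweets = [
    "Lockdown extended for two more weeks in the city.",
    "The city lockdown is extended by two weeks, officials say.",
    "I baked banana bread today.",
]
summary = extractive_summary(tweets, top_k=1)
print(summary)
```

Because every sentence is scored independently against the same document vector, the cost grows linearly with the number of sentences, which is one reason extractive methods of this kind are fast.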

Abstractive summarization approaches differ in that they try to capture the semantic context and underlying meaning of the words and sentences. They rely heavily on NLP and supervised learning techniques, so they are slower than extractive summarization but tend to perform well on precision.

How did we do it?

Twitter data is single-domain, multi-document text data, i.e. several tweets collectively represent the opinion on a single trending topic. We collected tweets on 20 different topics, such as “Coronavirus”, “Lockdown”, “Olympics” and “Cryptocurrency”, from the Twitter streaming API using Tweepy.

To generate the ground truth for the tweet summaries, we picked the summary sentences out of the text after reading the tweets several times in steps, and then removed the redundant portions from the resulting summaries.

A commonly used algorithm for extractive text summarization is TextRank, which is based on Google's PageRank algorithm for ranking pages on the web. In TextRank, sentences are first weighted using a similarity metric; cosine similarity between sentences is generally used to assign the initial weights. A graph is then constructed over the set of sentences, in which each sentence is a node and the edges link sentences according to the similarity between them.
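As a rough sketch of the graph-based ranking just described, the following Python code builds a sentence similarity graph and runs a PageRank-style iteration over it. It is an illustration under stated assumptions rather than the code used in this project: sentences are represented as bag-of-words counts, the damping factor of 0.85 and the fixed 50 iterations are conventional choices rather than values from the article, the scores use a normalized variant of the update rule, and the helper names (`bag_of_words`, `cosine`, `textrank`) are hypothetical.

```python
import math
import re
from collections import Counter


def bag_of_words(sentence):
    """Word counts for a lowercased sentence."""
    return Counter(re.findall(r"[a-z0-9']+", sentence.lower()))


def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def textrank(sentences, damping=0.85, iterations=50):
    """Score sentences with a PageRank-style iteration over a similarity graph."""
    n = len(sentences)
    if n == 0:
        return []
    vectors = [bag_of_words(s) for s in sentences]
    # Edge weight between two different sentences is their cosine similarity.
    weights = [
        [cosine(vectors[i], vectors[j]) if i != j else 0.0 for j in range(n)]
        for i in range(n)
    ]
    # Total edge weight leaving each node, used to normalize its contributions.
    out_sum = [sum(row) for row in weights]
    scores = [1.0 / n] * n
    for _ in range(iterations):
        new_scores = []
        for i in range(n):
            # Each neighbor j passes on a share of its score proportional to
            # the weight of edge (j, i); isolated nodes contribute nothing.
            rank = sum(
                weights[j][i] / out_sum[j] * scores[j]
                for j in range(n)
                if out_sum[j] > 0
            )
            new_scores.append((1 - damping) / n + damping * rank)
        scores = new_scores
    return scores


# Made-up example tweets, for illustration only.
tweets = [
    "Lockdown extended for two more weeks in the city.",
    "The city lockdown is extended by two weeks, officials say.",
    "I baked banana bread today.",
]
scores = textrank(tweets)
best = max(range(len(tweets)), key=lambda i: scores[i])
print(tweets[best])
```

A sentence that shares no words with the others has no edges, so it receives only the baseline score and ranks last. Sentences that many others resemble accumulate score from their neighbors and rise to the top of the summary.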
