DiscoBERT: A BERT that shortens your reading time

A Discourse-Aware BERT that improves summary generation.

With the world being so connected now, we're constantly bombarded with information from a variety of sources, and it can be overwhelming. Social media has also transformed how information is presented to us, with apps like Instagram and Pinterest favoring visual content over text. All of this makes reading large swathes of text, when we have to, that much less appealing.

But what if those large swathes of text could be condensed into a summary of just the key points? Once again, machine learning comes to the rescue.

Summarization has been a task of interest to the Natural Language Processing community for many years, and there are two broad approaches: abstractive and extractive. The abstractive method focuses on understanding the entire text and generating a summary one word at a time. Extractive summarization, on the other hand, selects important pieces of the text (typically sentences) and stitches them together to form a summary.
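
To make the distinction concrete, here is a minimal sketch of both approaches in Python. The abstractive half uses the Hugging Face `transformers` summarization pipeline (the model name `facebook/bart-large-cnn` is just one common choice, not something from this article); the extractive half is a toy TF-IDF sentence scorer, standing in for the learned scorers real extractive systems use.

```python
# Both summarization styles on the same toy article.
# Assumptions: `transformers` and `scikit-learn` are installed; the model
# below is an illustrative choice and is unrelated to DiscoBERT.
from transformers import pipeline
from sklearn.feature_extraction.text import TfidfVectorizer

article = (
    "Lionel Messi scored twice as Barcelona beat Real Madrid. "
    "The result moves Barcelona to the top of the table. "
    "Messi has now scored twenty goals this season."
)

# Abstractive: a pretrained seq2seq model writes the summary one word at a time.
abstractive = pipeline("summarization", model="facebook/bart-large-cnn")
print(abstractive(article, max_length=25, min_length=5)[0]["summary_text"])

# Extractive: score each sentence, keep the top ones, stitch them together.
sentences = [s for s in article.split(". ") if s]
tfidf = TfidfVectorizer().fit_transform(sentences)
scores = tfidf.sum(axis=1).A1            # total TF-IDF weight per sentence
top_two = sorted(sorted(range(len(sentences)), key=lambda i: -scores[i])[:2])
print(". ".join(sentences[i] for i in top_two))
```

Note that the extractive output is always a verbatim subset of the input, while the abstractive output is newly generated text, which is exactly why the two approaches trade off fluency against faithfulness.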

Abstractive summaries are often more concise than extractive ones, but they can be grammatically incorrect, since generating words one probability at a time is not a natural way to write. Both approaches, however, suffer from a common problem: they can't keep track of long-range context in a document. For example, if a news article mentioned Leo Messi in the first paragraph and again in the fourth, both approaches would struggle to link the contexts of those mentions. For abstractive models (which are mostly Seq2Seq models), this is because Seq2Seq architectures weight recent words more heavily than earlier ones, while extractive models are generally designed for a few sentences or a paragraph rather than a whole document.

The researchers at Microsoft Dynamics 365 AI Research aim to improve these summarization techniques with their own model, Discourse-Aware BERT for Text Extraction (DiscoBERT). The name gives away that it's an extractive model, but with a twist: unlike other extractive models, DiscoBERT extracts Elementary Discourse Units (EDUs, which are sub-sentence spans) rather than whole sentences for its summary. This keeps the generated summary concise, because a whole sentence often carries irrelevant information that the relevant EDUs leave out.
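
To see why EDU-level extraction helps, consider a toy sketch. DiscoBERT relies on a trained discourse segmenter to find EDUs; the comma/connective split below is only a crude stand-in to illustrate the idea of selecting sub-sentence units instead of whole sentences.

```python
import re

sentence = ("Messi scored twice, which delighted the home fans, "
            "and Barcelona moved to the top of the table.")

# Crude stand-in for EDU segmentation; DiscoBERT uses a trained
# discourse segmenter, not a regex.
edus = re.split(r",\s*(?:which\s+|and\s+)?", sentence)
# -> ['Messi scored twice', 'delighted the home fans',
#     'Barcelona moved to the top of the table.']

# A sentence-level extractor must keep or drop this sentence whole.
# An EDU-level extractor can keep the informative units and drop the aside.
summary = "; ".join([edus[0], edus[2]])
print(summary)  # Messi scored twice; Barcelona moved to the top of the table.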
