Have you ever had one too many reports to read and just wanted a quick summary of each? Were you ever in a situation where everybody just wanted to read a summary instead of a full-blown report?

Summarization has become a very helpful way of tackling information overload in the 21st century. In this story, I will show you how to create your own personal text summarizer using Natural Language Processing (NLP) in Python.

Foreword: A personal text summarizer is not hard to create, and a beginner can easily do it!

What is text summarization?

It’s the task of generating an accurate summary while keeping the key information and the overall meaning of the original text.

There are two general types of summarization:

  • Abstractive summarization >> generates new sentences from the original text.
  • Extractive summarization >> identifies the important sentences and builds the summary from those sentences.

Which summarization method should I use, and why?

I use extractive summarization because I can apply it to many documents without having to do a lot of (daunting) machine learning model training.

Besides that, extractive summarization tends to give a more reliable summary than abstractive summarization, because an abstractive summarizer has to generate new sentences from the original text, which is a harder problem than taking a data-driven approach to extracting the important sentences that are already there.

How do you create your own text summarizer?

We will use a word histogram to rank the importance of sentences and, subsequently, create a summary. The benefit of doing this is that you don’t need to train a model before using it on your document.
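
To make the idea concrete, here is a minimal sketch of word-histogram sentence ranking using only the Python standard library. It is an illustration under my own assumptions, not necessarily the exact code used in the full story: the function names (split_sentences, word_histogram, summarize), the tiny STOP_WORDS set, and the regex-based sentence splitter are all hypothetical stand-ins for what an NLP library would normally provide.

```python
import re
from collections import Counter

# Tiny illustrative stop-word list (an assumption; real lists are much longer).
STOP_WORDS = {"a", "an", "and", "the", "is", "are", "of", "to", "in", "it", "that"}


def split_sentences(text):
    """Naively split text after '.', '!' or '?' followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text)
    return [p.strip() for p in parts if p.strip()]


def word_histogram(text):
    """Count non-stop words and scale counts so the most frequent word is 1.0."""
    words = re.findall(r"[a-z']+", text.lower())
    counts = Counter(w for w in words if w not in STOP_WORDS)
    if not counts:
        return {}
    highest = max(counts.values())
    return {word: count / highest for word, count in counts.items()}


def summarize(text, num_sentences=2):
    """Return the top-scoring sentences, kept in their original order."""
    sentences = split_sentences(text)
    histogram = word_histogram(text)

    # A sentence's score is the sum of the weights of the words it contains.
    scored = []
    for index, sentence in enumerate(sentences):
        words = re.findall(r"[a-z']+", sentence.lower())
        score = sum(histogram.get(w, 0.0) for w in words)
        scored.append((score, index))

    # Highest score first; ties are broken by earlier position in the text.
    best = sorted(scored, key=lambda pair: (-pair[0], pair[1]))[:num_sentences]
    keep = sorted(index for _, index in best)
    return " ".join(sentences[i] for i in keep)


if __name__ == "__main__":
    report = "Cats sleep a lot. Cats chase mice and cats climb trees. Dogs bark."
    print(summarize(report, num_sentences=1))
```

Because the score is a plain sum, longer sentences tend to rank higher. If that becomes a problem on your documents, dividing each score by the sentence length or skipping very long sentences are two simple adjustments to try.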

