ELMo: Why it’s one of the biggest advancements in NLP. Embeddings from Language Models (ELMo) is a state-of-the-art language modeling idea. What makes it so successful?

Published in 2018, “Deep contextualized word representations” introduced Embeddings from Language Models (ELMo), which achieved state-of-the-art performance on many popular tasks, including question answering, sentiment analysis, and named-entity extraction. In the paper’s experiments, adding ELMo to existing models improved their scores by up to almost 5 percentage points. But what makes this idea so revolutionary?

*What’s ELMo?* Not only is he a Muppet, but ELMo is also a powerful computational model that converts words into vectors of numbers. This vital step allows machine learning models (which take numbers, not words, as inputs) to be trained on textual data.
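To make “converting words into numbers” concrete, here is a minimal sketch. It is not ELMo: the lookup table `TOY_VECTORS`, its values, and the helper `words_to_numbers` are all made up for illustration. It only shows the basic step of swapping each word for a list of numbers that a model could take as input.

```python
# Toy illustration only: this is NOT ELMo.
# TOY_VECTORS is a hypothetical, hand-written lookup table; real embeddings
# are learned from data and have hundreds of dimensions.
TOY_VECTORS = {
    "the": [0.1, 0.0, 0.2],
    "cat": [0.7, 0.3, 0.1],
    "sat": [0.2, 0.9, 0.4],
}

# Fallback vector for words that are missing from the table.
UNKNOWN = [0.0, 0.0, 0.0]


def words_to_numbers(sentence):
    """Split a sentence on whitespace and replace each word with its vector."""
    tokens = sentence.lower().split()
    return [TOY_VECTORS.get(token, UNKNOWN) for token in tokens]


# Each word becomes one row of numbers.
features = words_to_numbers("The cat sat")
for row in features:
    print(row)
```

Note that a fixed table like this always gives a word the same vector, no matter which sentence it appears in. ELMo differs on exactly that point: it takes the word’s context into account.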

Why is ELMo so good? There are a few primary points that stood out to me when I read through the original paper:

  1. ELMo accounts for a word’s context.
  2. ELMo is trained on a large text corpus.
  3. ELMo is open-source.

Let’s go through each of these points in detail and talk about why they’re important.

