Transformer-XL Review: Beyond Fixed-Length Contexts

A review of "Transformer-XL: Attentive Language Models Beyond a Fixed-Length Context"

This paper ("Transformer-XL: Attentive Language Models Beyond a Fixed-Length Context") was published at ACL 2019, one of the top NLP conferences, by researchers at Google AI. It proposes Transformer-XL, an architecture that enables learning dependencies beyond a fixed-length context without disrupting temporal coherence. Its key innovations are a segment-level recurrence mechanism and a novel relative positional encoding scheme. Unlike the vanilla Transformer, it can capture longer-term dependencies and avoids the context fragmentation problem, the two main limitations of fixed-length contexts. The experiments show that Transformer-XL learns dependencies substantially longer than those of RNNs and the vanilla Transformer, and it achieves state-of-the-art results on several large language modeling benchmarks.
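To make the segment-level recurrence idea concrete, the sketch below caches the hidden states of the previous segment and lets the current segment attend over them without back-propagating through the cache. This is a minimal, single-head, single-layer illustration with made-up dimensions; the paper's relative positional encoding and per-layer memory are omitted.

```python
# Minimal sketch of Transformer-XL-style segment-level recurrence
# (illustrative only; simplified single-head attention, hypothetical dimensions).
import torch
import torch.nn.functional as F

def attend_with_memory(h_curr, mem, w_q, w_k, w_v):
    """Attend over the current segment plus cached states from the
    previous segment. `mem` is detached so gradients do not flow
    across segment boundaries."""
    context = torch.cat([mem.detach(), h_curr], dim=0)   # [mem_len + seg_len, d]
    q = h_curr @ w_q                                     # queries come from the current segment only
    k = context @ w_k
    v = context @ w_v
    scores = q @ k.T / (k.shape[-1] ** 0.5)
    return F.softmax(scores, dim=-1) @ v

seg_len, mem_len, d = 4, 4, 8
w_q, w_k, w_v = (torch.randn(d, d) for _ in range(3))
mem = torch.zeros(mem_len, d)                 # empty memory before the first segment

for segment in torch.randn(3, seg_len, d):    # stream of consecutive segments
    out = attend_with_memory(segment, mem, w_q, w_k, w_v)
    mem = segment                             # cache this segment's states for the next one
```

In the full model the cache holds the previous segment's hidden states at every layer, so the usable context grows with network depth instead of being capped at the segment length.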

Paper link: https://www.aclweb.org/anthology/P19-1285.pdf

1. Background

Language modeling is an important topic in natural language processing, and many unsupervised pre-training methods such as BERT and ELMo are built on it. However, modeling long-term dependencies remains a challenge. Recurrent neural networks (RNNs), especially Long Short-Term Memory (LSTM) networks, have been the standard solution. The gating mechanism in LSTMs and the gradient clipping technique improve the ability to model long-term dependencies, but they are not sufficient to fully address the problem, and RNNs remain difficult to optimize because of vanishing and exploding gradients.
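As a reminder of what gradient clipping looks like in practice, here is a minimal, hypothetical LSTM training step in PyTorch (the model, data, and loss are placeholders, not the paper's setup): gradients are rescaled so their global norm stays below a threshold, which mitigates exploding gradients but does not help with vanishing ones.

```python
# Minimal sketch of gradient clipping in an RNN training step
# (hypothetical model and data, shown only to illustrate the technique).
import torch
import torch.nn as nn

model = nn.LSTM(input_size=16, hidden_size=32, batch_first=True)
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)

x = torch.randn(8, 20, 16)          # batch of 8 sequences, length 20
out, _ = model(x)
loss = out.pow(2).mean()            # placeholder loss

optimizer.zero_grad()
loss.backward()
# Rescale gradients so their global norm does not exceed 1.0.
nn.utils.clip_grad_norm_(model.parameters(), max_norm=1.0)
optimizer.step()
```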

