Recurrent Neural Networks for Multilabel Text Classification Tasks

Recurrent Neural Networks for Multilabel Text Classification Tasks

The purpose of this project is to build and evaluate Recurrent Neural Networks(RNNs) for sentence-level classification tasks. Let's understand about recurrent neural networks for multilabel text classification tasks.

The purpose of this project is to build and evaluate Recurrent Neural Networks(RNNs) for sentence-level classification tasks. I evaluate three architectures: a two-layer Long Short-Term Memory Network(LSTM), a two-layer Bidirectional Long Short-Term Memory Network(BiLSTM), and a two-layer BiLSTM with a word-level attention layer. Although they do learn useful vector representation, BiLSTM with attention mechanism focuses on necessary tokens when learning text representation. To that end, I’m using the 2019 Google Jigsaw published dataset on Kaggle labeled “Jigsaw Unintended Bias in Toxicity Classification.” The dataset includes 1,804,874 user comments, with the toxicity level being between 0 and 1. The final models can be used for filtering online posts and comments, social media policing, and user education.

Links

Recurrent Neural Networks Overview

RNNs are neural networks used for problems that require sequential data processing. For instance:

  • In a sentiment analysis task, a text’s sentiment can be inferred from a sequence of words or characters.
  • In a stock prediction task, current stock prices can be inferred from a sequence of past stock prices.

At each time step t _of the input sequence, RNNs compute the output _yt and an internal state update_ ht_ using the input xt and the previous hidden-state ht-1. They then pass information about the current time step of the network to the next. The hidden-state ht summarizes the task-relevant aspect of the past sequence of the input up to t, allowing for information to persist over time.

Image for post

Recurrent Neural Network

Image for post

Recurrent Neural Network

During training, RNNs re-use the same weight matrices at each time step. Parameter sharing enables the network to generalize to different sequence lengths. The total loss is a sum of all losses at each time step, the gradients with respect to the weights are the sum of the gradients at each time step, and the parameters are updated to minimize the loss function.

Image for post

forward pass: compute the loss function

Image for post

Image for post

loss function

Image for post

Backward Pass: compute the gradients

Image for post

gradient equation

Although RNNs learn contextual representations of sequential data, they suffer from the exploding and vanishing gradient phenomena in long sequences. These problems occur due to the multiplicative gradient that can exponentially increase or decrease through time. RNNs commonly use three activation functions: RELU, Tanh, and Sigmoid. Because the gradient calculation also involves the gradient with respect to the non-linear activations, architectures that use a RELU activation can suffer from the exploding gradient problem. Architectures that use Tanh/Sigmoid can suffer from the vanishing gradient problem. Gradient clipping — limiting the gradient within a specific range — can be used to remedy the exploding gradient. However, for the vanishing gradient problem, a more complex recurrent unit with gates such as Gated Recurrent Unit (GRU) or Long Short-Term Memory (LSTM) can be used.

ai recurrent-neural-network attention-network machine-learning neural-network

Bootstrap 5 Complete Course with Examples

Bootstrap 5 Tutorial - Bootstrap 5 Crash Course for Beginners

Nest.JS Tutorial for Beginners

Hello Vue 3: A First Look at Vue 3 and the Composition API

Building a simple Applications with Vue 3

Deno Crash Course: Explore Deno and Create a full REST API with Deno

How to Build a Real-time Chat App with Deno and WebSockets

Convert HTML to Markdown Online

HTML entity encoder decoder Online

Artificial Neural Networks — Recurrent Neural Networks

Artificial Neural Networks — Recurrent Neural Networks. Remembering the history and predicting the future with neural networks. A intuition behind Recurrent neural networks.

Hire Machine Learning Developers in India

We supply you with world class machine learning experts / ML Developers with years of domain experience who can add more value to your business.

Learn Machine Learning with Python (Part 3) | Machine Learning with Neural Networks

Learn Machine Learning with Python using neural networks with this machine learning beginners course. In this tutorial we will look at taking an existing sol...

Applications of machine learning in different industry domains

We supply you with world class machine learning experts / ML Developers with years of domain experience who can add more value to your business.

Hire Machine Learning Developer | Hire ML Experts in India

We supply you with world class machine learning experts / ML Developers with years of domain experience who can add more value to your business.