Generating text with Recurrent Neural Networks based on the work of F. Pessoa. A Deep Learning application using TensorFlow and Keras.
Sequences of discrete tokens can be found in many applications: words in a text, notes in a musical composition, pixels in an image, actions of a reinforcement learning agent, and so on. These sequences often show strong correlations between consecutive or nearby tokens. The correlations between words in a sentence, or between characters in a word, express the underlying semantics and characteristics of the language. The next token in the sequence, x_n, can be modeled as:

p(x_n | x_1, x_2, …, x_{n-1})

where x_i represents the i-th token in the sequence. In Natural Language Processing (NLP), these are called language models. Usually, each token stands for a separate word or n-gram. The output is a probability distribution from which we can sample to generate the next token in the sequence. These models are also known as recurrent, since we can apply this generative process recurrently to create entire new sequences of tokens.
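To make the recurrent generative process concrete, here is a minimal sketch in Python. It stands in for a real language model with a toy first-order approximation: a fixed, hypothetical transition matrix over a three-character vocabulary plays the role of p(x_n | x_1, …, x_{n-1}); the sampling loop itself is the same one a trained model would use.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy vocabulary and a hypothetical "model": a row-stochastic matrix
# giving p(next token | previous token). A trained network would
# condition on the whole history instead of just the last token.
vocab = ["a", "b", "c"]
transition = np.array([
    [0.1, 0.8, 0.1],   # p(next | previous == "a")
    [0.5, 0.2, 0.3],   # p(next | previous == "b")
    [0.3, 0.3, 0.4],   # p(next | previous == "c")
])

def generate(start_id, length):
    """Apply the generative process recurrently: at each step, sample
    the next token from the model's probability distribution and feed
    it back in as the new context."""
    seq = [start_id]
    for _ in range(length - 1):
        probs = transition[seq[-1]]
        seq.append(rng.choice(len(vocab), p=probs))
    return "".join(vocab[i] for i in seq)

print(generate(0, 10))
```

Swapping the transition matrix for a neural network that outputs a distribution over the vocabulary turns this loop into exactly the text generator built later in the article.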
One particular type of generative model often used for sequences of discrete tokens is the Recurrent Neural Network (RNN). In a simple feed-forward neural network, a fixed-dimensional feature representation is transformed several times by different non-linear functions. In an RNN, these transformations are also repeated in time: at every time step, a new input is processed and a new output is generated. RNNs can effectively capture semantically rich representations of the input sequences, and they have shown this capacity in different settings, such as generating structured text, original images (on a per-pixel basis), or even modeling user behavior on online services.
Our task is to generate original text that resembles a training corpus. It is an unsupervised task, as we do not have access to any labels or target variable. We start by creating a character embedding that maps each character to a vector of parameterized dimension. For each character, the model looks up the embedding and feeds the result to a stack of Long Short-Term Memory (LSTM) layers, a specific type of RNN developed to extend the capacity of traditional RNNs to model long-term dependencies and to counter the vanishing gradient problem. The output of our network is a dense layer with as many units as the vocabulary size. We do not define an activation function for this layer; it simply outputs one logit for each character in the vocabulary. We later use these values to sample from a categorical distribution.
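The architecture described above can be sketched in Keras as follows. This is a minimal illustration, not the article's exact configuration: the vocabulary size, embedding dimension, and LSTM units are assumed placeholder values, and the number of stacked LSTM layers is set to two for concreteness.

```python
import tensorflow as tf

VOCAB_SIZE = 90        # assumption: number of distinct characters in the corpus
EMBEDDING_DIM = 256    # assumption: a tunable embedding dimension
LSTM_UNITS = 512       # assumption: units per LSTM layer

# Embedding looks up a dense vector for each character id; the stacked
# LSTM layers model long-range dependencies; the final Dense layer has
# no activation and outputs one logit per character in the vocabulary.
model = tf.keras.Sequential([
    tf.keras.layers.Embedding(VOCAB_SIZE, EMBEDDING_DIM),
    tf.keras.layers.LSTM(LSTM_UNITS, return_sequences=True),
    tf.keras.layers.LSTM(LSTM_UNITS, return_sequences=True),
    tf.keras.layers.Dense(VOCAB_SIZE),   # logits, no softmax
])

# A dummy batch of one sequence of 3 character ids.
logits = model(tf.constant([[3, 14, 15]]))

# Sample the next character id from the categorical distribution
# defined by the logits of the last time step.
next_id = tf.random.categorical(logits[:, -1, :], num_samples=1)
```

Because the model outputs raw logits, training would use a loss with `from_logits=True` (e.g. `tf.keras.losses.SparseCategoricalCrossentropy`), and `tf.random.categorical` consumes the logits directly when sampling.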
In this article, we use the work of Fernando Pessoa, one of the most significant literary figures of the 20th century and one of the greatest poets in the Portuguese language. The dataset is publicly available on Kaggle and consists of more than 4,300 poems, essays, and other writings.