Many state-of-the-art results in NLP are achieved with deep learning (DL), and you probably want to use deep learning to solve NLP problems as well. While there is plenty of material discussing how to choose and train the “best” neural network architecture, such as an RNN, selecting and configuring a suitable neural network is only one part of solving a practical NLP problem. The other important, yet often underestimated, part is model preparation. NLP tasks usually require special data treatment in the model preparation stage; in other words, there is a lot to do before we can feed the data into a neural network for training. Unfortunately, not many tutorials give detailed guidance on model preparation.

Besides, the packages and APIs that support state-of-the-art NLP theories and algorithms are usually quite new and evolve rapidly (e.g., TensorFlow was first released in 2015, PyTorch in 2016, and spaCy in 2015). To achieve better performance, you often have to integrate several of these packages in your deep learning pipeline while keeping them from conflicting with one another, as sketched below.
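To make the idea of combining packages concrete, here is a minimal sketch (my own illustration, not the article's exact pipeline) in which spaCy handles tokenization and TensorFlow consumes the resulting tokens; the example sentences and the lowercasing choice are assumptions for demonstration only.

```python
# Minimal sketch: spaCy tokenizes the raw text, TensorFlow ingests the tokens.
import spacy
import tensorflow as tf

nlp = spacy.blank("en")  # tokenizer-only pipeline, no model download required
texts = [
    "Deep learning solves many NLP problems.",
    "Model preparation is often underestimated.",
]

# spaCy splits each sentence into tokens (lowercased here for consistency).
tokenized = [[token.text.lower() for token in nlp(text)] for text in texts]

# Hand the pre-tokenized text to TensorFlow as a string dataset.
dataset = tf.data.Dataset.from_tensor_slices([" ".join(tokens) for tokens in tokenized])
for example in dataset:
    print(example.numpy())
```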

That’s why I decided to write this article to give you a detailed tutorial.

  • I will walk you through the model preparation pipeline, from tokenizing the raw data to configuring the TensorFlow Embedding layer, so that your neural network is ready for training (see the sketch after this list).
  • The example code will help you gain a solid understanding of the model preparation steps.
  • In this tutorial, I will choose popular packages and APIs that specialize in NLP and advise on default parameter settings, so that you have a good start on your NLP deep learning journey.
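As a preview of the pipeline described above, here is a minimal sketch of the raw-text-to-Embedding path in TensorFlow 2.X. It uses `tf.keras.layers.TextVectorization` plus an `Embedding` layer; the vocabulary size, sequence length, and embedding dimension are illustrative assumptions, not the article's final settings.

```python
# Minimal sketch: raw text -> integer token ids -> dense embedding vectors.
import tensorflow as tf

raw_texts = tf.constant([
    "Deep learning solves many NLP problems.",
    "Model preparation is often underestimated.",
])

# Tokenize the text and map words to integer ids.
vectorizer = tf.keras.layers.TextVectorization(
    max_tokens=10_000,            # assumed vocabulary size
    output_sequence_length=20,    # assumed padded sequence length
)
vectorizer.adapt(raw_texts)       # build the vocabulary from the raw text

# Configure an Embedding layer that turns token ids into dense vectors.
model = tf.keras.Sequential([
    vectorizer,
    tf.keras.layers.Embedding(input_dim=10_000, output_dim=64, mask_zero=True),
])

print(model(raw_texts).shape)     # (2, 20, 64)
```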

#deep-learning #nlp #tensorflow #neural-networks

Hands-on NLP Deep Learning Model Preparation in TensorFlow 2.X