I am, without a doubt, one of those human beings known as “Star Wars fans”. I remember watching The Phantom Menace on TV as a kid and being blown away by it (I know, it is not a great movie). Later, I watched Attack of the Clones and Revenge of the Sith (the latter in a theater) and went on to watch the Original Trilogy on DVD, back when those were still around. This culminated in me, as an adult, getting a tattoo of a TIE Fighter on my right arm.

This passion for Star Wars was a good answer to the question “what NLP project should I do?”; I wanted to develop a complete Natural Language Processing project, practicing several skills while doing it.

This is brief documentation of an experiment with LSTMs: training a character-level language model to generate text within a specific domain. Given a seed title, the model writes a short description of it, served through an API. The model was built from scratch with TensorFlow, without transfer learning. It uses texts from Wookiepedia.com, and the team at Fandom.com (responsible for managing the website) was kind enough to allow me to collect the data using a web scraper and to publish this article.
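To make "character level" concrete, here is a minimal sketch of the data side of such a model, using only the Python standard library. It is not the project's actual code: the function names (build_vocab, make_windows, sample_with_temperature), the toy corpus, and the window length are all assumptions made for illustration. In a setup like the one described, an LSTM would sit between the two halves, trained on the (window, next character) pairs and producing the scores that the sampling step draws from.

```python
import math
import random

def build_vocab(text):
    """Map each distinct character to an integer id and back."""
    chars = sorted(set(text))
    char_to_id = {ch: i for i, ch in enumerate(chars)}
    id_to_char = {i: ch for ch, i in char_to_id.items()}
    return char_to_id, id_to_char

def make_windows(text, char_to_id, seq_len):
    """Slide a window over the text: seq_len input ids, then the next id as target."""
    ids = [char_to_id[ch] for ch in text]
    return [(ids[i:i + seq_len], ids[i + seq_len])
            for i in range(len(ids) - seq_len)]

def sample_with_temperature(logits, temperature=1.0, rng=random):
    """Draw one index from softmax(logits / temperature).

    Lower temperatures make the choice more conservative,
    higher ones make it more random.
    """
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    weights = [math.exp(x - peak) for x in scaled]
    return rng.choices(range(len(weights)), weights=weights, k=1)[0]

# Toy stand-in for the scraped articles (hypothetical data).
corpus = "tie fighter"
char_to_id, id_to_char = build_vocab(corpus)
windows = make_windows(corpus, char_to_id, seq_len=4)

# Each training example pairs a short run of characters with the one that follows.
first_input, first_target = windows[0]
print("".join(id_to_char[i] for i in first_input), "->", id_to_char[first_target])
```

Generation would then repeat one step: feed the most recent characters to the trained model, sample the next character id from its output scores, append it, and continue until the description is long enough.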

Final Product

A demonstration of the working model can be seen below.

Requirements

To replicate this model, first download the code from here. Then install the dependencies; creating a new virtual environment and installing all packages into it is highly recommended. If you are using Linux, run:

pip install -r requirements-linux.txt

If you are on Windows, however, just use:

pip install -r requirements-windows.txt
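If you script the setup, the choice between the two files can be made automatically. The helper below is a small sketch, not part of the project: the function name requirements_file is hypothetical, and it only knows about the two files named in this article, so it raises an error on any other system rather than guessing.

```python
import platform

def requirements_file(system=None):
    """Return the requirements file this article names for the given OS.

    `system` defaults to platform.system(), which reports "Linux" or
    "Windows" on those platforms.
    """
    system = system or platform.system()
    files = {
        "Linux": "requirements-linux.txt",
        "Windows": "requirements-windows.txt",
    }
    if system not in files:
        raise ValueError(f"No requirements file is mentioned for {system!r}")
    return files[system]

# Show the command for each supported system.
for name in ("Linux", "Windows"):
    print(f"{name}: pip install -r {requirements_file(name)}")
```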

After all dependencies are installed, you need the trained model to generate any text. Because of size restrictions on GitHub, the model must be downloaded from here. Download the entire model folder and put it under the deploy folder. Finally, activate the environment where you installed the dependencies, open a terminal in the folder where you downloaded the project, and type:

python deploy/deploy.py

Generating Short Star Wars Text With LSTM’s