Neural machine translation (NMT) is an approach to machine translation that uses an artificial neural network to predict the likelihood of a sequence of words, typically modeling entire sentences in a single integrated model.

Before building a machine translation system with neural networks, we first need a way to represent the words in a sequence numerically: a model works on equations and numbers, so it has no place for raw words. This is done through tokenization. Tokenization builds a dictionary of the words in your corpus, stripping out punctuation, applying stemming, and converting everything to lower case; each word is then assigned a specific number, called a token. This gives us a dictionary in which each word is mapped to a specific token. After that, we apply one-hot encoding so that words with higher token values do not receive higher priority or greater weight. Machine translation uses an encoder-decoder architecture.
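The tokenization and one-hot encoding steps above can be sketched in a few lines of Python. This is a minimal illustration on a toy two-sentence corpus (stemming is omitted for brevity); the `vocab` and `one_hot` names are just for this example.

```python
import re

# A toy corpus; in practice this would be the full training set.
corpus = ["The cat sat.", "The dog ran!"]

# Build the word dictionary: strip punctuation, lower-case, assign tokens.
vocab = {}
for sentence in corpus:
    for word in re.findall(r"[a-z]+", sentence.lower()):
        if word not in vocab:
            vocab[word] = len(vocab)

def one_hot(word):
    """Return a one-hot vector so no token value outranks another."""
    vec = [0] * len(vocab)
    vec[vocab[word]] = 1
    return vec

print(vocab)           # {'the': 0, 'cat': 1, 'sat': 2, 'dog': 3, 'ran': 4}
print(one_hot("dog"))  # [0, 0, 0, 1, 0]
```

Note that every one-hot vector has the same magnitude, which is exactly why it prevents larger token ids from carrying more weight.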

The encoder and decoder are built from the same kind of neural network but play somewhat different roles. The encoder processes all the word embeddings and extracts context and long-term dependencies, which are then passed to the decoder to generate the output sentence. Several types of natural language processing models can be used for this purpose. Let's start with the basic sequence model, the Recurrent Neural Network.
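The encoder-decoder flow can be sketched with two toy RNN cells in NumPy. This is a minimal, untrained sketch, not a real translation model: the sizes, the `make_cell`/`step` helpers, and the greedy decoding loop are all illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
hidden, vocab_size = 8, 5  # toy sizes, chosen only for illustration

def make_cell():
    # One RNN cell's parameters; encoder and decoder each get their own set.
    return {"Wx": rng.normal(size=(hidden, vocab_size)) * 0.1,
            "Wa": rng.normal(size=(hidden, hidden)) * 0.1,
            "b":  np.zeros(hidden)}

def step(cell, x, a_prev):
    # One recurrent step: the new activation mixes input and previous state.
    return np.tanh(cell["Wx"] @ x + cell["Wa"] @ a_prev + cell["b"])

enc, dec = make_cell(), make_cell()
Wy = rng.normal(size=(vocab_size, hidden)) * 0.1  # decoder output projection

# Encoder: fold a source sentence of one-hot vectors into one context vector.
source = np.eye(vocab_size)[[0, 2, 3]]
a = np.zeros(hidden)
for x in source:
    a = step(enc, x, a)
context = a  # summarises the whole source sentence

# Decoder: start from the context and greedily emit target tokens.
y_prev, out = np.zeros(vocab_size), []
for _ in range(3):
    context = step(dec, y_prev, context)
    token = int(np.argmax(Wy @ context))
    out.append(token)
    y_prev = np.eye(vocab_size)[token]
print(out)  # three target token ids (untrained weights, so arbitrary)
```

The key design point is visible in the loop structure: the decoder never sees the source words directly, only the single `context` vector the encoder produced.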

Recurrent Neural Networks

The word recurrent means occurring often or repeatedly. In a standard feed-forward neural network, we take an input x and feed it forward through the activation units in the hidden layers to get an output y; no information from previous steps enters the model. This is where recurrent neural networks differ: at step t, an RNN receives not only the input x[t] but also the activation a[t-1] from the previous step. We do this in order to share features learned across different positions in the text.
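The recurrence described above is commonly written as a[t] = tanh(Waa·a[t-1] + Wax·x[t] + ba), with an output y[t] computed from a[t]. A minimal NumPy sketch of one such step, with illustrative sizes and randomly initialised (untrained) weights:

```python
import numpy as np

rng = np.random.default_rng(1)
n_x, n_a, n_y = 4, 6, 4  # illustrative input/hidden/output sizes

Wax = rng.normal(size=(n_a, n_x)) * 0.1
Waa = rng.normal(size=(n_a, n_a)) * 0.1
Wya = rng.normal(size=(n_y, n_a)) * 0.1
ba, by = np.zeros(n_a), np.zeros(n_y)

def rnn_step(x_t, a_prev):
    # a[t] depends on both the current input x[t] and the previous a[t-1].
    a_t = np.tanh(Wax @ x_t + Waa @ a_prev + ba)
    logits = Wya @ a_t + by
    y_t = np.exp(logits) / np.exp(logits).sum()  # softmax over outputs
    return a_t, y_t

# Unroll over a sequence: the same weights are reused at every position,
# which is how features are shared across positions in the text.
a = np.zeros(n_a)
for x in np.eye(n_x):  # a toy sequence of one-hot inputs
    a, y = rnn_step(x, a)
print(y.sum())  # softmax output is a probability distribution
```

Because the same `Wax`, `Waa`, and `Wya` are applied at every time step, a pattern learned at one position in a sentence is automatically available at every other position.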

#attention #recurrent-neural-network #transformers #machine-translation #lstm #neural-networks
