A Language Classifier powered by Recurrent Neural Network implemented in Python

A Language Classifier powered by Recurrent Neural Network implemented in Python

A Language Classifier powered by Recurrent Neural Network(RNN) implemented in Python without AI libraries.

RNN-Language-Classifier

A Language Classifier powered by Recurrent Neural Network(RNN) implemented in Python without AI libraries.

Features

The classifier classifies a word in EnglishSpanishFinnishDutch, or Polish. The classifier outputs correctly at a rate of approximately 85%.

It is purely implemented with numpy and built-in libraries.

Model Architecture

  • Input Layer: 47 nodes representing 47 different characters
  • First Hidden Layer: 100 nodes
  • Second Hidden Layer: 100 nodes
  • Output Layer: 5 nodes representing 5 languages

The technique used in this project is called Recurrent Neural Network(RNN):

Here, an RNN is used to encode the word “cat” into a fixed-size vector h3.

Sample Run

Training until validation accuracy achieve a certain level:
epoch 1 iteration 24 validation-accuracy 43.0%
  shaking    English ( 22.4%) Pred: Dutch   |en 22%|es 20%|fi 18%|nl 26%|pl 14%
  relaxing   English ( 23.7%) Pred: Dutch   |en 24%|es 20%|fi 18%|nl 25%|pl 13%
  prophecy   English ( 17.6%) Pred: Spanish |en 18%|es 24%|fi 24%|nl 16%|pl 19%
  tiroteo    Spanish ( 25.8%)               |en 21%|es 26%|fi 18%|nl 18%|pl 17%
  vientre    Spanish ( 24.2%)               |en 17%|es 24%|fi 21%|nl 21%|pl 17%
  estupenda  Spanish ( 31.4%)               |en 16%|es 31%|fi 18%|nl 19%|pl 16%
  osti       Finnish ( 21.2%) Pred: Polish  |en 15%|es 19%|fi 21%|nl 20%|pl 25%
  veljensä   Finnish ( 19.8%) Pred: Spanish |en 21%|es 22%|fi 20%|nl 20%|pl 18%
  aikoinaan  Finnish ( 22.3%)               |en 15%|es 21%|fi 22%|nl 21%|pl 21%
  betwijfel  Dutch   ( 22.8%) Pred: English |en 24%|es 23%|fi 15%|nl 23%|pl 15%
  merkte     Dutch   ( 17.1%) Pred: Spanish |en 17%|es 22%|fi 22%|nl 17%|pl 21%
  beseffen   Dutch   ( 24.5%)               |en 21%|es 19%|fi 21%|nl 25%|pl 15%
  kończę     Polish  ( 21.5%) Pred: Spanish |en 17%|es 23%|fi 20%|nl 18%|pl 21%
  firmy      Polish  ( 20.7%) Pred: Finnish |en 15%|es 22%|fi 23%|nl 19%|pl 21%
  decyzje    Polish  ( 16.2%) Pred: Dutch   |en 19%|es 22%|fi 20%|nl 23%|pl 16%

.
.
.

epoch 6 iteration 153 validation-accuracy 84.2%
  shaking    English ( 86.4%)               |en 86%|es  0%|fi  1%|nl 12%|pl  1%
  relaxing   English ( 84.6%)               |en 85%|es  0%|fi  0%|nl 15%|pl  0%
  prophecy   English ( 54.2%)               |en 54%|es  0%|fi  0%|nl  4%|pl 41%
  tiroteo    Spanish ( 38.9%)               |en 12%|es 39%|fi 36%|nl  6%|pl  8%
  vientre    Spanish ( 43.4%)               |en 19%|es 43%|fi  2%|nl 29%|pl  7%
  estupenda  Spanish ( 75.2%)               |en  1%|es 75%|fi 15%|nl  2%|pl  7%
  osti       Finnish ( 75.7%)               |en  1%|es  1%|fi 76%|nl  3%|pl 20%
  veljensä   Finnish ( 81.7%)               |en  0%|es  1%|fi 82%|nl 17%|pl  0%
  aikoinaan  Finnish ( 99.9%)               |en  0%|es  0%|fi100%|nl  0%|pl  0%
  betwijfel  Dutch   ( 98.7%)               |en  1%|es  0%|fi  0%|nl 99%|pl  1%
  merkte     Dutch   ( 71.9%)               |en 10%|es  1%|fi  6%|nl 72%|pl 10%
  beseffen   Dutch   ( 96.6%)               |en  2%|es  0%|fi  0%|nl 97%|pl  0%
  kończę     Polish  (100.0%)               |en  0%|es  0%|fi  0%|nl  0%|pl100%
  firmy      Polish  ( 29.4%) Pred: English |en 59%|es  5%|fi  2%|nl  5%|pl 29%
  decyzje    Polish  ( 87.7%)               |en  1%|es  1%|fi  0%|nl 10%|pl 88%
Test Results:
test set accuracy is: 83.800000%
User Input:
word: tervetuloa ## welcome
predicted language is: Finnish, with a confidence of 80.011147%

word: ciudades ## cities
predicted language is: Spanish, with a confidence of 88.442353%

word: właź ## hatch
predicted language is: Polish, with a confidence of 99.979566%

word: algorithm
predicted language is: English, with a confidence of 79.893499%

word: resolution
predicted language is: English, with a confidence of 94.786443%

word: ademt ## breathe
predicted language is: Dutch, with a confidence of 47.399565%

word: invitar ## invite
predicted language is: Spanish, with a confidence of 93.986880%

Dependencies

You will need numpy for this project

pip install numpy

How To Use

clone this project or download the zip file

py run.py

Improvements To Make

  • support save & load models
  • classify more languages
  • improve accuracy
  • classify a sentence or paragraph instead of words
  • ...

Reference

The dataset lang_id.npz, image demonstrating RNN, and project skeleton are from cs188.ml.

GitHub

machine learning python

Bootstrap 5 Complete Course with Examples

Bootstrap 5 Tutorial - Bootstrap 5 Crash Course for Beginners

Nest.JS Tutorial for Beginners

Hello Vue 3: A First Look at Vue 3 and the Composition API

Building a simple Applications with Vue 3

Deno Crash Course: Explore Deno and Create a full REST API with Deno

How to Build a Real-time Chat App with Deno and WebSockets

Convert HTML to Markdown Online

HTML entity encoder decoder Online

How To Plot A Decision Boundary For Machine Learning Algorithms in Python

How To Plot A Decision Boundary For Machine Learning Algorithms in Python, you will discover how to plot a decision surface for a classification machine learning algorithm.

How To Get Started With Machine Learning With The Right Mindset

You got intrigued by the machine learning world and wanted to get started as soon as possible, read all the articles, watched all the videos, but still isn’t sure about where to start, welcome to the club.

What is Supervised Machine Learning

What is neuron analysis of a machine? Learn machine learning by designing Robotics algorithm. Click here for best machine learning course models with AI

Python For Machine Learning | Machine Learning With Python

Python For Machine Learning | Machine Learning With Python, you will be working on an end-to-end case study to understand different stages in the Machine Learning (ML) life cycle. This will deal with 'data manipulation' with pandas and 'data visualization' with seaborn. After this an ML model will be built on the dataset to get predictions. You will learn about the basics of scikit-learn library to implement the machine learning algorithm.

Python for Machine Learning | Machine Learning with Python

Python for Machine Learning | Machine Learning with Python, you'll be working on an end-to-end case study to understand different stages in the ML life cycle. This will deal with 'data manipulation' with pandas and 'data visualization' with seaborn. After this, an ML model will be built on the dataset to get predictions. You will learn about the basics of the sci-kit-learn library to implement the machine learning algorithm.