A Journey to Speech Recognition using TensorFlow

Nowadays, high-precision voice recognition is available on our smartphones and other smart devices. However, those systems are provided by big companies like Google, Amazon or Apple, and are not free.

Many people, including myself, thought this was because of a lack of free data. Nowadays, however, free data is easy to find on the Internet.

Voice datasets

Mozilla Common Voice: https://commonvoice.mozilla.org/en/datasets

Dataset sizes for some languages (2020/10/12)

English → 50 GB
German → 16 GB
French → 15 GB
Japanese → 265 MB

Other datasets:

Speech Commands Dataset: https://aiyprojects.withgoogle.com/open_speech_recording

Tools

Maybe it was because no tools were available. However, TensorFlow I/O is available and provides the necessary tools to manipulate sound.

https://github.com/tensorflow/io

So we have everything we need to implement a simple speech recognition system.

#tensorflow #seq2seq #speech-recognition #python
