Nowadays, high-accuracy voice recognition is available on smartphones and other smart devices. However, these systems are provided by big companies like Google, Amazon, or Apple, and they are not free.

Many people, including myself, thought this was because of a lack of free data. Nowadays, however, free data is easy to find on the Internet.

Voice datasets

Dataset sizes for some languages (2020/10/12)

  • English → 50 GB
  • German → 16 GB
  • French → 15 GB
  • Japanese → 265 MB

Some other data here:

Tools

Maybe it was because no tools were available. However, TensorFlow I/O is available and provides the tools needed to manipulate sound.

So we have everything we need to implement a simple speech recognition system.

#tensorflow #seq2seq #speech-recognition #python

A Journey to Speech Recognition using TensorFlow