I am really interested in learning an alien language so that I can communicate with aliens when they visit our planet. So I need a speech-to-text program to decipher their language.
In this tutorial, we are going to learn the easiest way to convert speech to text in Python.
Contents
1- What is speech recognition
2- How to install the speech recognition package using the Anaconda prompt
3- Using the Recognizer class
4- Taking input from microphones
5- Final code snippet
1- What is speech recognition
Speech recognition is the ability of software to identify words and phrases in spoken language and convert them to human-readable text.
2- How to install the speech recognition package using the Anaconda prompt
The SpeechRecognition package is compatible with Python 2.6, 2.7 and 3.3+. However, it requires some additional installation steps for Python 2.
In this tutorial, we use Python 3.7.3. If you don’t have Anaconda installed on your machine, download it from https://www.anaconda.com/distribution/
If you are using Windows, go to All apps, look for the Anaconda3 (64-bit) folder and click on Anaconda Prompt.
Install the Python speech recognition package using this command:
conda install -c conda-forge speechrecognition
It should take 30 seconds to download the package if your internet connection is fast.
You can check the version of the installed package by starting Python from the Anaconda prompt and typing:
```python
>>> import speech_recognition as speech_recog
>>> speech_recog.__version__
'3.6.3'
```
3- Using the Recognizer class
The SpeechRecognition package has a Recognizer class, which recognizes speech and converts it to text. It provides the following seven methods, each of which recognizes speech from an audio source using a different API:
recognize_bing()
recognize_google()
recognize_google_cloud()
recognize_houndify()
recognize_ibm()
recognize_wit()
recognize_sphinx()
For offline speech-to-text conversion, use recognize_sphinx(), which requires the PocketSphinx library to be installed.
We need to create an instance of the Recognizer class. We’ll use the recognize_google() method to access the Google Web Speech API and convert spoken language into text. Note that recognize_google() requires an audio_data argument; calling it without one raises an error.
import speech_recognition as speech_recog
rec = speech_recog.Recognizer()
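Because the seven methods listed earlier all share the `recognize_` prefix, one of them can be looked up by a short engine name. The following is only a minimal sketch and not part of the package: `ENGINES` and `get_recognize_method` are hypothetical names introduced for illustration, and `rec` is assumed to be a Recognizer instance like the one created above. Several of the engines also expect credentials as extra arguments, which this sketch does not handle.

```python
# Engine names taken from the seven recognize_*() methods listed earlier.
ENGINES = ("bing", "google", "google_cloud", "houndify", "ibm", "wit", "sphinx")


def get_recognize_method(rec, engine="google"):
    """Return the bound recognize_<engine>() method of a Recognizer-like object.

    Hypothetical helper for illustration; it is not part of the package.
    """
    if engine not in ENGINES:
        raise ValueError(f"Unknown engine {engine!r}; expected one of {ENGINES}")
    return getattr(rec, f"recognize_{engine}")
```

With this helper, `get_recognize_method(rec, "sphinx")(audio_data)` would be equivalent to calling `rec.recognize_sphinx(audio_data)` directly.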
4- Taking input from microphones
To use a microphone, we have to install the PyAudio module.
From the Anaconda prompt, type the command below:
conda install -c conda-forge PyAudio
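Before moving on, it can help to confirm that both packages can be found by Python. The sketch below uses only the standard library; `is_installed` is a helper name made up for this example, and it relies on the import names being `speech_recognition` and `pyaudio`.

```python
import importlib.util


def is_installed(module_name):
    """Return True if a top-level module with this name can be found."""
    return importlib.util.find_spec(module_name) is not None


# Import names of the two packages installed in this tutorial.
for name in ("speech_recognition", "pyaudio"):
    status = "installed" if is_installed(name) else "missing"
    print(f"{name}: {status}")
```

If either line reports "missing", rerun the corresponding conda install command from the same Anaconda prompt.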
We use the Microphone class to get the input speech from the microphone. For most projects, the default microphone is fine. However, if you do not wish to use the default microphone, you can get the list of available microphone names using the Microphone.list_microphone_names() method and pass the index of the one you want to Microphone as device_index.
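As a rough sketch of picking a non-default microphone: the position of a name in the list returned by list_microphone_names() is the index that the Microphone class accepts as `device_index`. Both `find_microphone_index` and `open_microphone` are hypothetical helpers written for this example, and any keyword such as "usb" is only an illustration.

```python
def find_microphone_index(names, keyword):
    """Return the index of the first name containing keyword, or None.

    The match is case-insensitive. Entries may be None for devices whose
    name could not be retrieved, so those are skipped.
    """
    keyword = keyword.lower()
    for index, name in enumerate(names):
        if name and keyword in name.lower():
            return index
    return None


def open_microphone(keyword):
    """Return a Microphone for the first device whose name matches keyword.

    Falls back to the default microphone (device_index=None) if nothing matches.
    """
    # Imported here so the pure helper above works without the package.
    import speech_recognition as speech_recog

    names = speech_recog.Microphone.list_microphone_names()
    index = find_microphone_index(names, keyword)
    return speech_recog.Microphone(device_index=index)
```

The object returned by `open_microphone("usb")` can be used in a `with` statement in place of `speech_recog.Microphone()`.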
To store the input from the microphone, we use the record() method. We define a variable duration, which sets how many seconds of speech to record from the user, and assign it a default value. Please don’t forget to indent the line after the colon. Below is the code snippet:
duration = 5  # number of seconds to record
with speech_recog.Microphone() as source:
    audio_data = rec.record(source, duration=duration)
To convert the speech stored in the audio_data variable to text, we call the recognize_google() method, which uses the Google Web Speech API, and pass audio_data to it. Below is the code snippet:
text = rec.recognize_google(audio_data)
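recognize_google() can fail in two common ways: the speech may be unintelligible, or the web API may be unreachable. The package signals these cases with UnknownValueError and RequestError. Below is a minimal sketch of handling both; `transcribe` is a hypothetical wrapper written for this example, and the `language` value "en-US" is just an illustration.

```python
def transcribe(rec, audio_data, language="en-US"):
    """Return the recognized text, or None if recognition failed.

    Hypothetical wrapper around rec.recognize_google() for illustration.
    """
    # Imported here because only the exception classes are needed.
    import speech_recognition as speech_recog

    try:
        return rec.recognize_google(audio_data, language=language)
    except speech_recog.UnknownValueError:
        # The API answered, but no words could be recognized.
        print("Could not understand the audio")
    except speech_recog.RequestError as err:
        # The API could not be reached, e.g. no internet connection.
        print(f"Could not reach the Google Web Speech API: {err}")
    return None
```

Returning None lets the caller decide what to do next, for example asking the user to speak again.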
5- Final code snippet
Let’s put everything together. Below is the final code snippet:
```python
import sys

import speech_recognition as speech_recog

# read the duration (in seconds) from the arguments, defaulting to 5
duration = int(sys.argv[1]) if len(sys.argv) > 1 else 5

rec = speech_recog.Recognizer()
print("Please talk")
with speech_recog.Microphone() as source:
    # read the audio data from the default microphone
    audio_data = rec.record(source, duration=duration)
print("Recognizing...")
# convert speech to text
text = rec.recognize_google(audio_data)
print(text)
```
I hope you now understand how to implement speech to text in Python. In my next article, we will build on this to implement such a system on a Raspberry Pi to control electronic gadgets.
#python #speechrecognition #anaconda #PyAudio