Speech is first converted from physical sound to an electrical signal by a microphone, and then to digital data by an analog-to-digital converter. This digital data can then be converted into text using various algorithms.
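As a rough illustration of the analog-to-digital step, the sketch below samples a continuous signal at fixed intervals and rounds each sample to a 16-bit integer. The 16 kHz sample rate, the 440 Hz test tone, and the function names are assumptions chosen for illustration; a real converter does this in hardware.

```python
import math

SAMPLE_RATE = 16_000   # samples per second (assumed value, common for speech)
MAX_INT16 = 32767      # largest value of a signed 16-bit sample


def digitize(signal, duration_s, sample_rate=SAMPLE_RATE):
    """Sample signal(t) at regular intervals and quantize to 16-bit integers.

    signal is a function of time in seconds returning a value in [-1, 1].
    """
    n_samples = round(duration_s * sample_rate)
    samples = []
    for i in range(n_samples):
        value = signal(i / sample_rate)          # sampling: read at time t
        value = max(-1.0, min(1.0, value))       # clip to the valid range
        samples.append(round(value * MAX_INT16)) # quantization: round to int
    return samples


def tone(t):
    """A 440 Hz sine wave standing in for the microphone's output."""
    return math.sin(2 * math.pi * 440 * t)


samples = digitize(tone, duration_s=0.01)
print(len(samples), samples[:4])
```

The resulting list of integers is the kind of digital data that a recognition algorithm takes as input.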
Multiple speech recognition packages are available in Python, each offering different functionality. One of them is the SpeechRecognition package, which can be installed by running the following command in the terminal:
pip install SpeechRecognition
After installing this package, we can implement speech recognition in Python, as shown below. Note that sr.Microphone also requires the PyAudio package to be installed.
import speech_recognition as sr

def takecommand():
    r = sr.Recognizer()
    # Record a single phrase from the default microphone.
    with sr.Microphone() as source:
        print('listening....')
        r.pause_threshold = 1
        audio = r.listen(source, timeout=3, phrase_time_limit=5)
    try:
        print("Recognizing....")
        # Send the recorded audio to the Google Web Speech API for transcription.
        query = r.recognize_google(audio, language='en-in')
        print("Let's talk about {}.".format(query))
    except Exception as e:
        print("voice not recognized")
The pause_threshold value is the number of seconds of silence after which the recognizer treats the phrase as complete. The timeout value is the maximum number of seconds listen() will wait for the user to start speaking before it raises a WaitTimeoutError exception. The phrase_time_limit value is the maximum number of seconds a phrase may last. In this case, it is 5, which means that if the user speaks for more than 5 seconds, recording stops and anything said after that point is not recognized.
The code above can only recognize speech in English, since the language is set to en-in (Indian English). If recognition fails, control passes to the except block, and "voice not recognized" will be printed, as we have encountered an exception.
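To see how the three timing values interact, the toy model below walks over a timeline of frames that are marked as either speech or silence. It is a simplified sketch, not the SpeechRecognition implementation: capture_phrase, SimulatedWaitTimeout, and the half-second frame size are hypothetical names and values, and the real library detects speech from audio energy levels rather than ready-made booleans.

```python
class SimulatedWaitTimeout(Exception):
    """Hypothetical stand-in for the timeout error raised while listening."""


def capture_phrase(frames, frame_seconds=0.5, timeout=3,
                   pause_threshold=1, phrase_time_limit=5):
    """Return the half-open (start, end) range of frames that get recorded.

    frames is a list of booleans, one per frame; True means speech was heard.
    """
    # Phase 1: wait for speech to begin, giving up after `timeout` seconds.
    start = None
    for i, is_speech in enumerate(frames):
        if is_speech:
            start = i
            break
        if (i + 1) * frame_seconds >= timeout:
            break
    if start is None:
        raise SimulatedWaitTimeout("no speech before the timeout")

    # Phase 2: record until a long enough pause or the phrase time limit.
    silent_frames = 0
    end = start
    for i in range(start, len(frames)):
        if (i - start) * frame_seconds >= phrase_time_limit:
            break  # phrase_time_limit reached: the recording is cut off
        end = i + 1
        if frames[i]:
            silent_frames = 0
        else:
            silent_frames += 1
            if silent_frames * frame_seconds >= pause_threshold:
                break  # the pause is long enough: the phrase is complete
    return start, end


S, q = True, False  # speech frame, quiet frame
print(capture_phrase([q, q, S, S, S, q, q, S]))  # (2, 7): a 1 s pause ends the phrase
print(capture_phrase([S] * 14))                  # (0, 10): cut off after 5 s of speech
```

In the first call, the word spoken after the one-second pause is left out of the recording; in the second, seven seconds of continuous speech are truncated to five. A timeline with three or more seconds of initial silence raises SimulatedWaitTimeout instead.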