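1) Generate the WAV file to be recognized. SpeechRecognition's AudioFile reads WAV (as well as AIFF and FLAC) files; it cannot read MP3 files.
Install the libraries: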
sudo apt install espeak ffmpeg libespeak1
pip install pyttsx3
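Code:
def demo_tts_wav():
    import pyttsx3

    engine = pyttsx3.init()
    engine.setProperty('rate', 150)    # speaking rate in words per minute
    engine.setProperty('volume', 1.0)  # volume, from 0.0 to 1.0
    voices = engine.getProperty('voices')
    # voices[0] is simply the first voice the driver lists;
    # choose a Mandarin voice here if one is installed
    engine.setProperty('voice', voices[0].id)
    text = '你好,我是一个AI机器人'  # "Hello, I am an AI robot"
    #engine.say(text)  # uncomment to also speak the text aloud
    filename = 'ni_hao.wav'
    engine.save_to_file(text, filename)  # queued; the file is written by runAndWait()
    engine.runAndWait()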
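Before handing the file to the recognizer, it can help to confirm that the output really is a readable PCM WAV file. The following is a minimal sketch that uses only the standard-library wave module; the helper name wav_info is made up for illustration and is not part of pyttsx3 or SpeechRecognition.

```python
import wave


def wav_info(path):
    """Return basic header fields of a PCM WAV file (hypothetical helper)."""
    # wave.open raises wave.Error if the file is not a PCM WAV file
    with wave.open(path, 'rb') as wf:
        frames = wf.getnframes()
        rate = wf.getframerate()
        return {
            'channels': wf.getnchannels(),
            'sample_width_bytes': wf.getsampwidth(),
            'sample_rate': rate,
            # Guard against a zero sample rate in a damaged header
            'duration_s': frames / rate if rate else 0.0,
        }
```

For example, print(wav_info('ni_hao.wav')) can be called after demo_tts_wav() has run; a duration of 0 would suggest that the engine wrote an empty file.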
2) Speech recognition with speech_recognition
Install the libraries:
pip install SpeechRecognition
pip install pocketsphinx
Download the Mandarin model files from the CMU Sphinx project on SourceForge.net (Acoustic and Language Models/Mandarin).
pip install vosk
Download a model into the code directory (see the VOSK Models page).
Unzip it and rename the extracted folder to model.
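The unzip-and-rename step can also be scripted. This is only a sketch under two assumptions: the downloaded archive is a .zip file, and it contains a single top-level folder. The function name install_model and the archive path passed to it are hypothetical.

```python
import os
import zipfile


def install_model(zip_path, target='model'):
    """Extract a model archive and rename its top-level folder to `target`."""
    target = os.path.abspath(target)
    if os.path.isdir(target):
        return target  # already in place, nothing to do
    parent = os.path.dirname(target)
    with zipfile.ZipFile(zip_path) as zf:
        # Assumption: everything in the archive sits under one top-level folder
        top_levels = {n.split('/')[0] for n in zf.namelist() if n.strip('/')}
        if len(top_levels) != 1:
            raise ValueError('expected exactly one top-level folder in the archive')
        zf.extractall(parent)
    extracted = os.path.join(parent, top_levels.pop())
    if extracted != target:
        os.rename(extracted, target)
    return target
```

Calling install_model('downloaded-model.zip') from the code directory would leave a folder named model next to the script, which is where the code in this section expects it.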
Code:
def demo_speech_recognition():
    import speech_recognition as sr

    r = sr.Recognizer()
    try:
        audio_file = sr.AudioFile('ni_hao.wav')
        with audio_file as source:
            audio_data = r.record(source)  # read the whole file into memory
        # Alternative online engines (recognize_wit also needs an API key):
        #text = r.recognize_google(audio_data, language='zh-CN')
        #text = r.recognize_wit(audio_data)
        # Offline recognition with the Vosk model in ./model
        text = r.recognize_vosk(audio_data, language='zh-CN')
        print("Recognition result:", text)
    except Exception as e:
        print("Could not recognize speech:", str(e))
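Vosk itself reports results as JSON of the form {"text": "..."}, and some SpeechRecognition versions appear to return that string from recognize_vosk unchanged. Assuming that is what your installed version does, a small helper can pull out the transcript and fall back to the raw value otherwise. The name vosk_text and the strip_spaces option are hypothetical; removing spaces is just a display choice for Chinese text, where the recognizer may separate words with spaces.

```python
import json


def vosk_text(result, strip_spaces=False):
    """Extract the transcript from a Vosk-style result (hypothetical helper)."""
    if isinstance(result, dict):
        parsed = result  # already decoded
    else:
        try:
            parsed = json.loads(result)
        except (TypeError, ValueError):
            # Not JSON: assume the recognizer already returned plain text
            return result
    if isinstance(parsed, dict):
        text = parsed.get('text', '')
    else:
        text = str(result)
    return text.replace(' ', '') if strip_spaces else text
```

In the function above, this would be used as print(vosk_text(text, strip_spaces=True)) in place of printing the raw result.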