Text to speech and speech to text synthesizer using Swift
Text to speech (TTS)
In Swift, text-to-speech (TTS) is provided by the AVFoundation framework through the AVSpeechSynthesizer class, while speech-to-text (STT) is provided by the separate Speech framework through the SFSpeechRecognizer class.
Example of using AVSpeechSynthesizer to speak a piece of text:
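import AVFoundation
// Keep a strong reference to the synthesizer (for example, as a property);
// speech can be cut off if it is deallocated while speaking
let synthesizer = AVSpeechSynthesizer()
let utterance = AVSpeechUtterance(string: "Hello, world!")
// Use the default voice for US English
utterance.voice = AVSpeechSynthesisVoice(language: "en-US")
utterance.rate = 0.5
// Queue the utterance; speak(_:) returns immediately
synthesizer.speak(utterance)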
This uses the default voice for the specified language, here English (United States), to speak the text “Hello, world!” at a rate of 0.5.
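The rate is a Float that the framework bounds with the constants AVSpeechUtteranceMinimumSpeechRate and AVSpeechUtteranceMaximumSpeechRate, with AVSpeechUtteranceDefaultSpeechRate as the default. As a minimal sketch, a requested rate could be kept inside that range like this; the helper name clampedRate and the 0.8 factor are illustrative assumptions, not part of AVFoundation:

```swift
import AVFoundation

// Hypothetical helper (not an AVFoundation API): keep a requested rate
// inside the range that AVSpeechUtterance accepts.
func clampedRate(_ requested: Float) -> Float {
    return min(max(requested, AVSpeechUtteranceMinimumSpeechRate),
               AVSpeechUtteranceMaximumSpeechRate)
}

let slowUtterance = AVSpeechUtterance(string: "Hello, world!")
// Speak somewhat slower than the default rate
slowUtterance.rate = clampedRate(AVSpeechUtteranceDefaultSpeechRate * 0.8)
```

Expressing the rate relative to the default constant, rather than as a bare number, makes the intent (slower or faster than normal) easier to read.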
The built-in speech synthesizer can speak multiple languages, such as Chinese, Japanese and French. To tell the synthesizer which language to speak, pass the correct language code when creating the instance of AVSpeechSynthesisVoice.
To find out all the language codes that the device supports, call the speechVoices() method of AVSpeechSynthesisVoice:
let voices = AVSpeechSynthesisVoice.speechVoices()
for voice in voices {
    print(voice.language)
}
Here are some of the supported language codes:
- Japanese — ja-JP
- Korean — ko-KR
- French — fr-FR
- Italian — it-IT
- Cantonese — zh-HK
- Mandarin — zh-TW
- Putonghua — zh-CN
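Because AVSpeechSynthesisVoice(language:) is a failable initializer, a code with no matching voice on the device yields nil. The sketch below shows one way to fall back to the system language and to list the installed voices per language code; the helper name voice(forLanguage:) is an assumption made for illustration:

```swift
import AVFoundation

// Hypothetical helper: return a voice for the requested code, or fall back
// to a voice for the user's current system language.
func voice(forLanguage code: String) -> AVSpeechSynthesisVoice? {
    if let match = AVSpeechSynthesisVoice(language: code) {
        return match
    }
    return AVSpeechSynthesisVoice(language: AVSpeechSynthesisVoice.currentLanguageCode())
}

// Group the installed voices by language code, since several voices
// can share one code.
let voicesByLanguage = Dictionary(grouping: AVSpeechSynthesisVoice.speechVoices(),
                                  by: { $0.language })
for code in voicesByLanguage.keys.sorted() {
    let names = voicesByLanguage[code, default: []].map { $0.name }
    print(code, names.joined(separator: ", "))
}
```

The output depends on which voices are installed on the device, so the printed list will differ between devices and OS versions.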
If you need to interrupt the speech synthesizer, use the stopSpeaking(at:) method:
synthesizer.stopSpeaking(at: .immediate)
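To find out whether an utterance finished normally or was cut off by stopSpeaking(at:), you can adopt AVSpeechSynthesizerDelegate. The following is a sketch only; the SpeechLogger class, its lastEvent property and the variable names are hypothetical and chosen for illustration:

```swift
import AVFoundation

// Hypothetical delegate that records how the most recent utterance ended.
final class SpeechLogger: NSObject, AVSpeechSynthesizerDelegate {
    private(set) var lastEvent = "none"

    // Called when an utterance was spoken to the end
    func speechSynthesizer(_ synthesizer: AVSpeechSynthesizer,
                           didFinish utterance: AVSpeechUtterance) {
        lastEvent = "finished"
    }

    // Called when an utterance was cancelled, e.g. by stopSpeaking(at:)
    func speechSynthesizer(_ synthesizer: AVSpeechSynthesizer,
                           didCancel utterance: AVSpeechUtterance) {
        lastEvent = "cancelled"
    }
}

let logger = SpeechLogger()
let loggedSynthesizer = AVSpeechSynthesizer()
// The delegate is held weakly, so keep `logger` alive yourself
loggedSynthesizer.delegate = logger
```

Passing .word instead of .immediate to stopSpeaking(at:) lets the synthesizer finish the current word before it stops.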
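You can also control other aspects of the speech, such as the pitch, the volume, and the delay before the utterance starts. Note that speak(_:) has no synchronous variant: it queues the utterance and returns immediately, and the speech itself happens asynchronously. Set these properties before passing the utterance to speak(_:), and create a new AVSpeechUtterance for each call rather than queuing the same instance twice.
// Pitch multiplier, from 0.5 (lower) to 2.0 (higher); the default is 1.0
utterance.pitchMultiplier = 1.5
// Volume, from 0.0 (silent) to 1.0 (loudest)
utterance.volume = 0.7
// Seconds to wait before the utterance starts
utterance.preUtteranceDelay = 0.5
// Queue the utterance; this call returns before the speech has finished
synthesizer.speak(utterance)
// Pause at the next word boundary; resume later with continueSpeaking()
synthesizer.pauseSpeaking(at: .word)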
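A paused synthesizer is resumed with continueSpeaking(). As a rough sketch, a pause/resume toggle (for a button, say) could look like the following; the PlaybackChange enum and the togglePause(on:) function are hypothetical names introduced here:

```swift
import AVFoundation

// Hypothetical result type describing what the toggle did
enum PlaybackChange {
    case paused, resumed, idle
}

// Hypothetical helper: pause a speaking synthesizer, resume a paused one,
// and do nothing when there is no speech in progress.
func togglePause(on synthesizer: AVSpeechSynthesizer) -> PlaybackChange {
    // Check isPaused first, before looking at isSpeaking
    if synthesizer.isPaused {
        return synthesizer.continueSpeaking() ? .resumed : .idle
    }
    if synthesizer.isSpeaking {
        return synthesizer.pauseSpeaking(at: .word) ? .paused : .idle
    }
    return .idle
}
```

Both pauseSpeaking(at:) and continueSpeaking() return a Bool that indicates whether the request succeeded, which the sketch uses to report the outcome.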
Speech to text (STT)
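The following code sets up an SFSpeechRecognizer object with the English (United States) locale, creates a recognition request for a recorded audio file, and starts a recognition task. The result handler can be called several times with partial results; when the final result arrives, the recognized text is printed to the console.
import Speech
// Hypothetical path to a recorded audio file; replace it with your own
let audioURL = URL(fileURLWithPath: "/path/to/recording.m4a")
let recognizer = SFSpeechRecognizer(locale: Locale(identifier: "en-US"))
// SFSpeechRecognitionRequest is abstract, so use a concrete subclass:
// SFSpeechURLRecognitionRequest for files, SFSpeechAudioBufferRecognitionRequest for live audio
let request = SFSpeechURLRecognitionRequest(url: audioURL)
recognizer?.recognitionTask(with: request, resultHandler: { (result, error) in
    if let error = error {
        print(error.localizedDescription)
        return
    }
    guard let result = result else { return }
    // isFinal marks the last result of the task
    if result.isFinal {
        print(result.bestTranscription.formattedString)
    }
})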
Speech recognition requires the user's permission and may not be available in all regions or languages, so make sure to handle errors and edge cases appropriately.
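Permission is requested with SFSpeechRecognizer.requestAuthorization(_:), and the app's Info.plist must contain an NSSpeechRecognitionUsageDescription entry explaining why speech recognition is needed. The sketch below shows one possible way to handle the outcome; the function names describe(_:) and requestSpeechPermission() and the message strings are assumptions made for illustration:

```swift
import Speech

// Hypothetical helper: turn an authorization status into a readable message.
func describe(_ status: SFSpeechRecognizerAuthorizationStatus) -> String {
    switch status {
    case .authorized: return "authorized"
    case .denied: return "denied by the user"
    case .restricted: return "restricted on this device"
    case .notDetermined: return "not requested yet"
    @unknown default: return "unknown"
    }
}

// Hypothetical helper: ask for permission and report the outcome.
func requestSpeechPermission() {
    SFSpeechRecognizer.requestAuthorization { status in
        // The callback may arrive on a background queue, so hop to the
        // main queue before touching any UI.
        DispatchQueue.main.async {
            print("Speech recognition is \(describe(status))")
        }
    }
}
```

Call requestSpeechPermission() before starting a recognition task. Even after authorization, it is worth checking the recognizer's isAvailable property, since recognition can be temporarily unavailable (for example, without a network connection).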