Speech Recognition with Scala

Knoldus Inc.
Knoldus - Technical Insights
Jul 6, 2016

In this blog post I am going to explain how to integrate speech recognition into your Scala project.

Speech recognition converts spoken language into text that our projects can work with. It is an increasingly common feature in electronic and computing devices, where it is used to make them smarter.

In this project we shall be using the CMU Sphinx toolkit, an open source toolkit that supports offline speech recognition. It provides several components that can be chosen according to the needs of the application: speech recognizers, speech decoders, software for acoustic model training, language models, and a pronunciation dictionary.

We shall be using the Sphinx4 speech recognizer, which is a pure Java speech recognition library. It can be used to identify speakers, to adapt models, and to recognize speech and convert it into text.

Now let us look at the code. First of all, let us include the following two dependencies in our build.sbt:

libraryDependencies += "edu.cmu.sphinx" % "sphinx4-core" % "1.0-SNAPSHOT"
libraryDependencies += "edu.cmu.sphinx" % "sphinx4-data" % "1.0-SNAPSHOT"
or
libraryDependencies += "de.sciss" % "sphinx4-core" % "1.0.0"
libraryDependencies += "de.sciss" % "sphinx4-data" % "1.0.0"

The next step is to import the Sphinx4 API into the application:

import edu.cmu.sphinx.api._

Next comes the code that sets up the configuration for speech recognition:

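object SpeechRecognitionApp extends App {
  // Holds the model and dictionary locations used by the recognizer
  val configuration = new Configuration
  // Acoustic model: maps audio features to phonemes
  configuration.setAcousticModelPath("file:models/acoustic/wsj")
  // Pronunciation dictionary: maps words to phoneme sequences
  configuration.setDictionaryPath("file:models/acoustic/wsj/dict/cmudict.0.6d")
  // Language model: describes which word sequences are likely
  configuration.setLanguageModelPath("file:models/language/en-us.lm.dmp")
}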

The code above creates a Configuration object that holds the paths to the acoustic model, the language model, and the dictionary.
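Since all three paths point at files on disk, it can be worth checking them before handing them to the recognizer. The following is only a minimal sketch of such a check. The names `ModelPaths`, `ModelPathCheck`, `toLocalPath`, and `missing` are hypothetical helpers, not part of Sphinx4, and the sketch assumes every location is either a plain file path or uses the `file:` prefix shown above.

```scala
import java.nio.file.{Files, Path, Paths}

// Hypothetical holder for the three locations passed to Configuration.
final case class ModelPaths(acoustic: String, dictionary: String, language: String)

object ModelPathCheck {
  // Strip an optional "file:" prefix to get a local filesystem path.
  def toLocalPath(location: String): Path =
    Paths.get(location.stripPrefix("file:"))

  // Return the labels of all locations that do not exist on disk.
  def missing(paths: ModelPaths): List[String] =
    List(
      "acoustic model" -> paths.acoustic,
      "dictionary"     -> paths.dictionary,
      "language model" -> paths.language
    ).collect {
      case (label, location) if !Files.exists(toLocalPath(location)) => label
    }
}
```

Calling `ModelPathCheck.missing(...)` with the three strings used in the configuration returns an empty list when everything is in place, so the application can print a clear message and stop early otherwise.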

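The cmudict file is a plain-text pronunciation dictionary of recognizable words, and it can be extended for a given language. Each line maps a word to its sequence of phonemes, and an alternative pronunciation of the same word is marked with a number in parentheses. It looks something like this: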

ONE HH W AH N
ONE(2) W AH N
TWO T UW
THREE TH R IY
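To make the layout of these lines concrete, here is a small sketch that parses one dictionary line into a word, a pronunciation variant, and a list of phonemes. `DictEntry` and `CmuDictLine` are hypothetical names introduced for illustration. Sphinx4 reads the dictionary itself, so this parser is not needed for recognition, and it assumes every line has the simple shape shown above.

```scala
// One dictionary entry: the word, its pronunciation variant (1 if unmarked), and its phonemes.
final case class DictEntry(word: String, variant: Int, phonemes: List[String])

object CmuDictLine {
  // Matches a word carrying a variant marker, such as "ONE(2)".
  private val WordWithVariant = """(.+)\((\d+)\)""".r

  // Parse one line; returns None for blank lines or lines without phonemes.
  def parse(line: String): Option[DictEntry] =
    line.trim.split("\\s+").toList match {
      case head :: phonemes if head.nonEmpty && phonemes.nonEmpty =>
        head match {
          case WordWithVariant(word, n) => Some(DictEntry(word, n.toInt, phonemes))
          case word                     => Some(DictEntry(word, 1, phonemes))
        }
      case _ => None
    }
}
```

A helper like this can be handy when extending the dictionary, for example to check that a new entry has at least one phoneme before appending it to the file.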

Now the last step is to create the recognizer object, start recognition, and read the results. Each result carries its hypothesis as a string, which we print.

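println("Start speaking:")
val speechRecognizer = new LiveSpeechRecognizer(configuration)
// Start capturing from the microphone; `true` clears previously cached audio
speechRecognizer.startRecognition(true)
// Read results until getResult returns null, printing each hypothesis
// (the earlier extra getResult call before the loop dropped the first result)
Iterator
  .continually(speechRecognizer.getResult)
  .takeWhile(_ != null)
  .foreach(result => println(result.getHypothesis))
speechRecognizer.stopRecognition()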
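The loop above keeps reading results until `getResult` returns null. As a rough sketch of that polling pattern on its own, the helper below does the same with any function that eventually returns null. `PollUntilNull`, `foreachResult`, and the fake result source in `PollDemo` are hypothetical stand-ins, so the example runs without a microphone or Sphinx4 on the classpath.

```scala
import scala.collection.mutable

object PollUntilNull {
  // Call `next` repeatedly and pass each value to `handle`, stopping at the first null.
  def foreachResult[A <: AnyRef](next: () => A)(handle: A => Unit): Unit =
    Iterator.continually(next()).takeWhile(_ != null).foreach(handle)
}

// Usage with a stand-in for the recognizer: two phrases, then null.
object PollDemo extends App {
  private val pending = mutable.Queue("hello world", "good morning")
  private def fakeGetResult(): String =
    if (pending.nonEmpty) pending.dequeue() else null

  PollUntilNull.foreachResult(() => fakeGetResult())(println)
}
```

With a real recognizer, the first argument would be something like `() => speechRecognizer.getResult` and the handler would read `getHypothesis` from each result.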

Now we are done, and we can use the above code to add speech recognition to our Scala project.
