Automatic speech recognition — Experimentations

7 min read · Jun 8, 2024

In this article, I’ll take you through how I tried out automatic speech recognition on my PC. The beginning is always weak, but it is important to embrace that weakness. During my bachelor’s degree, we had a module called Signals and Systems, where we learned how audio signal processing works: how analog sound waves are converted into digital signals that a machine can store and manipulate. Nowadays we have Google Cloud Speech Recognition, Sphinx, Wit.ai, Microsoft Bing Voice Recognition, Microsoft Azure Speech, Houndify, IBM Speech to Text, Whisper, and the Whisper API (from OpenAI). Let’s try the ones that don’t need an API.

SECTION 1 — Educate myself about automatic speech recognition

It helps to have a quick rundown of the key terms around speech recognition first; after that, it will be easier to absorb what happens in the experiments.

The first basic thing to know is the process of audio signal processing.

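Sampling: The amplitude of the analog sound wave is measured at regular time intervals. The number of measurements per second is the sampling rate.
Quantization: Each measured amplitude is mapped to the nearest of a fixed set of discrete levels. The bit depth decides how many levels there are.
Encoding: Each quantized level is given a binary code, so the whole signal can be stored as a sequence of bits.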
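To make these three steps concrete, here is a minimal sketch in plain Python. It is only my illustration, not how a real sound card or audio library does it: the function names (sample, quantize, encode), the 1 Hz sine tone, the 8 Hz sampling rate, and the 3-bit depth are all assumptions picked to keep the numbers small. The quantizer is a simple uniform one that splits the range from -1 to 1 into equal-width bins.

```python
import math


def sample(signal, duration_s, sample_rate_hz):
    """Sampling: read the signal's amplitude at evenly spaced points in time."""
    count = int(duration_s * sample_rate_hz)
    return [signal(i / sample_rate_hz) for i in range(count)]


def quantize(samples, bit_depth):
    """Quantization: map each amplitude in [-1, 1] to one of 2**bit_depth bins."""
    levels = 2 ** bit_depth
    indices = []
    for s in samples:
        s = max(-1.0, min(1.0, s))              # clamp to the valid range
        index = int((s + 1.0) / 2.0 * levels)   # equal-width bin number
        indices.append(min(levels - 1, index))  # keep +1.0 in the top bin
    return indices


def encode(indices, bit_depth):
    """Encoding: write each bin number as a fixed-width binary code word."""
    return [format(i, f"0{bit_depth}b") for i in indices]


def tone(t):
    """Hypothetical 'analog' input: a 1 Hz sine wave with amplitude 1."""
    return math.sin(2 * math.pi * 1.0 * t)


samples = sample(tone, duration_s=1.0, sample_rate_hz=8)
indices = quantize(samples, bit_depth=3)
codes = encode(indices, bit_depth=3)

for s, i, c in zip(samples, indices, codes):
    print(f"amplitude {s:+.3f} -> level {i} -> code {c}")
```

Raising sample_rate_hz gives more measurements per second, and raising bit_depth gives finer amplitude levels. Both make the digital copy closer to the original wave, at the cost of more bits to store.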

Written by Ishara Usoof