Transcribing an audio file using the Speech Recognition library and Python
Jul 29, 2022
What is transcription?
Transcription is the process of converting speech from an audio or video file into text. It involves more than just listening to a recording: the content must be understood and nothing left out. Audio and video files can be transcribed manually or programmatically. Python has a library called SpeechRecognition (imported as speech_recognition) that can be used for transcription.
Transcribe an audio file from the Open Speech Repository
Step 1: Create a new folder and add a .wav file to it
Click HERE to find a .wav file of your choice.
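Before moving on, it can help to confirm that the downloaded file really is a readable WAV file. The sketch below is one way to do that with Python's standard wave module; the function name describe_wav is hypothetical and is not part of the Speech Recognition library.

```python
import wave


def describe_wav(path):
    """Return basic properties of a PCM WAV file as a dictionary."""
    # wave.open raises wave.Error if the file is not a PCM WAV file
    with wave.open(path, "rb") as wav_file:
        frame_count = wav_file.getnframes()
        sample_rate = wav_file.getframerate()
        return {
            "channels": wav_file.getnchannels(),
            "sample_width_bytes": wav_file.getsampwidth(),
            "sample_rate_hz": sample_rate,
            "duration_seconds": frame_count / sample_rate,
        }
```

Calling describe_wav("harvard.wav") from the new folder should return the channel count, sample width, sample rate and duration of the file, assuming your file is named harvard.wav as in Step 5.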
Step 2: Create a virtual environment
Create a virtual environment with Python's built-in venv module. The command below creates it in a folder named virtual inside the current directory.
python -m venv virtual
Step 3: Activate the virtual environment
On Windows:
virtual\Scripts\activate
On macOS or Linux:
source virtual/bin/activate
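If you are not sure whether the activation worked, a small check like the one below can help. It is only a sketch and assumes the environment was created with the venv module, as in Step 2; the function name in_virtual_environment is made up for this example.

```python
import sys


def in_virtual_environment():
    """Return True when the running interpreter belongs to a venv."""
    # Inside a venv, sys.prefix points at the environment folder,
    # while sys.base_prefix still points at the base Python installation.
    return sys.prefix != sys.base_prefix


if __name__ == "__main__":
    print("Virtual environment active:", in_virtual_environment())
    print("Interpreter prefix:", sys.prefix)
```

When the environment is active, the printed prefix should end with the virtual folder created in Step 2.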
Step 4: Install the Speech Recognition library in the virtual environment
pip install SpeechRecognition
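To confirm that the library is visible to the interpreter you are using, you could run a quick check like the following. This is a minimal sketch that relies only on the standard library; is_installed is a hypothetical helper name.

```python
from importlib import util


def is_installed(module_name):
    """Return True if a top-level module can be found by the import system."""
    # find_spec returns None when no module with that name is available
    return util.find_spec(module_name) is not None


if __name__ == "__main__":
    # speech_recognition is the import name of the SpeechRecognition package
    print("speech_recognition installed:", is_installed("speech_recognition"))
```

If this prints False while the virtual environment is active, the library was most likely installed into a different Python installation.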
Step 5: Create a Python file
import speech_recognition as sr

# Initialize the recognizer class
r = sr.Recognizer()
# Create the audio file object
audio_file = sr.AudioFile("harvard.wav")
# Read the whole audio file into memory
with audio_file as source:
    audio = r.record(source)
# Transcribe the recorded audio and print the text
result = r.recognize_google(audio)
print(result)
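The script above assumes that everything goes well. In practice, recognize_google sends the audio to Google's Web Speech API, so it needs an internet connection and can fail when the speech is unintelligible or the service cannot be reached. The sketch below shows one possible way to handle those two cases using the library's UnknownValueError and RequestError exceptions. The function name transcribe_file and the choice to return None on failure are assumptions made for this example, not part of the library.

```python
def transcribe_file(path, language="en-US"):
    """Transcribe an audio file, returning None when transcription fails."""
    # Imported inside the function so it can be defined even before
    # the library has been installed (a choice made for this sketch).
    import speech_recognition as sr

    recognizer = sr.Recognizer()
    # Read the whole file into memory, as in the script above
    with sr.AudioFile(path) as source:
        audio_data = recognizer.record(source)

    try:
        return recognizer.recognize_google(audio_data, language=language)
    except sr.UnknownValueError:
        # Raised when the service could not make sense of the audio
        print("The speech could not be understood.")
    except sr.RequestError as error:
        # Raised when the request fails, for example with no connection
        print(f"The request to the speech service failed: {error}")
    return None
```

Calling transcribe_file("harvard.wav") from the project folder should give the same text as the script above when the request succeeds.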