Rohit Gowtham
Published in Voice Tech Podcast
4 min read, Jun 16, 2019

An AI Gift to a Father from a Techie

I come from a small middle-class family in southern India. My father works in the judicial department. During my childhood, I rarely saw him come home before I fell asleep. We rarely spent time together, and I have very few memories of him from those years. One day, when I was 12, I went to his office (the court) and found that he was doing the toughest work I had seen up to that time.

He used to sit in front of the judge and type the whole conversation; sometimes he had to remember it and type it later. Voice recorders were available at that time, but in India, and especially in a remote work location, people did not know much about using them. When I was 17, I left home for my graduation. When I visited home during the holidays, I saw that my sister felt the same way. She was too young to express it, but I understood what it was.

During my third year of graduation, I made it my aim to help my father in any way I could, and my mind turned towards “Speech Recognition”. I started to study its concepts, beginning with Microsoft's speech recognition in Windows Vista, and my final project was an application that responds to the user's speech with either an action or a reply. But this solution didn’t help my father much, because the court uses both Telugu and English during the conversation.

Then I looked for open source engines and found that CMU Sphinx was the easiest and fastest one for creating and testing a model. I started using it, but there was too little data available for Telugu and I didn’t have the resources to create it. I joined a small research firm, then moved to a couple more companies, and finally established my own company, “ASRlytics”, in 2016, although work officially started in 2017. We collected enough data, built a model, and released a sample demo in March 2017, two months before Google released its support for Indian languages (https://www.youtube.com/watch?v=u-AtPCseaBA). We have built an engine that works in real time and have demonstrated some of this on our YouTube channel (https://www.youtube.com/channel/UCg9ZQ3LcwoTEYLUInNaN2PQ/featured).

I ran a test trial with some sample documents that are close to my father's work environment. But I felt that it wouldn’t work, as the real scenario has both English and Telugu. So I started building multilingual models that can recognize both languages at once without any model switching. Here is a sample demo (https://www.linkedin.com/feed/update/urn:li:activity:6514532030338949120).

In the end, we tried it in the real environment, but we still failed in scenarios where there was no adequate internet connection, and people won’t wait long for a result after uploading a very large audio file. We ended up with a solution that is super fast and can take files as long as is practically possible: in testing, 9 hours of audio was processed in approximately 10 minutes. This time it worked well in all scenarios, as the waiting time after a file is uploaded is minimal. A sample case trial runs for 20–35 minutes, which our engine processes in approximately 30 seconds, a wait that is hardly noticeable. A demo is provided here (https://youtu.be/2t8LdI1gVyc); we will soon make this publicly available.
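
As a rough sanity check on those numbers, the short sketch below works out the speed-up implied by the 9-hour test and applies it to a 20–35 minute recording. It is only arithmetic on the figures quoted above; the function name `speedup` is made up for illustration and is not part of the engine, and it assumes processing time scales linearly with audio length.

```python
def speedup(audio_seconds: float, processing_seconds: float) -> float:
    """How many times faster than real time the audio was processed."""
    return audio_seconds / processing_seconds


# Figures quoted above: 9 hours of audio in roughly 10 minutes.
factor = speedup(9 * 3600, 10 * 60)
print(f"speed-up: about {factor:.0f}x real time")

# Estimated wait for a 20-35 minute recording at the same rate
# (assumes processing time grows linearly with audio length).
for minutes in (20, 35):
    wait = minutes * 60 / factor
    print(f"{minutes} min of audio -> about {wait:.0f} s")
```

At that rate a 20–35 minute recording would take somewhere between 20 and 40 seconds, which is in line with the roughly 30 seconds mentioned above.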

I still need to approach higher officials of the judicial department to run some real-world trials, and if all goes well I will try to make it a part of the judicial department. That would really help a lot of children like me spend their valuable childhood with their fathers. In the meantime, I want to dedicate this to my father, “Kodali Venkateswara Rao” (not a well-known person), a real hero who sacrificed his life by working tirelessly for me and my growth. Even though I earn pretty well in my current job, my father always wanted me to take a job in his field. I always rejected the idea but never told him why. This is the reason: I cannot take on as much of a load as he did. Thanks a lot for giving me this life, DAD. You’re always awesome.

I would also like to mention another person who has supported most of my crazy ideas, Rajaraman Sundararajan (http://Cognizyr.com). Without his support I wouldn’t have achieved this.