Base image and Logo by Subhashish Panigrahi [CC BY-SA 4.0 (https://creativecommons.org/licenses/by-sa/4.0)] via Wikimedia Commons. Edited by Author.

Kathabhidhana (କଥାଭିଧାନ)

An open toolkit to batch-record words

Prateek Pattanaik
2 min read · Oct 3, 2017


Kathabhidhana is an open toolkit for recording a large number of words. It consists of a few pieces of free/libre and open source software, open datasets, methodologies and documentation. It can be used to record pronunciations of words to make a talking dictionary, or to record phonemes to create text-to-speech software.

A video intro to Kathabhidhana

What it does

Wikipedia has a sister project called Wiktionary, a multilingual dictionary where you can find not just the meanings of words from your own language but also the meanings of words from foreign languages. Unlike many dictionaries that help you learn pronunciations, Wiktionary does not have pronunciations for all words in all languages.

Workflow

Kathabhidhana was originally started to add pronunciations to the Odia-language Wiktionary. It is adapted from free software created by Shrinivasan T. and works on both Linux and Mac. The iOS version of Kathabhidhana was created by yours truly. You can certainly record pronunciations and add them to Wiktionary, but you can also use Kathabhidhana beyond that, by building a large library of pronunciations that can feed any machine learning or Natural Language Processing (NLP) tool.
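To give a rough feel for what batch recording involves, here is a minimal Python sketch of the bookkeeping side only: turning a word list into a plan of audio files and finding which words still need recording. This is not Kathabhidhana's actual code. The function names, the `recordings` folder, the `or` language code and the `<lang_code>-<word>.wav` naming scheme are all assumptions made for illustration, and the audio capture itself is left out.

```python
from pathlib import Path


def plan_recordings(words, out_dir="recordings", lang_code="or", ext="wav"):
    """Map each unique, non-empty word to the audio file it would be saved as."""
    plan = []
    seen = set()
    for raw in words:
        word = raw.strip()
        if not word or word in seen:
            continue  # skip blank lines and repeated words
        seen.add(word)
        # Hypothetical naming scheme (an assumption): <lang_code>-<word>.<ext>
        plan.append((word, Path(out_dir) / f"{lang_code}-{word}.{ext}"))
    return plan


def pending(plan):
    """Return only the entries whose audio file does not exist yet."""
    return [(word, path) for word, path in plan if not path.exists()]


if __name__ == "__main__":
    # A tiny sample word list; a real list would be read from a text file.
    sample = ["କଥା", "ଅଭିଧାନ", "", "କଥା"]
    for word, path in pending(plan_recordings(sample)):
        print(f"{word} -> {path}")
```

Keeping the plan separate from the recording step means an interrupted session can be resumed: words whose files already exist are simply skipped on the next run.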

Use the tool

Head over to our handy GitHub repo, which has everything you need. The iOS part is inside the Kathabhidhana for iOS folder. Here’s the link.

UPDATE: Kathabhidhana is now a part of OpenSpeaks. Head over here to learn more.
