What now?

“Now that the basic aim is fulfilled, what more can we work on, given there is almost half a month left until GSoC submission?”

Well, as of now the phoneme transcription is based purely on the way a word is written, not on how it is actually spoken. What I mean is that there are exceptions where a word is written one way and pronounced another. This was pointed out by Deepa ma'am. She also asked if I could convert some of the existing linguistic rules (algorithms) that were made with Malayalam TTS in mind, so that they could be used to redesign the phoneme transcription. This could also turn out to be helpful in the future, for example in a fully intelligent phoneme transcriber for Malayalam language modeling.

This is what we are working on right now, and I am literally scratching my head over some loops in Python!

juzzzz jokinnn
The basic idea is to iterate over each line in the ‘ml.dic’ file and validate the transcription I made earlier against the set of rules, correcting any entry that turns out to be invalid along the way.
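A minimal sketch of that loop, just to show the shape of it. It assumes each line of ‘ml.dic’ is a word followed by its phones separated by whitespace. The names `toy_rule` and `correct_dictionary` are made up for this example, and the rule itself is a placeholder, not one of the actual Malayalam TTS rules.

```python
def toy_rule(phones):
    """Placeholder rule (NOT a real Malayalam rule):
    drop a phone that is repeated back to back."""
    fixed = []
    for phone in phones:
        if not fixed or fixed[-1] != phone:
            fixed.append(phone)
    return fixed


def correct_dictionary(lines, rule=toy_rule):
    """Apply `rule` to the transcription on every non-empty line."""
    corrected = []
    for line in lines:
        parts = line.split()
        if not parts:
            continue  # skip blank lines
        word, phones = parts[0], parts[1:]
        corrected.append(" ".join([word] + rule(phones)))
    return corrected


# Made-up entries standing in for lines read from the dictionary file.
sample = ["word1 k a a l", "word2 m a l a"]
result = correct_dictionary(sample)
```

For the real file, the same function could be fed an open file object, since iterating over a file yields its lines one by one.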

Seems pretty straightforward! Will see how it goes!


Update — 4th August

Whew! This is going nuts! OK, so I first tried using lists to classify the different types of phones. It was all good until I reached a point in the algorithm where I had to check whether the current phoneme in the transcription is a member of a particular class of phonemes (when I say class of phonemes, I just mean the classification, not a Python class). Of course I can search a list for the presence of an element, and that is quite sufficient for a small number of comparisons. Our case is different. We are talking about around 7000 words in a file, and on top of that each line goes through a significant number of if-elif clauses.

This could slow things down and make the script less efficient (we will eventually see the difference). So I went back to the Python documentation and read about the set types (set and frozenset).

“A set object is an unordered collection of distinct hashable objects. Common uses include membership testing, removing duplicates from a sequence, and computing mathematical operations such as intersection, union, difference, and symmetric difference.” So says the Python documentation.

This is exactly what I wanted. I mean, I don’t have to modify the phoneme classes, so there is no real point in using a list. Furthermore, a set supports the ‘in’ operator, so membership can be checked with a hash lookup instead of scanning through every element. How cool is that!
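Roughly, the phoneme classes can be kept as frozensets and tested with ‘in’ inside the if-elif chain. This is only a sketch: the phone symbols and class names below are placeholders chosen for the example, not the actual phone set used in the script.

```python
# Placeholder phoneme classes (assumed symbols, not the real phone set).
# frozenset fits because the classes never change after being defined.
VOWELS = frozenset({"a", "i", "u"})
NASALS = frozenset({"m", "n"})


def classify(phone):
    """Return the class name of a phone using set membership tests."""
    if phone in VOWELS:
        return "vowel"
    elif phone in NASALS:
        return "nasal"
    return "other"


labels = [classify(p) for p in ["m", "a", "k"]]
```

A membership test on a set takes roughly constant time on average, while the same test on a list has to walk the elements one by one, which is the difference that matters when the check is repeated for every phone in thousands of entries.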



Update — 9th August

So, after some tests on the script, I generated the dictionary file once again, this time applying some of the TTS rules. SphinxTrain is now running with this dictionary file. Hopefully there will be some change in the accuracy!

left panel with new dictionary, right panel with old dictionary

This might as well be the last development phase update if all goes well. Then it is submission time.

puts 'until then ciao'