Milestone 1: Conquered
We look far ahead and call it ‘the future’, but fail to realize that the very next step is the closest milestone on the way there.
OK, so the week has been very interesting. As usual, I started off doing the same thing I have been doing for the past 3 weeks: transcribing Malayalam words to their phonetic representation. I keep saying that this is getting more boring by the day. With the mid-term evaluation starting on Monday of next week, I had to complete the phonetic transcription by Saturday or Sunday. That seemed very unlikely at the pace I was going.
To keep myself busy and not get bored of Google Summer of Code (even though I wanted this), I thought of learning a new language that would be useful in the future: Rust. It took my mind off things for a little while, but it was not as much help as I had hoped.
In the meantime, speech recording has been going on among my friends, and there is an update on that front: I have completed recording 115 sentences in the midst of the transcription dilemma.
June 16 — update
OK, I think I have an idea. I will have to wait a day or so to see whether it works. Well, something really helpful and timely happened today. I have been working on an Android app to run the model. Unfortunately, it kept crashing until today, when I finally fixed it. The bug was really, really, really small. In fact, it was embarrassing when I found out what it was: I had to use a capitalised letter (** duh, are you serious! **) when specifying a search name tag. Now how did I miss that! The model that I developed for my major project runs absolutely fine in the app. In fact, it turns out to have better accuracy there than it did on my PC.
feeling joyful… yayy!
But this was not the idea that I was talking about at the beginning of the paragraph. (** giggles **)
Ok I was not joking.
Well, the idea would have been more beneficial if I had had it in mind at the start of the phonetic transcription, but like anybody else's, my brain won’t work when I need it.
Anyway, the idea was to use the find-and-replace option of Notepad++ to batch-edit all the words. But there is a catch: to edit the words this way, I have to set aside about 3 to 4 days. And if the experiment fails, I will need more days overall than if I had simply kept following the steps I have used up until now.
I decided to try the experiment anyway, because if it succeeds I will save more days than I would lose if it fails, and I would probably finish a couple of days ahead of schedule.
I think I make some sense!?!
So here is a brief outline of what I actually did in my so-called experiment.
- The a.txt and b.txt files initially contain the raw Malayalam words that need to be represented phonetically.
- Going in alphabetical order, I find each character or sound and replace all occurrences with its phonetic representation inside the b.txt file (e.g. after finding കാ and replacing it with KA, the words കാലം, കാപ്പി, കാറ്റ് become KA ലം, KA പ്പി, KA റ്റ് and so on). This saves a lot of time compared to editing each word line by line, which is what I was doing until now.
- Once all the replacing is done, the file will contain only English characters (phones), e.g. KA L1AM, KA PPI, KA T. But this is not what the file should look like. It should pair each word with its phones: കാലം KA L1AM, കാപ്പി KA PPI, കാറ്റ് KA T and so on.
- To get it that way, I simply have to join the file with the Malayalam words (a.txt) and the new file with just the phonetic representation (b.txt). This is where the paste command in bash came in, followed by find and replace again to get rid of the newline. (Thanks to Aboobacker MK for the quick reply with the bash command.)
paste -d " " a.txt b.txt | tee out.txt
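For anyone who wants to try the same trick without Notepad++, the steps above can be roughly sketched as a small shell script. Treat it as a sketch under assumptions, not what I actually ran: sed stands in for the Notepad++ find and replace, the phones_demo directory is a made-up name, and the four substitutions are only the ones implied by the three example words above, not the full phone set.

```shell
set -eu

# Hypothetical scratch directory so no real a.txt / b.txt gets overwritten.
mkdir -p phones_demo

# a.txt: raw Malayalam words, one per line (the three examples from this post).
printf '%s\n' 'കാലം' 'കാപ്പി' 'കാറ്റ്' > phones_demo/a.txt

# b.txt: the same words after a series of "replace all" passes.
# Each -e is one find-and-replace; they run in the order given.
# These four rules are illustrative only, taken from the examples above.
sed \
  -e 's/കാ/KA /g' \
  -e 's/ലം/L1AM/g' \
  -e 's/പ്പി/PPI/g' \
  -e 's/റ്റ്/T/g' \
  phones_demo/a.txt > phones_demo/b.txt

# Join the two files line by line, separated by a single space.
paste -d ' ' phones_demo/a.txt phones_demo/b.txt > phones_demo/out.txt

cat phones_demo/out.txt
# കാലം KA L1AM
# കാപ്പി KA PPI
# കാറ്റ് KA T
```

One thing to watch with this approach: when one pattern is the beginning of another (say a bare consonant and the same consonant with a vowel sign attached), the longer pattern has to be replaced first, otherwise the shorter rule eats part of it.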
puts "until then ciao!"