Leading the way in Tagalog Speech Recognition

Ayushman Dash
Published in NeuralSpace
3 min read · May 20, 2024

We’re proud to share a breakthrough in Tagalog speech-to-text (STT) technology that sets a new standard for accuracy. Through our strategic partnership with ABS-CBN, a powerhouse in Philippine media, we have built the most accurate Tagalog STT model, especially for subtitling.

Our model outperforms Google, Azure, and OpenAI, with an 81.55% higher accuracy than Google.

NeuralSpace and ABS-CBN’s Partnership

In the Philippines, with its significant Tagalog-speaking population, the demand for sophisticated and accurate STT services is particularly strong. Addressing this need, ABS-CBN, one of the leading media and entertainment organizations in the country, partnered with NeuralSpace. Their goal was ambitious: to implement NeuralSpace’s LocAI solution for subtitling in Tagalog, leveraging an advanced STT model to achieve the highest accuracy ever seen in the industry.

The collaboration between ABS-CBN and NeuralSpace was strategic, aimed at integrating the most accurate Tagalog STT model into ABS-CBN’s broadcasting and content services.

By incorporating NeuralSpace’s solution, ABS-CBN shifted from a highly labor-intensive subtitling operation to an automated AI workflow with 2x faster turnaround time and 50% projected cost savings.

Read more about our partnership with ABS-CBN.

Self-Learning AI Mechanism

At the heart of this initiative was NeuralSpace’s self-learning AI mechanism, which was integrated into the LocAI platform. This mechanism enabled the STT model to continuously learn and adapt from real-world usage. As ABS-CBN utilized the system for live and pre-recorded broadcasts, the AI was able to refine its accuracy by learning from corrections and user feedback, adapting to the nuances of spoken Tagalog, including local dialects and informal speech.

This dynamic learning process was instrumental in enabling the STT model to improve over a short span of two months, demonstrating significant advancements in speech recognition technology. The self-learning capability of NeuralSpace’s AI not only set new standards in accuracy but also showcased the potential of adaptive technologies in enhancing media production and distribution.

Setting a New Benchmark in Tagalog STT

ABS-CBN’s proprietary dataset was pivotal for testing, and reaching optimal results required approximately 100 hours of meticulous training.

Benchmarking Methodology

We used the most common method to test the accuracy of STT systems, which is Word Error Rate (WER). This metric determines the percentage of words in the STT output that differ from the actual, 100% accurate “ground truth” transcription. The WER is calculated by dividing the total number of errors, which includes substitutions, deletions, and insertions, by the total number of words in the ground truth transcription.

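WER Calculation: WER = (S + D + I) / N, where S, D, and I are the numbers of substitutions, deletions, and insertions, and N is the total number of words in the ground truth transcription.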

A lower WER indicates a higher accuracy of the STT system.
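For readers who want to try the metric themselves, here is a minimal sketch of a WER computation in Python. It is an illustration of the definition above, not the evaluation code used in our benchmark: the function name `word_error_rate`, the lowercasing and whitespace tokenization, and the sample Tagalog sentences are all assumptions made for this example.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Return (substitutions + deletions + insertions) / reference word count."""
    # Assumed normalization: lowercase and split on whitespace.
    # Real benchmarks often also strip punctuation or normalize numbers.
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference transcript must contain at least one word")

    # Word-level edit distance, computed one row at a time.
    # prev[j] is the distance between the reference words seen so far
    # and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,         # deletion: reference word missing from output
                curr[j - 1] + 1,     # insertion: extra word in output
                prev[j - 1] + cost,  # substitution, or a match when cost is 0
            )
        prev = curr
    return prev[-1] / len(ref)


# Hypothetical example: one of five reference words is transcribed incorrectly.
reference = "magandang umaga sa inyong lahat"
hypothesis = "magandang umaga sa inyo lahat"
wer = word_error_rate(reference, hypothesis)
print(f"WER: {wer:.2%}")
```

Because insertions count as errors, WER can exceed 100% when the STT output contains many more words than the ground truth.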

NeuralSpace’s model achieves the lowest WER (highest accuracy), outperforming three leading STT providers, with an 81.55% higher accuracy than Google.

Conclusion

The collaboration between ABS-CBN and NeuralSpace has not only set a new benchmark in Tagalog speech recognition but also underscored NeuralSpace’s commitment to developing superior AI solutions tailored to local languages. This initiative reflects a broader strategy to bridge language barriers and enhance technological accessibility on a global scale.

NeuralSpace’s prowess is evident not only in its success with Tagalog but also in its highly acclaimed Arabic STT model, which is recognized as the most accurate in the industry.

Get in touch to learn more about NeuralSpace or visit our website.
