.NET Speech-To-Text on the Edge
Modern audio transcription is moving off the cloud and back to the device. For developers, this means no round-trip data transfers, which reduces bandwidth usage, compute costs, and latency. For users, this means better privacy and security.
Historically, there have been several roadblocks to getting speech-to-text to the edge: language model size, platform support, and cumbersome APIs are just some of the obstacles. Fortunately, with the recently released Picovoice Leopard Speech-to-Text SDK for cross-platform .NET, the road is clear and it’s time to get coding!
1- Install Leopard NuGet Package
Install Leopard with the NuGet package browser or with the nifty dotnet CLI:
dotnet add package Leopard
2- Get a Picovoice AccessKey
Sign up for Picovoice Console using an email or your GitHub account and grab your free AccessKey. Picovoice’s Free Plan includes free transcription with Leopard. No credit card required.
3- Code
In your .NET project, create an instance of Leopard:
using Pv;

string accessKey = "${ACCESS_KEY}"; // AccessKey obtained from Picovoice Console
Leopard leopard = Leopard.Create(accessKey);
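In a real project you may prefer not to hard-code the AccessKey, and you may want the engine's resources released as soon as you are done with it. The following is a minimal sketch of one way to do that, not the SDK's prescribed setup. It assumes that the `Leopard` class implements `IDisposable` (so it can be wrapped in a `using` block), and the environment variable name `PICOVOICE_ACCESS_KEY` is a hypothetical choice made for this example.

```csharp
using System;
using Pv;

class Program
{
    static void Main()
    {
        // PICOVOICE_ACCESS_KEY is a hypothetical variable name chosen for this sketch.
        string accessKey = Environment.GetEnvironmentVariable("PICOVOICE_ACCESS_KEY");
        if (string.IsNullOrEmpty(accessKey))
        {
            Console.Error.WriteLine("Set PICOVOICE_ACCESS_KEY before running.");
            return;
        }

        // Assumption: Leopard implements IDisposable, so the using block
        // releases the engine when execution leaves the block.
        using (Leopard leopard = Leopard.Create(accessKey))
        {
            // Transcription calls go here.
        }
    }
}
```

Reading the key from the environment keeps it out of source control, and the `using` block avoids waiting on the garbage collector to release the engine.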
You can pass an audio file to Leopard, or record audio and pass in the raw data. In this example, we pass in an array of 16-bit audio samples from our microphone:
short[] audio = //.. audio obtained from microphone
string transcript = leopard.Process(audio);
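The microphone step above is left open, and many audio capture APIs hand back raw bytes rather than a `short[]`. As a rough sketch of the conversion, assuming the capture buffer holds 16-bit little-endian PCM, a helper along these lines could bridge the gap. `PcmConverter` and `ToInt16Samples` are hypothetical names introduced for illustration and are not part of the Leopard SDK.

```csharp
using System;

public static class PcmConverter
{
    // Converts raw 16-bit little-endian PCM bytes into signed 16-bit samples.
    // Assumption: two bytes per sample, low byte first.
    public static short[] ToInt16Samples(byte[] rawBytes)
    {
        if (rawBytes == null)
        {
            throw new ArgumentNullException(nameof(rawBytes));
        }
        if (rawBytes.Length % 2 != 0)
        {
            throw new ArgumentException("Expected an even number of bytes.", nameof(rawBytes));
        }

        short[] samples = new short[rawBytes.Length / 2];
        for (int i = 0; i < samples.Length; i++)
        {
            // Combine low and high bytes; unchecked lets values above 0x7FFF
            // wrap into the negative range instead of throwing.
            samples[i] = unchecked((short)(rawBytes[2 * i] | (rawBytes[2 * i + 1] << 8)));
        }
        return samples;
    }
}
```

With a helper like this, a byte buffer from the capture API (called `buffer` here for illustration) can be converted with `short[] audio = PcmConverter.ToInt16Samples(buffer);` before it is handed to `leopard.Process(audio)`. Building the samples byte by byte, rather than using `BitConverter`, keeps the result independent of the host machine's endianness.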
Here is a basic .NET console app in action:
Check out the full source code here!