Getting Started with Speech-to-Text

Build a client that converts speech to text in under 5 minutes.

Houndify lets developers use the platform purely for its speech-to-text capabilities through the Speech to Text Only domain. In this short tutorial, we will build a client that uses this domain.


When should you use Speech to Text?

Use the “Speech to Text Only” domain if you only need to convert speech to text.

Before using this domain, we encourage developers to explore the public domains as well as the custom commands feature of Houndify. By enabling domains that understand the meaning of a query, you benefit from Houndify’s Speech-to-Meaning technology, which improves both speed and accuracy.


Difference between Regular Domains & Speech to Text Only Domain

Generally, Houndify domains are used to provide a response to a voice or text query, along with a transcription.

The Speech to Text Only domain is different: it returns only the formatted transcription of the query, not a response.


Creating a Client that uses Speech-to-Text

  1. Click on the New Client button from your dashboard.
  2. Give your client a unique name. We’ll call ours “Speech-to-Text Test Client”.
  3. Click Save & Continue.
  4. On the Domain Selection page, search for the “Speech to Text Only Domain” and enable it. Do not enable any other domains.
  5. Click Save & Continue.

Now, let’s test out this new client using the Try API.

Testing Speech to Text Capabilities

  1. Click Try the Houndify API from the sidebar.
  2. Click the microphone icon and speak a phrase. Let’s try “What’s the weather in San Francisco?”
  3. Check the JSON response that is returned. In particular, AllResults will contain the properties RawTranscription and FormattedTranscription, which hold the transcription of the voice input that we just entered.
[Figure: Transcription returned from the Speech to Text Only domain]
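
To use the transcription in your own code, you can read the same two properties out of the JSON response. The following is a minimal Python sketch, not an official client: the sample response is hypothetical and heavily trimmed, and the helper name extract_transcriptions is ours. Only the names AllResults, RawTranscription and FormattedTranscription come from the response described above. Because the exact shape of AllResults is an assumption here, the helper accepts either a list of result objects or a single object.

```python
import json

# Hypothetical, trimmed-down response for illustration only.
# A real response contains more fields than shown here.
SAMPLE_RESPONSE = """
{
  "AllResults": [
    {
      "RawTranscription": "what's the weather in san francisco",
      "FormattedTranscription": "What's the weather in San Francisco?"
    }
  ]
}
"""


def extract_transcriptions(response_text):
    """Return (raw, formatted) transcriptions, or (None, None) if absent."""
    data = json.loads(response_text)
    results = data.get("AllResults")

    # Assumption: AllResults is a list of result objects.
    # A single object is also accepted in case the shape differs.
    if isinstance(results, list):
        result = results[0] if results else None
    else:
        result = results

    if not isinstance(result, dict):
        return None, None
    return result.get("RawTranscription"), result.get("FormattedTranscription")


raw, formatted = extract_transcriptions(SAMPLE_RESPONSE)
print("Raw:", raw)
print("Formatted:", formatted)
```

Returning None instead of raising keeps the caller simple when a response carries no transcription at all.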

Use Cases and Considerations

  1. Use the Speech to Text Only domain when no available domain meets your use case.
  2. If you know that a query can be mapped to a specific domain, enabling that domain will lead to more accurate results. If a domain captures the query, you can access the transcription within the JSON response.
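
When no available domain fits, your application has to interpret the transcription itself. As a rough illustration of what that can look like, here is a small Python sketch that routes a transcription by keyword. Everything in it is hypothetical: the intent names, the keyword sets and the classify function are ours and are not part of Houndify. Simple keyword matching like this is far less capable than a domain that understands the query, which is why enabling a matching domain is preferable when one exists.

```python
import re

# Hypothetical keyword rules; intent names and keywords are illustrative only.
INTENT_KEYWORDS = {
    "weather": {"weather", "forecast", "temperature"},
    "timer": {"timer", "countdown"},
}


def classify(transcription):
    """Return the first intent whose keywords appear in the text, else "unknown"."""
    # Lowercase the text and split it into words, ignoring punctuation.
    words = set(re.findall(r"[a-z']+", transcription.lower()))
    for intent, keywords in INTENT_KEYWORDS.items():
        if words & keywords:
            return intent
    return "unknown"


print(classify("What's the weather in San Francisco?"))
```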

Next Steps

The Speech to Text Only domain can help you build complex voice-enabled applications by analyzing the transcribed text in your own code.

If you have any questions about using the Speech to Text Only domain, contact us. If you don’t have a Houndify account, you can sign up for free.
