NeuralSpace Release Notes: What’s new in Version 1.4.0?

Felix Laumann
Published in NeuralSpace
3 min read · Nov 14, 2022

We are constantly developing the NeuralSpace Platform to give our users more language AI capabilities. We’ve packaged the newest of these together in our latest release, which we have called Nico Robin.

With two brand new speech AI services, we have delivered some major Platform updates. Scroll down to catch up!

Speaker Identification

With this release, we launch our Speaker Identification service. This service automatically identifies the number of speakers in an audio file and defines which parts of the audio belong to which speaker.

This is a perfect tool to transcribe meetings, videos with multiple speakers, phone calls, etc.
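To make the idea concrete, here is a minimal sketch of how a client might summarize speaker-labelled segments once it has them. The segment format (speaker label, start, end in seconds) and the function name are assumptions made for illustration; they are not the documented response format of the service.

```python
from collections import defaultdict

# Hypothetical diarization output: (speaker_label, start_seconds, end_seconds).
# This shape is assumed for illustration only.
segments = [
    ("speaker_0", 0.0, 4.5),
    ("speaker_1", 4.5, 9.0),
    ("speaker_0", 9.0, 12.0),
]

def talk_time_per_speaker(segments):
    """Sum the duration of all segments attributed to each speaker."""
    totals = defaultdict(float)
    for speaker, start, end in segments:
        totals[speaker] += end - start
    return dict(totals)

totals = talk_time_per_speaker(segments)
num_speakers = len(totals)  # number of distinct speakers found in the file
```

In a meeting transcript, the same grouping could be used to prefix each transcribed passage with the speaker it belongs to.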

Voice Extraction

Our new Voice Extraction service separates a speaker’s voice from the background noise in an audio file. This helps to improve the quality of transcriptions, especially when there is a lot of background noise.

This is a perfect tool for auto-overdubbing videos. With a service like this, you never have to worry about the background noise present in the video. You can extract the voice and the background audio separately and then overlay your overdub audio on the background audio.
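The overlay step can be sketched in a few lines. This is only an illustration of mixing two tracks: it assumes both tracks are mono lists of float samples in the range -1.0 to 1.0 at the same sample rate, and the function name `overlay` and the `overdub_gain` parameter are made up for this example. It is not part of the Voice Extraction service itself.

```python
def overlay(background, overdub, overdub_gain=1.0):
    """Mix two mono tracks of float samples in [-1.0, 1.0], sample by sample."""
    length = max(len(background), len(overdub))
    mixed = []
    for i in range(length):
        # Treat the shorter track as silence once it runs out.
        b = background[i] if i < len(background) else 0.0
        o = overdub[i] if i < len(overdub) else 0.0
        # Clamp so the sum stays inside the valid sample range.
        mixed.append(max(-1.0, min(1.0, b + overdub_gain * o)))
    return mixed

background = [0.1, 0.2, -0.1, 0.0]   # extracted background audio (toy values)
overdub = [0.5, 0.9, -0.95]          # new voice track (toy values)
mixed = overlay(background, overdub)
```

Hard clamping is the simplest way to avoid out-of-range samples; a real pipeline would more likely lower the gain of one track instead.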

File management system

Files can now be shared across different services. For example, a file you upload for Transcription can also be used for Voice Extraction through the unique file ID that you receive when you upload it.

Analytics page

A new analytics page is now available for Language Understanding and Entity Recognition. It tells you which intents and entities are performing well and which ones aren’t. Along with a detailed classification report, you get an interactive confusion matrix for both services, which shows how often the trained model confuses each intent or entity with another.

Model Analytics Page on the NeuralSpace Platform

Intent Confusion Matrix on the NeuralSpace Platform
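For readers less familiar with these metrics, the following is a small sketch of how a confusion matrix and per-intent precision and recall are computed from true and predicted labels. The intent names and data are invented for the example, and this is a generic calculation, not the Platform's own implementation.

```python
from collections import Counter

def confusion_matrix(true_labels, predicted_labels):
    """Rows are true labels, columns are predicted labels."""
    labels = sorted(set(true_labels) | set(predicted_labels))
    counts = Counter(zip(true_labels, predicted_labels))
    matrix = [[counts[(t, p)] for p in labels] for t in labels]
    return labels, matrix

def precision_recall(labels, matrix):
    """Per-label precision and recall derived from the confusion matrix."""
    report = {}
    for i, label in enumerate(labels):
        tp = matrix[i][i]
        predicted = sum(row[i] for row in matrix)  # column total
        actual = sum(matrix[i])                    # row total
        report[label] = {
            "precision": tp / predicted if predicted else 0.0,
            "recall": tp / actual if actual else 0.0,
        }
    return report

# Toy data: one "book_flight" utterance is misclassified as "cancel".
true_intents = ["book_flight", "book_flight", "cancel", "cancel", "greet"]
predicted_intents = ["book_flight", "cancel", "cancel", "cancel", "greet"]
labels, matrix = confusion_matrix(true_intents, predicted_intents)
report = precision_recall(labels, matrix)
```

An off-diagonal count, such as the `book_flight` example predicted as `cancel` here, points to the pairs of intents that need more or cleaner training examples.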

Webhook Concept

You can now register a webhook on the Platform and get live updates on all asynchronous tasks, for example the status of a model while it is training, or the status of a file during transcription, voice extraction, or speaker identification. You can also get the status of batch TTS requests. This way, you never have to poll the status API again and again in order to trigger a subsequent task based on its status.
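On the receiving side, a webhook handler boils down to parsing the request body and deciding what to do next. The sketch below assumes a JSON payload with `task`, `id`, and `status` fields and the status values `completed` and `failed`; these names are hypothetical and chosen for illustration, so check the Documentation for the actual payload.

```python
import json

def next_action(raw_body):
    """Decide the follow-up step from a webhook payload (hypothetical schema)."""
    event = json.loads(raw_body)
    task = event.get("task")      # e.g. "transcription" (assumed field)
    task_id = event.get("id")     # assumed field
    status = event.get("status")  # assumed field and values
    if status == "completed":
        return f"fetch result of {task} {task_id}"
    if status == "failed":
        return f"report failure of {task} {task_id}"
    # Any other status: nothing to do until the next update arrives.
    return "wait"

example = '{"task": "transcription", "id": "abc", "status": "completed"}'
action = next_action(example)
```

A function like this would be called from whatever HTTP endpoint you register as the webhook URL, replacing a polling loop against the status API.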

Language Support

Speech-to-Text

We have expanded our language support to 74 languages spoken across Asia, Africa, the Middle East and Europe. More languages and domains will be added soon. Feel free to reach out to us if you have any preferences.

Text-to-Speech

Our Text-to-Speech service covers over 40 languages and more than 200 AI voices! More languages and voices will be added soon. Feel free to reach out to us if you have any preferences.

What’s Next?

  • In the next release, we aim to introduce a brand new AutoNLP pipeline that runs faster and delivers better results.
  • Speech-to-Text models will be able to adapt to your custom vocabulary using only textual data.
  • Text-to-Speech models will be able to accurately clone celebrity voices.
  • Speaker identification using speaker samples: you will be able to upload 30s audio files for specific speakers to identify them by name.

Feel free to reach out to us or book a call directly if you’d like to talk with our team in more detail.

If you haven’t yet, sign up on the NeuralSpace Platform to try it out yourself! Get started with $200 worth of credits.

Be sure to check out our Documentation to read more about the NeuralSpace Platform and its different services.

Happy NLP!

