5x Faster Voice Cloning | Tortoise-TTS-Fast | Tutorial

Martin Thissen
7 min read · Apr 20, 2023

In this article I will show you how to create high-quality speech with a cloned voice.

In my previous articles, I showed how to use the Tortoise-TTS library to generate speech with a cloned voice. However, many people told me that while the voice quality is good, inference (the actual generation of the speech) is far too slow. I have good news: there is an improved version of the Tortoise-TTS model that is at least 5x faster. When I tried it myself, I was amazed by the speedup!

If you prefer video, feel free to check out my YouTube video accompanying this article:

Before I show you how to use the improved tortoise-tts-fast library, let’s first understand how this significant improvement was possible.

Main Improvements

According to the author of the tortoise-tts-fast repository, the following features allow for faster inference:

KV Cache

This is the biggest improvement made in the tortoise-tts-fast library, and it builds on a concept most of you are already familiar with: caching. Instead of saving bandwidth (as your browser cache does), the KV cache saves computation at the cost of extra…
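To make the idea more concrete, here is a minimal sketch of a KV cache for a single attention head in plain Python. It is a toy illustration and not the code used in tortoise-tts-fast: the weight matrices `W_Q`, `W_K`, `W_V`, the helper functions, and the example tokens are all made up for this demonstration. Both functions produce the same outputs, but the cached version projects each token to a key and a value only once.

```python
import math

# Toy 2x2 projection matrices (hypothetical values, for illustration only).
W_Q = [[1.0, 0.0], [0.0, 1.0]]
W_K = [[0.5, 0.1], [-0.2, 0.3]]
W_V = [[0.3, -0.4], [0.2, 0.6]]


def matvec(W, x):
    """Multiply matrix W (list of rows) by vector x."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]


def attend(q, keys, values):
    """Scaled dot-product attention of one query over all keys/values."""
    scale = math.sqrt(len(q))
    scores = [sum(a * b for a, b in zip(q, k)) / scale for k in keys]
    m = max(scores)  # subtract the max for a numerically stable softmax
    weights = [math.exp(s - m) for s in scores]
    total = sum(weights)
    dim = len(values[0])
    return [sum(w * v[i] for w, v in zip(weights, values)) / total
            for i in range(dim)]


def decode_without_cache(tokens):
    """Recompute keys and values for the whole prefix at every step."""
    outputs, projections = [], 0
    for t in range(1, len(tokens) + 1):
        prefix = tokens[:t]
        keys = [matvec(W_K, x) for x in prefix]
        values = [matvec(W_V, x) for x in prefix]
        projections += t  # t key/value projections at step t
        q = matvec(W_Q, prefix[-1])
        outputs.append(attend(q, keys, values))
    return outputs, projections


def decode_with_cache(tokens):
    """Project each token once and reuse the stored keys and values."""
    outputs, projections = [], 0
    key_cache, value_cache = [], []  # the "KV cache"; grows by one entry per token
    for x in tokens:
        key_cache.append(matvec(W_K, x))
        value_cache.append(matvec(W_V, x))
        projections += 1  # only the newest token is projected
        q = matvec(W_Q, x)
        outputs.append(attend(q, key_cache, value_cache))
    return outputs, projections


tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.5, -0.5]]
slow_out, slow_proj = decode_without_cache(tokens)
fast_out, fast_proj = decode_with_cache(tokens)
print(slow_proj, fast_proj)
```

For a sequence of n tokens, the version without a cache performs n(n+1)/2 key/value projections, while the cached version performs n. The gap widens as the generated sequence gets longer, which is why caching matters for autoregressive models that emit one token at a time.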
