5x Faster Voice Cloning | Tortoise-TTS-Fast | Tutorial

Martin Thissen
7 min read · Apr 20, 2023

In this article I will show you how to create high-quality speech with a cloned voice.

In my previous articles I showed how to use the Tortoise-TTS library to generate speech with a cloned voice. However, many readers told me that while the voice quality is good, inference (that is, generating the speech) is far too slow. I have good news: there is an improved version of Tortoise-TTS that is at least 5x faster. When I tried it myself, I was amazed by the speedup!

If you prefer video, feel free to check out my YouTube video accompanying this article.

Before I show you how to use the improved tortoise-tts-fast library, let’s first understand how this significant improvement was possible.

Main Improvements

The author of the tortoise-tts-fast repository states that the following features allow for faster inference:

KV Cache

This is the biggest improvement made in the tortoise-tts-fast library and is a concept most of you are already familiar with: caching. Instead of saving bandwidth (as with your browser cache), by using the KV cache, we save computation at the cost of extra…
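To make the caching idea concrete, here is a minimal sketch of a KV cache in a toy single-head attention loop, written in plain Python. It is not the tortoise-tts-fast implementation: the helper names (`project`, `attend`, `decode_without_cache`, `decode_with_cache`), the tiny weight matrices, and the fixed input vectors are all assumptions made for illustration. In a real autoregressive model each new input would come from the previous step's output, but the bookkeeping is the same: without a cache, the keys and values of every earlier token are recomputed at each step, while with a cache they are computed once and kept in memory.

```python
import math

def project(x, w):
    """Multiply vector x by matrix w (given as a list of rows)."""
    return [sum(xi * wi for xi, wi in zip(x, col)) for col in zip(*w)]

def attend(q, keys, values):
    """Scaled dot-product attention of one query over all keys and values."""
    scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(len(q)) for k in keys]
    top = max(scores)  # subtract the max for a numerically stable softmax
    weights = [math.exp(s - top) for s in scores]
    total = sum(weights)
    dim = len(values[0])
    return [sum(w * v[i] for w, v in zip(weights, values)) / total for i in range(dim)]

def decode_without_cache(tokens, wq, wk, wv):
    """Recompute keys and values for the whole prefix at every step."""
    outputs, kv_projections = [], 0
    for t in range(len(tokens)):
        prefix = tokens[: t + 1]
        keys = [project(x, wk) for x in prefix]
        values = [project(x, wv) for x in prefix]
        kv_projections += 2 * len(prefix)
        outputs.append(attend(project(tokens[t], wq), keys, values))
    return outputs, kv_projections

def decode_with_cache(tokens, wq, wk, wv):
    """Compute each token's key and value once and reuse them afterwards."""
    outputs, kv_projections = [], 0
    key_cache, value_cache = [], []  # the KV cache: grows by one entry per step
    for x in tokens:
        key_cache.append(project(x, wk))
        value_cache.append(project(x, wv))
        kv_projections += 2
        outputs.append(attend(project(x, wq), key_cache, value_cache))
    return outputs, kv_projections

# Hypothetical toy data: four 2-dimensional "token" vectors and 2x2 weights.
tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.5, -0.5]]
wq = [[1.0, 0.0], [0.0, 1.0]]
wk = [[0.5, 0.1], [0.2, 0.4]]
wv = [[1.0, 2.0], [3.0, 4.0]]

plain_out, plain_cost = decode_without_cache(tokens, wq, wk, wv)
cached_out, cached_cost = decode_with_cache(tokens, wq, wk, wv)

print("key/value projections without cache:", plain_cost)  # 20
print("key/value projections with cache:   ", cached_cost)  # 8
```

Both loops produce the same attention outputs. The difference is that the uncached version performs a number of key/value projections that grows quadratically with sequence length, while the cached version grows linearly and pays for it by holding the cached vectors in memory.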


