Time stretching algorithms and audio warping techniques used in modern music production
Hi! Today’s article is about time stretching algorithms and several methods of aligning recorded audio takes to the tempo grid. Musicians are not machines: they can’t play the same part twice with identical intonation, attack, and speed, even when playing to a metronome.
But sometimes a production needs tight synchronization between the notes played by different musicians to sound top-quality. Mixing engineers therefore often have to fix instrumental and vocal timing and align it to the beats of the bar so that it syncs with the song. In this article, I’ll briefly review the possible ways to edit tracks and align them to the grid. I’m a guitarist, so all examples are based on electric guitar recordings 😊.
A bit of history
Before talking about modern algorithms, let’s spend a minute on how this was done in the analog world. With analog recording media like tape and turntables, if you wanted to speed up or slow down already-recorded audio, you’d vary the playback speed. But along with the change in speed comes a change in pitch: the familiar “Chipmunk effect”, where the tone of the audio gets either squeaky and high-pitched (speed-up) or deep and spooky (slow-down). Even with digital media, if you simply speed up or slow down the playback (i.e. change the sample rate), you’ll get the same artifacts.
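To make the “Chipmunk effect” concrete, here is a minimal sketch in plain Python. It is not how any DAW does it; the function names, the 8 kHz sample rate and the zero-crossing frequency estimate are my own assumptions for illustration. It plays a 220 Hz tone back at double speed by naive resampling and shows that the pitch moves together with the length.

```python
import math

def sine(freq_hz, duration_s, sample_rate=8000):
    """Generate a mono sine wave as a list of floats."""
    n = int(duration_s * sample_rate)
    return [math.sin(2 * math.pi * freq_hz * i / sample_rate) for i in range(n)]

def change_speed(samples, speed):
    """Naive speed change: step through the input `speed` samples at a time,
    using linear interpolation. Length shrinks by `speed`, pitch rises by `speed`."""
    out = []
    pos = 0.0
    while pos < len(samples) - 1:
        i = int(pos)
        frac = pos - i
        out.append((1 - frac) * samples[i] + frac * samples[i + 1])
        pos += speed
    return out

def estimate_freq(samples, sample_rate=8000):
    """Rough frequency estimate from the number of upward zero crossings."""
    crossings = sum(1 for a, b in zip(samples, samples[1:]) if a < 0 <= b)
    return crossings * sample_rate / len(samples)

tone = sine(220.0, 1.0)
fast = change_speed(tone, 2.0)
print(len(tone), len(fast))                      # the fast version is half as long
print(estimate_freq(tone), estimate_freq(fast))  # and roughly an octave higher
```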
Modern time-shifting algorithms don’t change the pitch of the audio, but they do some additional processing under the hood. First, the audio must be broken up into individual notes and syllables. This is done by transient detection and, in some cases, also by frequency analysis (for material that has separate notes without strong transients). Switching between algorithms reveals individual markers for the newly separated notes, which are embedded in the audio file’s header. If you want a more detailed description of what’s going on under the hood, please see the publications and videos listed at the end of this article.
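The transient detection step can be sketched roughly like this. This is a deliberately simple energy-based detector, not the method used by Cubase or any commercial algorithm; the frame size, the energy ratio and the noise floor are assumed values that you would tune by ear.

```python
import math

def detect_onsets(samples, sample_rate, frame=256, ratio=4.0, floor=1e-4):
    """Return onset times (seconds) where the energy of a frame jumps to more
    than `ratio` times the energy of the previous frame.
    All thresholds are illustrative assumptions."""
    onsets = []
    prev_energy = 0.0
    for start in range(0, len(samples) - frame + 1, frame):
        energy = sum(s * s for s in samples[start:start + frame]) / frame
        if energy > floor and energy > ratio * prev_energy:
            onsets.append(start / sample_rate)
        prev_energy = energy
    return onsets

def pluck(freq_hz, n_samples, sample_rate):
    """Toy plucked note: a sine wave with an exponential decay."""
    return [math.exp(-6.0 * i / sample_rate)
            * math.sin(2 * math.pi * freq_hz * i / sample_rate)
            for i in range(n_samples)]

sr = 8000
signal = [0.0] * 2048 + pluck(196.0, 4096, sr) + pluck(247.0, 4096, sr)
print(detect_onsets(signal, sr))  # one onset per pluck
```

A real detector would also look at the spectrum, which is what helps with legato material that has no strong transients.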
Time-Scale Modification Overview
Time-scale modification (TSM) is the task of speeding up or slowing down an audio signal’s playback speed without changing its pitch. In digital music production, TSM has become an indispensable tool, which is nowadays integrated in a wide range of music production software. Music signals are diverse — they comprise harmonic, percussive, and transient components, among others. Because of this wide range of acoustic and musical characteristics, there is no single TSM method that can cope with all kinds of audio signals equally well.
Therefore, DAWs ship several algorithms, each suited to a particular kind of material. Let’s see which algorithms Cubase Pro 9.5 provides.
Elastique Pro is the all-around solution for time-stretching and pitch-shifting. The Pro version is suited for professional use: it avoids the usual phasing artifacts and thus provides sharp transients and crystal clear vocals. Elastique Pro is made for the best audio quality without formant preservation, while Elastique Pro Formant adds formant preservation.
Elastique Efficient is suited for complex polyphonic signals like complete mixes. This algorithm has improved performance at the price of a slightly decreased quality level. Both versions can be used on monophonic and polyphonic audio. All algorithms work in 3 modes:
- Time — favors timing accuracy over pitch accuracy.
- Pitch — favors pitch accuracy over timing accuracy.
- Tape — locks the pitch shift to the time stretch as if playing back a tape with varying speed. If you stretch the audio material, the pitch decreases automatically. This variant has no effect if you use it with event transpose or the transpose track.
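For the Tape mode, the link between stretch and pitch is simple arithmetic. A small sketch (the function name is mine, and I assume the stretch factor is expressed as new length divided by old length):

```python
import math

def tape_pitch_shift_semitones(stretch_factor):
    """Pitch change in semitones when audio is stretched tape-style.
    stretch_factor > 1 means the audio gets longer (slower playback),
    so the pitch drops; stretch_factor < 1 raises the pitch."""
    return -12.0 * math.log2(stretch_factor)

print(tape_pitch_shift_semitones(2.0))   # twice as long: one octave down
print(tape_pitch_shift_semitones(0.5))   # half as long: one octave up
print(round(tape_pitch_shift_semitones(1.10), 2))
```

So even a 10% stretch in this mode already detunes the take by more than a semitone and a half.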
Cubase also offers its own algorithms under the name “Standard”. The following overview is taken from the official Cubase documentation:
- Standard Drums — for percussive sounds. This mode does not change the timing of your audio. If you use it with certain tuned percussion instruments, you may experience audible artifacts. In this case, try the Mix mode as an alternative.
- Standard Plucked — for audio with transients and a relatively stable spectral sound character like plucked instruments.
- Standard Pads — for pitched audio with slower rhythm and a stable spectral sound character. This minimizes sound artifacts, but the rhythmic accuracy is not preserved.
- Standard Vocals — for slower signals with transients and a prominent tonal character like vocals.
- Standard Mix — for pitched material with a less homogenous sound character. This mode preserves the rhythm and minimizes the artifacts.
- Standard Custom — allows you to set the time stretching parameters manually.
- Standard Solo — for monophonic material like solo woodwind/brass instruments or solo vocals, monophonic synths or string instruments that do not play harmonies. This mode preserves the timbre of the audio.
If you select the Standard — Custom mode, a dialog opens where you can manually adjust the parameters that govern the sound quality of the time stretching:
- Grain size — allows you to determine the size of the grains in which the standard time-stretching algorithm splits the audio. Low grain size values lead to good results for material that has many transients.
- Overlap — this is the percentage of the whole grain that will overlap with other grains. Use higher values for material with a stable sound character.
- Variance — this is a percentage of the whole length of the grains, and sets a variation in positioning so that the overlapping area sounds smooth. A variance setting of 0 produces a sound akin to the time stretching used in early samplers, whereas higher settings produce more rhythmic smearing but fewer audio artifacts.
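To get a feeling for what these three parameters do, here is a toy granular overlap-add stretcher in plain Python. It is only a sketch under my own assumptions and certainly not Steinberg’s implementation: grains are Hann-windowed, `overlap` and `variance` are fractions of the grain length, and the random jitter is my interpretation of the variance setting.

```python
import math
import random

def granular_stretch(samples, stretch, grain=400, overlap=0.5, variance=0.0, seed=0):
    """Toy overlap-add time stretch.
    grain    - grain size in samples
    overlap  - fraction of a grain shared with its neighbour (0 <= overlap < 1)
    variance - random offset of each grain's read position, as a fraction of grain
    """
    if len(samples) < grain:
        raise ValueError("input must be at least one grain long")
    rng = random.Random(seed)
    hop_out = max(1, int(grain * (1.0 - overlap)))  # grain spacing in the output
    hop_in = hop_out / stretch                      # grain spacing in the input
    window = [0.5 - 0.5 * math.cos(2 * math.pi * i / grain) for i in range(grain)]
    out_len = int(len(samples) * stretch)
    out = [0.0] * (out_len + grain)
    norm = [0.0] * (out_len + grain)
    k = 0
    while k * hop_out < out_len:
        jitter = int(rng.uniform(-variance, variance) * grain)
        src = int(k * hop_in) + jitter
        src = min(max(src, 0), len(samples) - grain)  # keep the grain inside the input
        dst = k * hop_out
        for i in range(grain):
            out[dst + i] += samples[src + i] * window[i]
            norm[dst + i] += window[i]
        k += 1
    # Divide by the summed windows so the output level stays constant.
    return [o / n if n > 1e-9 else 0.0 for o, n in zip(out[:out_len], norm[:out_len])]

tone = [math.sin(2 * math.pi * 220 * i / 8000) for i in range(4000)]
longer = granular_stretch(tone, 1.5, grain=400, overlap=0.5, variance=0.1)
print(len(tone), len(longer))
```

With a stretch of 1.0 and a variance of 0, the grains line up exactly and the input comes back unchanged; any other setting repeats or skips material between grains, which is where the typical artifacts come from.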
Audio warping techniques in real cases
It’s great that we have such capable algorithms, but let’s look at how we can use them in real life. I’ll cover four techniques:
- Split & Crossfade — without time warping
- Split & Time Stretch — with time warping
- Audio Warp — with time warping
- Slip editing — without time warping
As music producers, we want the best possible sound quality when editing recorded takes in our DAWs. I’ll be using Cubase Pro 9.5 to show how you can edit your audio takes in different ways. These techniques can also be applied in other DAWs like Reaper, Pro Tools, Ableton Live, Logic Pro, etc. All examples are based on a DI guitar recording.
To have the best quality result you need to take into account several things:
- The tempo of the song
- Quality of recorded audio takes
- Difference between the tempo of the recorded takes and the project tempo
- Whether the take is monophonic or polyphonic
- Signal-to-noise ratio
Monophonic — one note at a time (e.g. vocals, or when bassist or guitarist plays one note at a time)
Polyphonic — multiple notes at the same time (e.g. chords or harmonies)
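The tempo difference from the list above translates directly into how hard the algorithm has to work. A tiny sketch (the function name and the example tempos are made up for illustration):

```python
def tempo_stretch_factor(take_bpm, project_bpm):
    """Length factor needed to make a take played at take_bpm fit project_bpm.
    A value above 1 means the take must be lengthened, below 1 shortened."""
    return take_bpm / project_bpm

# A take that drifted to 116 BPM in a 110 BPM project must become ~5% longer.
print(round(tempo_stretch_factor(116, 110), 3))
```

The closer this factor is to 1.0, the less the stretching algorithm has to invent or throw away.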
Recorded takes of an instrument or vocals should be as close to the beat grid as possible to get the best results from warping. Let’s start with a basic guitar sample recorded at 110 BPM. As we can see, I intentionally played this part out of time to give us a starting point for applying these techniques in real life. Almost all of the notes are played a bit faster than the metronome click, so let’s fix that using different techniques.
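If you want to put numbers on “out of time”, you can compare note onsets with the grid. In this sketch the grid is computed for 110 BPM, while the onset times are invented for illustration and are not measured from my take.

```python
def beat_grid(bpm, beats, subdivisions=1):
    """Grid positions in seconds for `beats` beats at `bpm`."""
    step = 60.0 / bpm / subdivisions
    return [i * step for i in range(beats * subdivisions)]

def timing_errors_ms(onsets, grid):
    """Signed distance in milliseconds from each onset to its nearest grid line.
    Negative values mean the note was played early."""
    return [(t - min(grid, key=lambda g: abs(g - t))) * 1000.0 for t in onsets]

grid = beat_grid(110, 4)
onsets = [0.52, 1.07, 1.60]  # hypothetical onsets of a rushed take, in seconds
for err in timing_errors_ms(onsets, grid):
    print(f"{err:+.1f} ms")
```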
Split & Crossfade
I think this is the simplest technique for editing guitars. The main idea is to cut your recorded audio at the beats (manually or automatically, using beat detection in your DAW), snap the slices to the beat grid, and crossfade them to fill any gaps in between. This can be done easily in any DAW, but I’ll be using Cubase for this. This is the best way to fix timing for percussive instruments like drums (e.g. kick, snare, toms).
First of all, create slices just before the note transients using the “Split (Scissors)” tool. The big peaks indicate that a note has started and the guitar pick has hit the string. The low-level noise before each peak is pick noise, so I prefer not to include it in the note transient.
Now you have sliced all the notes and should see a view like this:
After you have moved all the notes to their correct positions, empty space appears between them. This space should be filled with something, because silent pauses easily distract listeners.
You can fill it by changing the length of the selected note (extending its start or end) and crossfading it with the previous one.
The last step is to do this for all notes in the take. Doing this manually can take some time, so you can use automation tools. The final result should look like this.
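What the crossfade itself does to the samples can be sketched as follows. I use an equal-power (sine/cosine) curve here, which is just one common choice; your DAW may default to a different curve, and the function name is mine.

```python
import math

def crossfade(a, b, fade_len):
    """Join slice `a` and slice `b`, overlapping the last `fade_len` samples
    of `a` with the first `fade_len` samples of `b` (equal-power curve)."""
    if fade_len > min(len(a), len(b)):
        raise ValueError("fade is longer than one of the slices")
    mixed = []
    for i in range(fade_len):
        t = (i + 0.5) / fade_len
        gain_out = math.cos(t * math.pi / 2)  # previous note fades out
        gain_in = math.sin(t * math.pi / 2)   # next note fades in
        mixed.append(a[len(a) - fade_len + i] * gain_out + b[i] * gain_in)
    return a[:len(a) - fade_len] + mixed + b[fade_len:]

# Worst case for a hard cut: the signal jumps from +1 to -1 in a single sample.
joined = crossfade([1.0] * 100, [-1.0] * 100, 20)
biggest_step = max(abs(y - x) for x, y in zip(joined, joined[1:]))
print(len(joined), round(biggest_step, 3))
```

The hard cut would jump by 2.0 between two neighbouring samples, which is heard as a click; with the fade, the largest step is a small fraction of that.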
Split & Time Stretch
This concept is similar to the previous one; the main difference is in the last step: you time-warp all slices so that they fill the bars entirely, and crossfade them to prevent clicks.
Let’s assume that you’ve sliced all the notes and placed them at the correct beat positions.
The next step is to warp every note to fill the bar and crossfade them to prevent unwanted noises and clicks. Cubase has a special tool for this: “Close Gaps (Time Stretch)”.
After you click it, Cubase processes all selected slices and closes the gaps by warping your notes. As you can see, every slice has a special marker (red selection) that tells you the slice was warped.
It may be a bit hard to see now, but when we zoom in on the slice boundaries, we can see silence on both ends of the slice and some empty space. In this case, the empty space does no harm to the take, because there is silence on both ends. But to be 100% sure that everything is OK, you can close these gaps with crossfades.
In any case, be aware that gaps are generated between note transients. They won’t be audible in heavy rock or metal mixes, but they can be annoying on a clean electric guitar sound or on acoustic guitars. Personally, I prefer other warping methods over this one.
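Conceptually, closing gaps by time stretching means every slice gets its own stretch factor. Here is a rough sketch of that bookkeeping; it only computes the factors and says nothing about how Cubase actually implements the tool. The slice positions are invented.

```python
def close_gap_factors(slices, end_time):
    """slices: list of (start, length) in seconds, sorted by start and already
    snapped to the grid. Returns the stretch factor each slice needs so that
    it lasts exactly until the next slice starts (or until end_time)."""
    factors = []
    for i, (start, length) in enumerate(slices):
        next_start = slices[i + 1][0] if i + 1 < len(slices) else end_time
        factors.append((next_start - start) / length)
    return factors

# Three hypothetical slices: the first two are too short, the last one fits.
slices = [(0.0, 0.5), (0.6, 0.5), (1.2, 0.6)]
print([round(f, 2) for f in close_gap_factors(slices, end_time=1.8)])
```

A factor of 1.0 means the slice is left alone; the further a factor is from 1.0, the more audible the stretch will be.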
Audio Warp
Here you just need to set warp markers and move them to the correct positions. We start from a raw guitar take without any slicing.
The first step is to open the warp editor in Cubase. When you double-click the recorded take, you’ll see the Sample Editor with many options. Concentrate only on the options highlighted in red. The first one is the algorithm: I prefer “Elastique Pro — Time” for guitars and bass. Next, set the “Threshold” parameter so that only the notes we need are detected and noise is filtered out. You can also set the “Beats” parameter to the max note timing played on the guitar. This helps the algorithm create warp markers without picking up unnecessary notes. The last step here is to create the markers by clicking “Create Warp Markers”. As you can see, an orange button is now highlighted on the “AudioWarp” tab above. This indicates that your markers were created; the next step is to open that tab.
After you have created warp markers, you have two choices: move them manually with the “Free Warp” tool, or automatically by quantizing. I prefer the “Free Warp” tool and moving them manually, because it gives more control and precision in aligning markers to the grid. But in this case, I’ll show automatic marker placement by quantizing. Open the “Quantize Panel”, click the “AudioWarp” button to select the correct mode, and press “Auto”. This moves the markers to their correct positions on the grid. Of course, you can still correct them manually with the “Free Warp” tool if needed.
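In essence, quantizing warp markers snaps each marker to the nearest grid line. A minimal sketch of that idea (this is my simplification, not what the Quantize Panel does internally; the marker times and the 120 BPM tempo are chosen only to keep the numbers round):

```python
def quantize_markers(markers, bpm, subdivisions=1):
    """Snap marker times (seconds) to the nearest grid line.
    subdivisions=1 snaps to beats, 2 to half beats, and so on."""
    step = 60.0 / bpm / subdivisions
    return [round(t / step) * step for t in markers]

markers = [0.02, 0.27, 0.49, 0.77]  # hypothetical detected note positions
print(quantize_markers(markers, bpm=120, subdivisions=2))
```

Note that a marker sitting almost halfway between two grid lines can snap to the wrong one, which is exactly the case where moving it by hand gives a better result.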
After pressing “Auto” and clicking the “Free Warp” button, you will see this view. Note the red selection: every marker that you can move is colored orange and has a small orange triangle. Next to every triangle there is a number that shows how much the audio was warped by the markers. I don’t recommend warping beyond 1.10 or 0.90, because this affects the timbre of the instrument and the notes sound unnatural.
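The same check can be written down as a small calculation. I am assuming here that the warp factor of a segment is its new length divided by its original length; I have not verified that this is exactly how Cubase computes the displayed number, and the marker times are invented.

```python
def warp_factors(original, warped):
    """Stretch factor of each segment between neighbouring markers:
    new segment length divided by original segment length."""
    factors = []
    for i in range(len(original) - 1):
        old_len = original[i + 1] - original[i]
        new_len = warped[i + 1] - warped[i]
        factors.append(new_len / old_len)
    return factors

def out_of_range(factors, low=0.90, high=1.10):
    """Indexes of segments warped more than the recommended limits."""
    return [i for i, f in enumerate(factors) if not low <= f <= high]

original = [0.0, 0.5, 1.0, 1.4]  # marker positions before warping (seconds)
warped = [0.0, 0.5, 1.0, 1.5]    # the same markers after quantizing
factors = warp_factors(original, warped)
print([round(f, 2) for f in factors], out_of_range(factors))
```

Segments flagged this way are good candidates for re-recording or for one of the slicing techniques instead.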
Therefore the final result will be like this.
Slip editing
This is the last technique I want to mention. The main idea is to slice the notes, shift the audio inside each event using the time shift function, and crossfade the gaps. We start from the sliced view.
The next step is to select a slice and move its waveform. Holding CMD+ALT on a Mac activates a special tool that shifts the audio inside the event without any processing, so that you can align it to the grid perfectly. Then create a crossfade before the note to prevent clicks or pops.
The final result should look like this. After that, you can safely bounce the recorded event.
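The difference between slipping and moving an event is easy to show with indexes. In this sketch (my own model of an event as a window onto the source file, not Cubase’s data structure) the event keeps its position and length, and only the part of the source it plays changes.

```python
def slip(source, event_start, event_len, offset):
    """Return the samples an event plays after slipping its content.
    Without slipping, the event plays source[event_start:event_start + event_len].
    A positive offset makes the audio sound later inside the event;
    a negative offset makes it sound earlier. Missing audio becomes silence."""
    first = event_start - offset
    return [source[i] if 0 <= i < len(source) else 0.0
            for i in range(first, first + event_len)]

source = [float(i) for i in range(10)]  # stand-in for the recorded file
print(slip(source, event_start=2, event_len=4, offset=0))
print(slip(source, event_start=2, event_len=4, offset=1))   # content moved later
print(slip(source, event_start=2, event_len=4, offset=-1))  # content moved earlier
```

No samples are stretched or resampled here, which is why this technique leaves the tone of the note untouched.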
Possible artifacts after time modification
Artifacts that may occur when warping:
- Phase modulation
- Transient doubling (when stretching a signal)
- Transient skipping (when compressing a signal)
- Clipping, distorted signal or pops (when there are no crossfades between samples)
The bottom line: these techniques will help you improve the overall quality of your tracks and make your guitars or other instruments sound more professional 😃.
Want to know a bit more?
The most widely used time- and pitch-shifting algorithms in music production are made by the company ZPlane. They offer a range of products, including time-shifting, pitch-shifting, retuning and harmonization applications. These tools are used in almost every DAW, behind a custom UI.
References & Publications
- Understanding Time Shift Algorithms For Music Producers 
- A Review of Time-Scale Modification of Music Signals (MDPI)
- Comparison of Elastique time stretching and pitch shifting algorithms 
- Steinberg élastique algorithms overview 
- Steinberg Standard algorithms overview 
- Time Stretching & Pitch Shifting: Comparison Part I
- Audio time stretching and pitch scaling — Wikipedia
- Guitar Pitch Shifter — Introduction
- Jens Johansson The Phase Vocoder: A Tutorial
- Time Stretching And Pitch Shifting of Audio Signals — An Overview