Rust’s iterators are inefficient, and here’s what we can do about it.

When Rust can have its cake and eat it too.

Contents:

  1. The two types of iterator
  2. Can we learn from history?
  3. So why do we even care?
  4. A ray of hope
  5. Two Simple Rules
TL;DR: Don’t care for a history lesson? You can safely skip to A ray of hope.

The two types of iterator

Rust’s iterators are based on external iteration, and are thus classed as external iterators. An external iterator is one where the iteration is powered by the end user. For example, Rust’s for loop desugars roughly to

let mut _iterator = $expr.into_iter();
while let Some($pattern) = _iterator.next() {
$body
}

This style of iteration is also known as pull-based iteration, since the iterator acts as a “bag” of values that you can pull things out of. When you want another item, you can take it as long as you have access to the bag. When you stop needing items, you can just put the bag down or even throw it away.

Interestingly, Rust used to use an alternative style of iteration based on internal iteration. Unsurprisingly these are called internal iterators and the iteration is powered inside the iterator itself. Traditionally they look like

$expr.each(|$pattern| $body);

Rust had an important addition to this simple signature: $body could ask for early termination by returning False, and the method itself returned False if this ever happened, or True if the loop was exhausted with none of the iterations ever having returned False.

This style of iteration is also called push-based iteration. When you are done handling one element, the iterator is free to give you the next when it wants. If you wish to stop iterating, you need to request that the caller stop doing so; you do not control the iteration so in some sense the final verdict is not yours.

Can we learn from history?

So why did Rust move away from this style of loop? The canonical discussion can of course only come from those who made the decision, but thankfully Rust’s history is well preserved. A major turning point was a discussion on the rust-dev mailing list in June 2013.

https://mail.mozilla.org/pipermail/rust-dev/2013-June/004364.html

Problem #1: Iterators adaptors used to be hideous

The previous iterator abstraction wasn’t much of an abstraction — you just passed around higher order functions to make new higher order functions. Although this mostly worked, it produced some extremely poor APIs.

fn filter<T>(
pred: &fn(&T) -> bool,
input: &fn(&fn(T) -> bool) -> bool,
output: &fn(T) -> bool
) -> bool

The syntax here is slightly odd, since this is from a very old version of Rust. This may be nicer to read in a curried form, if you are familiar with functional languages like Haskell.

filter :: (T -> bool) -> ((T -> bool) -> bool)
-> (T -> bool) -> bool

The function (T -> bool) -> bool represents an internal iterator. The inner bool lets the closure tell the iterator whether it wants to stop iterating, and the outer bool returns whether the iterator was exhausted, or broken from early.

The filter function maps the input iterator to a new iterator, which is composed of output and the final return value from filter itself. When phrased like this, the signature makes a lot more sense!

We can thus conclude that the ugliness comes largely from the design, not the iteration style itself. A more modern implementation may look more like

fn filter<I, P>(iter: I, predicate: P) -> Filter<I, P>
where I: InternalIterator,
P: FnMut(&mut I::Item) -> bool

which, unsurprisingly, looks identical to the modern iterator protocol. The major difference is that Filter would require not a next function but an each function.

impl<I, P> InternalIterator for Filter<I, P>
where I: InternalIterator,
P: FnMut(&mut I::Item) -> bool
{
type Item = I::Item;
    fn each<F>(mut self, mut f: F) -> bool
where F: FnMut(Self::Item) -> bool
{
let Filter(mut iter, mut pred) = self;
iter.each(|elem| !pred(elem) || f(elem))
}
}

Interestingly, this implementation is shorter than the real Filter’s next function, despite doing more work! This is relatively common, but not universal, for internal iteration.

Because of the design of the old version, the ergonomic problems were pervasive. However, a modern design should not — and, as we shall see, does not — suffer in the same way.

Problem #2: The compiler can’t reason about termination

Rust’s for loop supported syntax sugar for this style of iteration, removing some of the syntactic overhead. But this causes problems for ownership, as the compiler itself cannot compile break to a jump itself — it has to pass a flag to the enclosing iterator and hope that it does the right thing. Consider this case.

let mut x = String::new();
let y;
// old style for loop
for iterator |elem| {
y = x; // can only call this once!
break;
}

This code should be fine and dandy, so we’d like the compiler to allow this code. But what if iterator was defined badly?

fn iterator(f: &fn(i32) -> bool) {
f(0); // ignore result
f(1)
}

This would cause the first break to be skipped, meaning x gets moved twice! It’s clear that this can cause big problems, so the compiler was effectively forced to consider break and even return to be unreliable control flow.

Problem #3: Internal iterators are inflexible

Internal iterators are extremely prescriptive about what kind of iteration they perform. There are ways to extend this — Rust’s original version was extended to support early returns — but the problem is largely inherent. In contrast, external iterators are flexible. In fact, every external iterator can be trivially turned into an internal iterator.

fn advance<I>(iter: I, f: F) -> bool
where I: Iterator,
F: Fn(I::Elem) -> bool
{
for elem in iter {
if !f(elem) {
return false;
}
}
true
}

Many things you take for granted with external iterators, like parallel iteration and the ability to do special handling for the first iteration of a loop, can not be expressed at all with an internal iterator.

So why do we even care?

Given all of these downsides of internal iterators, and that Rust has already made the switch, why should we care? Shouldn’t we just go celebrate instead?

The reason is that sometimes internal iterators produce faster code, often much faster. Consider a simple routine to add some numbers.

let maths = (0..big).flat_map(|x| (0..x))
.filter(|x| x % 16 < 4)
.sum();

Writing this manually gives a good reference point.

let mut maths = 0;
for x in 0..big {
for y in 0..x {
if y % 16 < 4 {
maths += y;
}
}
}

So how fast are they? For big = 10000, the iterators version took 30 to 38ms, depending on the mood of the compiler, whereas the manual code took merely 20 to 22ms — a massive reduction in overhead. Moreover, adding a second filter(|x| x != 7) caused the iterator code to take 10% longer, but barely affected the manual version at all.

An internal iterator version, the implementation of which is an exercise to the reader, reached equivalent speeds to the manual version, because when the static function calls are inlined, it produces code in largely the same style as the manual reference implementation.

There are cases where internal iterators may even win by greater margins — iterating over a BTree with external iterators often adds a lot of overhead. Some iterators are entirely unaffected, though; simple iterators over arrays compile extremely well in either model.

A ray of hope

Hopefully I have convinced you that internal iterators are a good idea, though the flexibility of external iterators, as well as the safer translation to break and return, leaves external iterators a better default.

But this sucks… we now have slow external iterators and have to use a second, new interface to get maximum speed. How do we escape this curse? It turns out there’s a really simple way, and it goes back to the definition of advance:

fn advance<I>(iter: I, f: F) -> bool
where I: Iterator,
F: Fn(I::Elem) -> bool
{
for elem in iter {
if !f(elem) {
return false;
}
}
true
}

Does this remind you of any other function? Perhaps one that’s already an iterator method? The answer, of course, is that advance is just a fancy name for all! Our each is just the all method!

Two Simple Rules

Using this new knowledge, we can now extend our best practices in several ways. If everyone agrees to these rules, lots of code should end up faster.

Firstly, Rust should implement an All trait. All::all should be implemented for all iterators as a simple loop, but permit specialization. Iterator::all, as well as other iterator methods, like fold and last, should redirect to All by default.

The reason for All::all instead of redirecting directly to Iterator::all, much like count calls fold by default, is to avoid preexisting overloads from causing loops in the call graph (eg. an implementation that overloaded all to resolve to fold).

Then we can solve everything with two simple rules.

Rule #1, for iterators

When implementing a new iterator, implement All::all manually if it proves faster. Delegate to any wrapped iterator’s underlying implementation if possible.

Rule #2, for users

When writing simple loops, iterate using a method that uses All::all, like fold or all. If the loop body is large, continue to use for.

Even if Rule #2 is not followed in normal user code, Rule #1 will result in faster code in general if common functions like Vec::from_iter, Iterator::sum and even Filter::next take advantage of it.