How To Return An Iterator From a Function

Ryan James Spencer

March 19 2020, 7:40PM

Iterators are handy ways to describe the potential for looping over data but without eagerly evaluating it. In other words, we can describe the shape of looping over things but we only do work when we call something like collect. The compiler can do several optimizations around iterators and each collect needs to allocate memory to store our results. To avoid a lot of needless allocations it pays to returns iterators sometimes from functions.

Alright, that's great but you're hitting a wall returning one from a function. I'll provide two ways you can do just that:

We could use the impl Trait syntax where Trait is the trait in question we want to return. In this case, we'd have:

fn unboxed_iterator() -> impl Iterator<Item = usize> {
  (0..3).into_iter()
}

This is where the compiler will determine the exact type that is being returned and substitute that in. This approach, albeit with static dispatch, has its limits. You can read Bodil Stokke's wonderful introduction to parser combinators as an example where the impl Trait approach starts to get too complex and require turning to our next approach of Boxing the iterators. This static approach also doesn't work when we have anonymous types, such as with async functions. Marking a function as async has support from the compiler to return the right type.

We could also use a Box. It is the simplest way to package up an iterator but at the cost of allocation. Anything behind a Box is allocated on the heap.

fn boxed_iterator() -> Box<dyn Iterator<Item = usize>> {
  Box::new((0..3).into_iter())
}

We have to pay the price of dereferencing a pointer each time we want to deal with this specific boxed iterator. We can manipulate both iterators in the same way because Box implements Deref which lets you access the methods underneath. So we can call map and friends on the resulting Box. Both of these forms can usually be used with one another, as, for example, we can chain both together since they both can be turned IntoIterators.

fn unboxed_iterator() -> impl Iterator<Item = usize> {
  (0..3).into_iter()
}

fn boxed_iterator() -> Box<dyn Iterator<Item = usize>> {
  Box::new((0..3).into_iter())
}

fn main() {
  dbg!(boxed_iterator().chain(unboxed_iterator()).collect::<Vec<_>>());
}

The above prints:

[src/main.rs:10] boxed_iterator().chain(unboxed_iterator()).collect::<Vec<_>>() = [
    0,
    1,
    2,
    0,
    1,
    2,
]

You preferably want to use the impl Trait approach as much as you can and fall back to the Box approach when that starts to fail or become inordinately slow to compile. With time this restriction may go away as work on the compiler continues.


Join the Newsletter