Boiethios' blog


Generic associated types in iterators

In this article, I want to explain the term Generic Associated Types through a concrete example.

I noticed that people (especially in video games development) need some tools to iterate in various manners mutably, efficiently and safely. I tried to write some convenient iterators over vectors and slices that solve those problems, but finally, I understood that some tools cannot be written with std::iter::Iterator. Doing so led me to the comprehension of generic associated types that I will abbreviate as GATs in this article. I will explain here what GATs are and why they are needed.

Let's dive in it right now and see how std::slice::Windows and std::slice::IterMut are implemented. With those examples, I will be able to explain the issue more easily.

Implementing std::slice::Windows

This iterator yields immutable and overlapping parts of a slice. The implementation is straight-forward:

pub struct Windows<'a, T:'a> {
    v: &'a [T],
    size: usize
}

impl<'a, T> Iterator for Windows<'a, T> {
    type Item = &'a [T];

    fn next(&mut self) -> Option<&'a [T]> {
        if self.size > self.v.len() {
            None
        } else {
            let ret = Some(&self.v[..self.size]);
            self.v = &self.v[1..];
            ret
        }
    }
}

The struct contains the current slice and the size of the windows (i.e. the size of each subslice yielded). At each call of next, if there are enough items, the return value is a subslice of size self.size, and then we skip one item of the inner slice.

Well, this code compiles fine, there is no problem in it. That is because the iterator yields immutable references to itself. But what if we want to return mutables references to the iterator? Let's continue to an iterator more complicated to implement: std::slice::IterMut.

Implementing std::slice::IterMut

How we would like to do

Let's try to implement it without seeing the code in the standard library:

struct IterMut<'a, T: 'a> {
    slice: &'a mut [T],
    index: usize,
}

impl<'a, T> Iterator for IterMut<'a, T> {
    type Item = &'item mut T; // What is 'item?

    fn next(&'item mut self) -> Option<&'item mut T> { // What is 'item?
        match self.slice.get_mut(self.index) {
            None => None,
            Some(ret) => {
                self.index += 1;
                Some(ret)
            }
        }
    }
}

The code should look like this: the slice has a lifetime 'a, and each time that we call next, we should borrow self during a lifetime 'item to obtain a &mut T that lives during this 'item lifetime. This lifetime 'item must be different each time we call next because the diverse items that we borrow live during their own lifetimes.

What we describe is a method generic over a lifetime:

fn next<'item>(&'item mut self) -> Option<&'item mut T>
    where 'a: 'item
{
    // etc.
}

The function next is generic over 'item and in the where clause we say that 'a must outlives 'item because the reference to the item cannot last longer than the reference to the slice. Let's compile this:

error[E0195]: lifetime parameters or bounds on method `next` do not match the trait declaration
  --> src/main.rs:9:5
   |
9  | /     fn next<'item>(&'item mut self) -> Option<&'item mut T>
10 | |         where 'a: 'item
11 | |     {
12 | |         match self.slice.get_mut(self.index) {
...  |
18 | |         }
19 | |     }
   | |_____^ lifetimes do not match method in trait

Oops, the compiler complains! Indeed, the signature of the trait method is not generic; so our method next should not be generic either to implement the trait.

How it is done in the standard library

Then, how does the standard library do to solve this issue? In fact, it uses an unsafe code to say "that's cool, don't check the lifetimes, everything will be ok". Here is the code copy-pasted from the Rust repository:

pub struct IterMut<'a, T: 'a> {
    ptr: *mut T,
    end: *mut T,
    _marker: marker::PhantomData<&'a mut T>,
}

impl<'a, T> Iterator for IterMut<'a, T> {
    type Item = &'a mut T;

    fn next(&mut self) -> Option<&'a mut T> {
        unsafe {
            if self.ptr == self.end {
                None
            } else {
                Some(make_ref_mut!(self.ptr.post_inc()))
            }
        }
    }
}

The thing is that the struct holds pointers and not references to the slice. The pointers lifetimes are not bounded to the lifetime of the item. Because of this, it is unsafe to dereference them: they can outlive the pointed datum.

With this trick, the compiler does not know (and does not care about) how the iterator is mutably borrowed. You can verify this point with this sample code that compiles fine:

fn main() {
    let mut array = [1, 2, 3];
    {
        let mut it = array.iter_mut();

        let ref_on_1 = it.next().unwrap();
        let ref_on_2 = it.next().unwrap();

        // We can hold 2 mutables reference to the same slice
        *ref_on_1 = 0;
        *ref_on_2 = 0;
    }

    assert_eq!(array, [0, 0, 3]);
}

This whole operation, however, is not unsafe, because all the references you can take from this iterator are disjoint: in fact, in memory, you have references to different objects. So the rule of no multiple mutable aliases is well satisfied.

Now, what about an iterator of mutable references to overlapping items? What if I want to implement a WindowsMut, for example?

Implementing a WindowsMut

If we want to iterate over mutable and non-disjoint references, we cannot write the same code as for Windows. Indeed, when we yield mutable references, the compiler does care about how long the iterator is borrowed (i.e. we must give an explicit lifetime to &mut self). This permits to guarantee that we will not have two mutable references to the same item.

The second solution (with unsafe code) can compile, but this is a bad thing: with this kind of code, we can hold multiple mutable references on same items:

let mut array = [1, 2, 3];
let mut it = array.windows_mut(2);

let w1 = it.next().unwrap();
let w2 = it.next().unwrap();

// Oops bad thing:
// w1[1] references the same datum as w2[0]
w1[1] = w2[0];

So, how to do this? In fact, this article is pretty disappointing, because actually, there is no way to do this with the today Rust. What we want is to specify a lifetime for &mut self and return an item bounded to this lifetime, like we already tried, but without making the function generic:

fn next(&'item mut self) -> Option<&'item mut [T]> // What is 'item?

This is not possible because the compiler must know what is item. In fact, what we want is an associated type with a generic lifetime. With this kind of feature, we could implement this kind of iterators (so called streaming iterators) without any change to the iterators API!

There is an RFC that would permit to implement the Iterator trait in this way (note the generic lifetime in Item):

impl<'a, T> Iterator for WindowsMut<'a, T> {
    type Item<'item> = &'item mut [T];

    fn next(&mut self) -> Option<Self::Item> {
        // etc.
    }
}

The compiler would understand that &mut self is bounded to the generic lifetime of Item.

While some work on GATs already happened, the feature is nowhere near completion. If you activate the highly unstable generic_associated_types feature, you will most likely get an ICE trying to use it for now.

However, the compiler will one day use Chalk for the trait logic where GATs is already implemented. You can track the work in progress of this highly awaited feature in the dedicated github issue. Like a lot of people, I hope that the GATs will come soon!


You can put a comment there

PS. I would like to give to Lukas Kalbertodt a special thank for reviewing this article!