A Case Study In Type Extension: From all_equal to all_equal_value

8 min

Welcome back to the Rust Review series, where we continue our exploration of Rust programming with insights from some of the brightest minds in the community. In this seventh instalment, we’re pleased to welcome back Dr. Ittay Weiss, a seasoned Rust developer who previously captivated our audience with his deep understanding of Rust's intricacies.

Currently shaping the future of code at Stockly in Paris, France, Dr. Weiss brings a wealth of expertise and a passion for pushing Rust to its limits. In this latest blog, he delves into the nuances of extending types from bool to Result<_,_> in the context of success versus failure checks, offering a thought-provoking analysis that underscores the complexity and elegance of Rust programming.



A while ago I was coding at work and I needed some helper function. I suspected I would find it in the itertools crate, and I was right. However, later a colleague pointed out to me that we had a helper function in our own code base, with the exact same name, and so I could have used that one. I was surprised to find the duplication, so I looked at the code. The return type of the two functions did not agree. Something was afoot.

In this case study I present this situation, engulfed in a discussion about extending types from bool to Result<_,_> in the context of success vs. failure checks. The situation presents two competing ways to extend the method all_equal, that checks if all items in an iterator are equal, to the method all_equal_value, that also returns the common value. We shall see that pushing type principles to the limit serves as a tie-breaker and announces a winner.



All items are equal

The type bool, in the context where it is used to indicate success or failure, can be seen as a Result<(), ()>, that is as a Result type where both the expected success type and the error type are the unit type (). For instance, suppose you have an iterator and you would like to check if all the items it iterates over are equal. This is a success or failure situation: success is "all items are equal (but we do not care what the common value is)" and failure is "not all items are equal (and we do not care what exactly went wrong)". The itertools crate provides you with the all_equal method that does just that:


fn all_equal(&mut self) -> bool

    where

        Self: Sized,

        Self::Item: PartialEq,

    {

        match self.next() {

            None => true,

            Some(a) => self.all(|x| a == x),

        }

    }

which simply attempts to get the first item. If there is no such item, then the iterator is empty, and so, vacuously, all its items are equal -- returning true. Otherwise, it checks that all remaining items are equal to the first one.


Quite simple. But what if we want to provide more information than just success or failure? What if we do care about the successful outcome as well as a failure reason? Then we need to extend the return type of all_equal from bool $\cong$ Result<(), ()> to Result<T, E> for suitable types T and E. This is less simple than one might expect.



Extending bool to Result<T, E>

We must correctly choose the types T and E. Let us look at how itertools does it:


fn all_equal_value(&mut self) -> Result<Self::Item, Option<(Self::Item, Self::Item)>>

    where

        Self: Sized,

        Self::Item: PartialEq,

    {

        let first = self.next().ok_or(None)?;

        let other = self.find(|x| x != &first);

        if let Some(other) = other {

            Err(Some((first, other)))

        } else {

            Ok(first)

        }

    }

The return type is Result<Self::Item, Option<(Self::Item, Self::Item)>>. This seems to make sense: if all items are equal, then we expect to be able to return the common item; if not all items are equal, then we expect to be able to present a pair of different items.


Let us form another expectation of the extension to the Result<_,_> type: it should agree with the all_equal method, in the following sense. Recall that is_ok turns a Result<_, _> to a bool, simply indicating if it is an Ok(_) variant or not. In other words, we expect the function


fn all_equal_consistency<T: PartialEq>(things: &Vec<T>) {

    if !things.iter().all_equal_value().is_ok() == things.iter().all_equal() {

        panic!();

    }

}

to never panic. However, this is not the case:


#[test]

fn test_things() {

    let equal_things = vec![1, 1, 1];

    let different_things = vec![1, 2, 3];

    let no_things: Vec<i32> = vec![];


    assert_eq!(equal_things.iter().all_equal(), true);

    assert_eq!(different_things.iter().all_equal(), false);

    assert_eq!(no_things.iter().all_equal(), true);


    assert_eq!(equal_things.iter().all_equal_value(), Ok(&1));

    assert_eq!(

        different_things.iter().all_equal_value(),

        Err(Some((&1, &2)))

    );

    assert_eq!(no_things.iter().all_equal_value(), Err(None));


    all_equal_consistency(&equal_things);

    all_equal_consistency(&different_things);

    all_equal_consistency(&no_things);

}

All of the assertions pass, except for all_equal_consistency(&no_things);


Our expectation embodied by the function all_equal_consistency is very reasonable, and thus we are entitled to conclude that the similarity in the naming of the methods, all_equal and all_equal_value, carries risk.



What do you mean by all items are equal?

The above panic stems from interpreting the meaning of "all items are equal" in two different ways. Let us try to pinpoint that difference. The method all_equal interprets it as $\forall x, y \in \mathrm {Iter}.x=y$ (namely, every two elements in the iterator are equal). Which is the usual way. But what about all_equal_value? Can we obtain a mathematically precise formulation of what all_equal_value().is_ok() captures?


On non-empty iterators, it agrees with all_equal. Thus we only need to resolve the case of empty iterators, so we want a condition that will not be vacuously true. The following seems promising: $\exists !v:\mathrm {Iter::Item}.\forall x \in \mathrm {Iter}.x=v$ (namely, there exists a unique value v of type Iter::Item such that all items in the iterator are equal to it).


Does it work? Suppose we have an iterator over i32, and the iterator happens to be empty. Then for any value v: i32 the claim that $\forall x \in \mathrm {Iter}.x=v$ is vacuously true. It is therefore not the case that there is a unique such common value v, and thus, for the empty iterator, not all items are equal (in the second interpretation above). This seems to be what we wanted.



Revisiting the problem

We seem to have resolved the problem by pinpointing a semantic difference, showing that there are simply two different notions of "all items are equal". Therefore, let us rename all_equal to all_equal_to_each_other and we introduce a new method:


fn all_equal_to_unique_common_thing(&mut self) -> bool

    where

        Self: Sized,

        Self::Item: PartialEq,

    {

        match self.next() {

            None => false,

            Some(a) => self.all(|x| a == x),

        }

    }

These are clearly different entities, and each one can be meaningfully extended to return a Result<_,_> instead of a bool. The extended version of all_equal_to_unique_common_thing should seek to return the unique common thing. That dictates that the success type is T=Iter::Item. Failure can occur either if no pair of values exist in the iterator, or if some pair of distinct values exist. That dictates that the error type is `E=Option<(Iter::Item, Iter::Item)>. In other words, we seek to write:


fn all_equal_to_unique_common_thing_value(&mut self) -> Result<Iter::Item, Option<(Iter::Item, Iter::Item)>>

and this is exactly what the current version of all_equal_value does. So, in the spirit of clarity, let us rename all_equal_value to all_equal_to_unique_common_thing_value. We have made progress.


Extending all_equal_to_each_other, however, is a different matter. Failure occurs precisely when there exist two distinct items in the iterator, so clearly the error type is E=(Iter::Item, Iter::Item). But, upon success, there are two possibilities. If the iterator is not empty, then there is precisely one value of type Iter::Item to return, but if the iterator is empty, then any such value is 'the' common value. So success is witnessed either by a single value, or by all values.


Should the success type by T=HashSet<Iter::Item>? No, it should not. Not even if it were possible to create a hashset of all values of that type. Instead, what we want is the canonical common value in case of success, namely any one of the items in the iterator which serves as the common element. If the iterator is empty, then no such canonical item exists. It is now clear that T=Option<Iter::Item>. What we seek then is:


fn all_equal_to_each_other_canonical_value(&mut self) -> Result<Option<Iter::Item>, (Iter::Item, Iter::Item)>

    where

        Self: Sized,

        Self::Item: PartialEq,

    {

        let first = self.next();

        if first.is_none() {

            return Ok(None);

        }

        let other = self.find(|x| x != &first);

        if let Some(other) = other {

            Err(Some((first, other)))

        } else {

            Ok(first)

        }

    }


Seeking harmony

We now have four methods instead of two, and they only differ on what they do on empty iterators. Talk about much ado about nothing... Can we restore harmony?


Remember when we said that $\exists !v:\mathrm {Iter::Item}.\forall x \in \mathrm {Iter}.x=v$ looked promising as the interpretation of what all_equal_to_unique_thing means? We justified it using Iter::Item = i32, but let us now have a better look. Suppose that T is a type admitting exactly one value. An iterator of items of type T is necessarily constant, and if it is not empty, then all_equal_to_unique_thing will return true. Remember, critically, that on empty iterators it returns false, and so it is crucial that the mathematical statement evaluates to false as well. Now, for the empty iterator it is true for all values v of type Iter::Item that $\forall x\in \mathrm {Iter}.x=v$. When Iter::Item admits only one value, then it is equally true that there is a unique value v of type Iter::Item such that $\forall x\in \mathrm {Iter}.x=v$. In other words, the mathematical statement evaluates to true for the empty iterator over a type with a unique value.


Therefore, the method all_equal_to_unique_thing is not faithfully captured by that mathematical statement. Moreover, we see that the behaviour of the boolean truth value returned by the method is not a function of the iterator alone. It depends on the iterator and the type it iterates over. This is highly unpleasant, and it is ground enough to reject all_equal_to_unique_thing together with its extension all_equal_to_unique_thing_value, to rename all_equal_to_each_other back to all_equal, and to change the implementation of all_equal_value to match that of all_equal_to_each_other_canonical_value.



Acknowledgements

I would like to thank Guillaume Lagrange for pointing out the apparent duplication to me, and Thomas B (Ten0) for enlightening comments (and a few helpful suggestions).



If you're interested in becoming a blogger for the Rust Review series, please reach out to Charley Dowsell.  If you're a Rust professional seeking new opportunities or a hiring manager looking to recruit skilled Rust developers, we're here to assist. Don't hesitate to contact us for your talent needs or career aspirations.