Mathematics behind Comparison #1: Equality and Equivalence Relations

20 Jun 2018 by Jonathan

In order to sort a collection of elements you need to provide a sorting predicate that determines when one element is less than the other. This predicate must “induce a strict total ordering on the equivalence classes” according to cppreference. Wait, what?

The upcoming C++ spaceship operator implements a three way comparison, i.e. it is a single function that can return the results of <, == and > combined. But related to it are terms like “strong equality” and “weak ordering” which are somewhat confusing if you don’t have the mathematical background.

So let’s untangle it: This series will explain both the mathematics behind equality and ordering, as well as give concrete guidelines for implementing the comparison operators and the spaceship operator.

This part covers equality and equivalence relations. What does it mean for two objects to be equal? What are the mathematical properties and C++ semantics it needs to fulfill? How do I implement proper equality comparison in C++?

In the following parts we’ll look at ordering relations, the new three way comparison and algorithms like sorting and searching on various orderings.

Basic Terminology

We want to use math to help us define the semantics of operator== and operator<. For that, we need to translate C++ into mathematics. I’m going to use (slightly adapted) terminology from Elements of Programming for that:

A value is the abstract, mathematical idea of an entity. The number 42 is a value, or the string "Hello World!". They are abstract and unchanging and we can talk about values using mathematics. Objects on the other hand are the concrete things we actually handle in C++. They store a value somewhere in memory and we can change the value they currently store. How the values are stored and which values can be stored is controlled by the type of the object. It defines two things: a set of possible values and the representation of those values in memory.

So for example int i = 42; creates a new object of type int currently holding the value 42. This is (usually) stored as the signed two’s complement of 42 using four bytes. short j = 42; also stores the value 42 but using only two bytes, so it has a different representation in memory. When we later do ++i we change the value of the object i to 43, but we didn’t change the value 42.

An operator== in C++ is a function that takes two objects of a type and returns whether or not they are equal.

In mathematics equality is some “operation” that takes two elements of a set and returns whether or not they are equal. Using the value of an object we can talk about operator== in C++ using mathematics: two objects are equal if their values are equal.

Let’s look at equality in mathematics in more detail.

Binary Relation

Equality (and comparison) are generalized as binary relations. A binary relation R over a set A is simply a set of pairs. Those are all the elements that are in relation with each other.

So for example, consider the set of colors C := {yellow, red, green, blue, cyan, magenta}. We can define a binary relation “is complement of” (or ↔) by listing all pairs of complement colors: ↔ := {(yellow, blue), (blue, yellow), (red, cyan), (cyan, red), (green, magenta), (magenta, green)}.

↔ := looks weird, but it just defines a set called ↔. You can mentally replace it by “is complement of” when reading.

If we have two elements of the set a, b ∈ A we write a R b ("a is related to b as defined by R") if (a, b) ∈ R.

So for example yellow ↔ blue because (yellow, blue) ∈ ↔.

Equivalence Relation

When talking about equality we naturally expect special properties from the binary relation:

Every element should be equal to itself. A relation with that property is called reflexive.
If a is equal to b, then b should also be equal to a. A relation with that property is symmetric.
And finally if two elements a and b are equal and b is equal to some other element c, then naturally a should be equal to c as well. A relation with that property is called transitive.

Every binary relation that is reflexive, symmetric and transitive is called an equivalence relation. Such a relation defines some kind of equality, it is a generalized form of “equal”.

Our is_complement_of relation is not an equivalence relation:

It is not reflexive: no color is the complement of itself.
It is not transitive: if we have three colors a, b, c where a ↔ b and b ↔ c, then a = c because every color has a unique complement. But a ↔ a is false because it is not reflexive.
But it is symmetric: I’ve deliberately put in every pair again with the order reversed.

And naturally the classical = of mathematics is the true equality. It is a relation defined like so: = := {(a, a) | a ∈ A}, i.e. it consists of only the pairs (a, a) for all elements of the set A. In other words: every element is equal to itself but only equal to itself.

You can verify yourself that equality is indeed an equivalence relation, i.e. that it is reflexive, symmetric and transitive.

For our color set C equality is thus defined like this = := {(yellow, yellow), (red, red), (green, green), (blue, blue), (cyan, cyan), (magenta, magenta)}.

Equality is the strictest equivalence relation you can imagine: it just barely enough to qualify as an equivalence relation, every other one must contain at least all those pairs. However, the weaker equivalence relations are useful as well. In those more elements are considered equivalent than are actually equal.

For example, we can define an equivalence relation of colors as I would see them: cyan is just an ugly blue. So I would say that, in addition to the other equalities, cyan is equivalent to blue.

Mathematically, this equivalence relation—let’s call it ≅—is this set: ≅ := {(yellow, yellow), (red, red), (green, green), (blue, blue), (cyan, cyan), (cyan, blue), (blue, cyan), (magenta, magenta)}.

I added (cyan, blue) and (blue, cyan) to the pairs we had previously. This was necessary so my relation is still symmetric (I don’t need to worry about transitive as only two distinct elements are equivalent).

Now blue ≅ blue, but also blue ≅ cyan.

Designing Equivalence Relations in C++

So far, so mathematical.

In C++ we don’t deal with sets, we deal with types. And those types only indirectly define a set, the set of their values.

For some types it is pretty straightforward what values they have. This type clearly defines the color set C from earlier:

enum class color
{
    yellow,
    red,
    green,
    blue,
    cyan,
    magenta
};

For other types it is less clear what their value actually is. Consider foo:

struct foo
{
    int* ptr;
    int size;
};

Its value could either be a pointer plus size pair, meaning foo would be like the upcoming std::span<int>. Or its value could be an array of size integers, meaning foo would be like std::vector<int>. It all depends on the additional semantics.

If you don’t know the exact value of your type this is a good indicator that you should not add a comparison for the type.

In general, there are two kinds of types in C++: You have types that are just encoding of mathematical constructs, like containers, integers, or even something like std::optional. They are usually found in libraries.

And then there are types that encode behaviors and actions, like GUI or business logic classes. Consider a button class, what is it’s value?

There is no good answer here. Sure, mathematically we can say it is a tuple of a position, label, click state, and callback, but that doesn’t really capture the essence of a button. It’s more than the sum of its part. So defining an equivalence relation on this tuple doesn’t really work.

This second category of types are types where you can’t talk about them in a mathematical way very easily. And when this can’t be done, it is also difficult to specify an equivalence relation.

If your type isn’t copyable (but only moveable), this is another indicator. It usually is some unique owner over a resource. As there is only one owner no two objects will actually be equal.

For example, the operator== of std::unique_ptr compares whether two pointers own the same pointer. If you exclude really weird deleter this will only return true if both are nullptr (or you are comparing one object to itself due to the reflexive nature of equivalence relations).

This leads to the following rule:

Rule: If you don’t know the value of your type, don’t implement an equality relation.

In particular, do not add an operator== just because you want to put your types into a hash table or use std::find(), for example. Instead provide a custom comparison predicate or use std::find_if(). Of course those need to be an equivalence relation comparing some value, the value you’re searching / want to use for lookup. But this can be a different value than the value of the entire object, we might want to lookup using the label of a button, for example.

If we have a clear value, we can define a mathematical equivalence relation on this set of value. In math it is just a set of pairs, but in C++ its a function taking two objects and returning a bool. In particular, it can either be an operator== or a named function.

When should we use which?

Rule: If you implement an equivalence relation of the values that is a true equality (i.e. values are only equal to themselves), name this function operator== and provide a matching operator!=. If you implement a weaker equivalence relation of your values (i.e. something like my color equivalence), give this function a meaningful name that is not operator==.

In other words: only implement an operator== if you are actually implementing a true equality, not some weaker equivalence. There are two reasons for that.

First is the principle of least astonishment: Users expect that your operator== returns whether two objects are truly equal not just some equivalence. Even if they don’t know the mathematics they have an intuitive grasp. Furthermore, there is only one equality but many equivalences: Why single out any single one of them and give them the special name? Giving it a special name also makes it clear which equivalence it is.

The other reason is more mathematical: Having an operator== that is a true equality means that the most functions are regular. A regular function is a function that will give you equal outputs when you call it with equal inputs.

Consider std::string as an example. A regular function of std::string is operator[]: if you call it with equal inputs (i.e. equal strings and indices), it will give you equal outputs (i.e. the same character). std::string::c_str() on the other hand is not regular: while the pointee of equal strings will be the same sequence of characters, it may point to a different memory address; the pointers are not equal.

Now consider a hypothetical ci_string. It’s just like std::string, but its operator== does a case insensitive comparison. It doesn’t implement the true equality: unequal sequence of characters can be equivalent (if they are only unequal because of different cases). But this means that operator[] is no longer a regular function:

ci_string a = "hello";
ci_string b = "HELLO";
assert(a == b);
assert(a[0] == b[0]); // fails!
// even though we're calling the function with equal inputs

If we change ci_string so that it will always convert all characters to lowercase after every modification, operator[] suddenly becomes regular. It will always return a lowercase character. But this is expected as we’ve now changed the value of the ci_string. Previously it was “sequence of characters” just like std::string. Now it is “sequence of lowercase characters” and the operator== implements the true equality.

The equality semantics depend a lot on the definition of the value of your type, which is why it is so important that you know exactly what kind of value your type has.

In the case of colors we want an operator== that implements the value equality = and a named function foonathan_thinks_its_equal() implementing ≅. For consistency, we should also add an operator!= that negates the operator== (we don’t need it for the named function).

Let’s gloss over the fact that operator== with the correct semantics is already implemented by the language.

bool operator==(color a, color b);
bool operator!=(color a, color b);

bool foonathan_thinks_its_equal(color a, color b);

Note that it can make sense to have an equivalence relation without any equality. This could be because the true equality operation is too expensive so it shouldn’t be done in an operator that might be called accidentally. Or true equality is impossible to implement, only weaker equivalence. But then you should not provide any operator== instead of making it weaker equivalence.

Implementing Equivalence Relations in C++

We’ve decided the set of values we want to model, the equivalence relation we would like to implement and the interface of the implementation. How do we write it?

Let’s tackle true equality first. Then two objects are equal if and only if their current values are equal. So how do we get from object to value?

When implementing equality operations we’re dealing with compound types, e.g. struct or class. They can have multiple properties, either directly or indirectly. The direct properties are the member variables of the type, the indirect properties are objects that can be reached from pointers which are either direct or indirect properties. Or properties are functions that compute new properties based on the value of other properties.

For example, std::vector<T> has three direct properties: The pointer to the memory, the size and the capacity. And the indirect properties are all objects in the memory it points to. But it could also have three pointers as direct properties and compute size and capacity by subtracting them. However, this is equivalent for the value of the vector.

Not all properties are part of the value of the object. For example, the value of a std::shared_ptr is the pointer it owns, not the control count, and not the indirect property, the pointee. So in order to compare two shared pointers only the pointer needs to be compared.

Of course deciding what the value of std::shared_ptr really is, is a different question. I just assumed it is the pointer it owns and not the value of the object it points to.

On the other hand for std::vector the value is the sequence of elements stored in the vector. So comparing two vector elements compares the elements, the indirect properties. It does not compare the pointer itself, but the objects it points to.

Let’s call the properties that are part of the value salient, and the other properties are non-salient. Two objects are then equal if all their salient properties are equal.

Comparing the properties is usually done with their equality but sometimes it needs to be overridden. This is most notably the case for pointers (or things behaving like pointers). Their equality is just address equality, because that is the value of a pointer. But sometimes equality of the pointees themselves is desired, so we can’t use the provided operator== but need to write custom code.

Rule: Implement equality, i.e. an operator==, by comparing the properties that actually form the value. Those can be direct members or other objects indirectly reachable from pointers.

Once we know how to implement equality, implementing a less strict equivalence relation can be done in terms of that: Just also return true for objects that are equivalent but not equal, again by comparing the properties that make up the value.

In the color case the equivalence relation looks like this:

bool foonathan_thinks_its_equal(color a, color b)
{
    if (a == b)
        // trivial case due to the reflexive property
        return true;
    else if (a == color::cyan && b == color::blue
          || a == color::blue && b == color::cyan)
        // in addition blue is equivalent to cyan
        return true;
    else
        // but no other colors are equal
        return false;
}

When you have just an equivalence relation and no equality, you can still do it. The definition of equality is then just inlined into the equivalence implementation.

Relation between Copy and Equality

Finally, I want to quickly touch on the relation between copy operations and equality: A copy operation copies the value of the object into another object, an equality operation compares two values.

This means:

Rule: Copies must always compare equal.

Furthermore, their implementation is closely related: An equality operation compares all salient properties, usually with the operator== of the property, but sometimes overriding it (e.g. to do a comparison of the pointee, not just the address of a pointer). A copy operation copies all salient properties, usually with the default copy operation of the property, but sometimes overriding it (e.g. to do a copy of the pointee, not just the pointer).

So just like we’re using the term shallow copy, e.g. types that just copy the pointers and not the pointee, we can also use the term shallow equality, e.g. types that just compare the pointers and not the pointee. On the other side we also have deep copy and deep equality.

This leads to the following rule:

Rule: If you have deep copy, you should also implement deep equality. If you have shallow copy, you should also implement shallow equality.

That way your operations are consistent and work naturally. Consider std::vector again: std::vector<T>::data() is non-salient, it is not part of the value of the vector and so isn’t preserved in a copy operation (as the copy will use new memory data() will return a different pointer). And naturally the deep equality of std::vector<T> doesn’t compare it:

std::vector<int> a = …;
std::vector<int> b = a;
assert(a == b); // succeeds
assert(a.data() == b.data()); // fails

Note that data() is not a regular function as well.

But also capacity() is non-salient: we can change it without changing the value.

b.reserve(b.capacity() * 2); // this just changes the capacity, not the elements
assert(a == b); // so they are still equal
assert(a.capacity() == b.capacity()); // but with different capacities

The actual elements are salient, when we change them, we change the value:

b.front()++; // change the value
assert(a != b); // so they are different

Rule: When changing a salient property an object is now longer equal to the object it was equal to before.

There is a type in the standard library that doesn’t quite follow those rules: std::string_view. It has shallow copy (just copies the pointers) but deep equality (compares the entire string). This means it breaks the rules of equality stated above:

std::string str = "Hello World!";

std::string_view view = str;
std::string_view copy = view;
assert(view == copy); // this is true

str[0] = 'h'; // changing a salient property (according to equality)
assert(view == copy); // but this is still true!

What is the value of std::string_view?

If you ask the copy operation it says “its value is a pointer and a size”, if you ask the equality “its value is a sequence of characters”. This dual definition of value can be confusing, but luckily its consequences are limited as std::string_view can’t modify the sequence of characters through itself and its most common uses don’t make this error possible. Read this essay on the Abseil blog for more information.

And finally, I can’t talk about equality without mentioning regular types, but this blog post is very long already. So I encourage you to go and read up on them (or just go and buy Elements of Programming).

Conclusion

Deciding about the semantics of operator== is fundamentally about deciding what the value of your objects truly is. Then you implement your copy operations so they copy the value and your comparison operators so they compare two values for the mathematical equality. If you then need to implement weaker equalities, namely equivalences, do it as named functions.

If you are not really sure what the value of your objects is, don’t define an operator==. A big sign of that is that you don’t actually have a copy operation for your type or it is not something mathematical.

This blog post was written for my old blog design and ported over. If there are any issues, please let me know.