Controlling overload resolution #3: Tag dispatching

Overload resolution is one of C++ most complicated things and yet it works most of the time without the need to think about it. In this mini-series, I will show you how to control this complex machinery so it is even more powerful and completely under your control.

The third post shows you the power of tag dispatching to select from multiple implementations of a (templated) function. This allows powerful optimization for types with special properties.

Motivation

For example, let’s say you have a function construct() that takes a range to uninitialized memory for an array of type T and creates default-constructed objects in it. This function can be used after a call to std::malloc() to create the actual elements in it, for example.

std::malloc() returns raw memory, you need to call the constructor of the appropriate type in it before you can use it. For that reason, it is recommended to use new instead which does it automatically.

A straightforward implementation for it can look as follows:

#include <new>

template <typename T>
void construct(T *begin, T *end)
{
	for (auto cur = begin; cur != end; ++cur)
		::new(static_cast<void*>(cur)) T(); 
}

If you have never heard of placement new the statement in the loop may look a little weird. It simply calls the default constructor of T in the memory location specified by cur. The cast to void* and the scope resolution operator :: is needed to make sure that it calls the global placement new declared in the header new and not a user-defined overload.

This simple implementation has a flaw though: It is not exception safe. If the nth constructor call throws an exception, all the previous objects have already been created and must be destroyed, but the exception is propagated and the function returns with a partly constructed range. The caller doesn’t even have the information required to destroy the constructed elements, because it does not know how many have been created!

This is bad.

Let’s fix it by putting a try-catch around the loop:

#include <new>

template <typename T>
void construct(T *begin, T *end)
{
	auto cur = begin;
	try
	{
		for (; cur != end; ++cur)
			::new(static_cast<void*>(cur)) T(); 
	}
	catch (...)
	{
		for (auto new_cur = begin; new_cur != cur; ++new_cur)
			new_cur->~T();
		throw;	
	}
}

The catch will be entered when any exception is thrown by the constructor call, it will loop over all created elements and call their destructors (unlike constructor calls, this does not require a special syntax). Then it rethrows the exception. Note that it was required to move the iteration variable outside the for loop to access it in the catch clause.

Now if the nth constructor throws an exception, all created elements will be destroyed. The function will now only return with either all elements created or none.

This is the strong exception safety guarantee.

But the try-catch version is more expensive than the one without. In addition, it is unnecessary if the default constructor of T does not throw any exceptions. And as a library author, I can do such kinds of premature optimization to squeeze the maximum performance out of it, so let’s do it.

Simplest tag dispatching - std::true_type/std::false_type

Tag dispatching is a very powerful technique to select a certain implementation of a (templated) function based on the properties of the type. It uses an additional argument - the tag, which will be passed to the function call. Based on its type the corresponding overload will be selected.

It is thus very similar to SFINAE (wonderful name!) where you want to do the same, but tag dispatching is more readable and not very complicated, so I cover it first.

In the construct() example above we have two kinds of implementations: The first one shown which can be used if the type’s default constructor does not throw any exceptions and the second one if the type does not.

The most basic tag types are std::true_type and std::false_type defined in the header type_traits, if you have only two implementations as here.

They are typedefs for std::integral_constant to a bool with the value true/false.

So let’s put them in:

#include <new>
#include <type_traits>

template <typename T>
void construct(std::true_type, T *begin, T *end)
{
	for (auto cur = begin; cur != end; ++cur)
		::new(static_cast<void*>(cur)) T(); 
}

template <typename T>
void construct(std::false_type, T *begin, T *end)
{
	auto cur = begin;
	try
	{
		for (; cur != end; ++cur)
			::new(static_cast<void*>(cur)) T(); 
	}
	catch (...)
	{
		for (auto new_cur = begin; new_cur != cur; ++new_cur)
			new_cur->~T();
		throw;	
	}
}

Note that the tag argument is unnamed, since it is only required for overload resolution.

What’s the point of it, you ask. Well, we can now select the implementation based on the tag. If we have a non-throwing constructor, we pass std::true_type as first argument, otherwise std::false_type.

That’s not very convenient though. You’d have to remember which type’s default constructor does not throw and refactor if changed. And do you know whether std::vector’s default constructor throws any exceptions?

FYI: It is marked noexcept in C++17 if the Allocator-constructor does not throw.

Enter type traits: The header <type_traits> provides a bunch of useful queries about type information. For example, std::is_nothrow_default_constructible<T> provides the member constant true if the type is nothrow default constructible (duh), otherwise the constant false. And since the member constant is inserted by inheriting from std::true_type/std::false_type, this maps exactly to our overloads!

What a coincidence.

This allows calling construct() as such:

construct(std::is_nothrow_default_constructible<std::string>{}, beg, end);

Yeah, still ugly but at least maintainable.

For that reason, the tag dispatched overloads are often called by a parent function without the tag argument, that just forwards after inserting the appropriate tag type:

template <typename T>
void construct(T *begin, T *end)
{
	construct(std::is_nothrow_default_constructible<T>{}, begin, end);
}

The tagged overloads can then be renamed or hidden inside a detail namespace.

This makes the use of tag dispatching completely transparent to the user, only the two pointers need to be passed to the function, the rest is done by magic.

Extending tags: Multiple tag arguments

But for the sake of an argument, let’s say I am still not quite happy with the construct() implementation. If you use it in generic code, it sometimes does more work than necessary. For example, constructing an int is a no-op, there is no constructor that need to be called!

Actually, calling construct() for an int will value-initialize all elements, i.e. setting them to zero. This is great, but let’s assume we don’t want that.

So for the case of int and all other types with a trivial default constructor for that matter, the body of construct can be completely empty.

Combining that with the tag dispatching for the exception, gives the following:

nothrow ctor trivial ctor implementation
true true no-op
true false first implementation w/o try-catch
false true n/a (impossible combination)
false false second implementation w/ try-catch

We now have two tag arguments for each implementation overload and check for the combination:

template <typename T>
void construct(std::true_type, std::true_type, T *, T *) {} // no-op overload

template <typename T>
void construct(std::true_type, std::false_type, T *begin, T *end)
{
	simple loop 
}

template <typename T>
void construct(std::false_type, std::false_type, T *begin, T *end)
{
	try catch loop
}

Likewise, the parent overload needs to pass two arguments:

template <typename T>
void construct(T *begin, T *end)
{
	construct(std::is_nothrow_default_constructible<T>{},
			  std::is_trivially_default_constructible<T>{},
			  begin, end);
}

Extending tags: N-ary traits

But the approach shown above is not very elegant and can easily get out of hand. A better approach would be to have n different tag types instead of multiple std::true_type/std::false_type arguments.

Too represent the three cases, we define three types like so:

struct trivial_default_ctor {};
struct nothrow_default_ctor {};
struct default_ctor {};

These are our three tag types we use to distinguish the construct() implementations. Now we write a little trait that maps a type to those tags:

template <typename T>
struct default_ctor_information // I hate to come up with those names...
{
private:
	using is_nothrow = std::is_nothrow_default_constructible<T>;
	using is_trivial = std::is_trivially_default_constructible<T>;
	
	using nothrow_conditional = typename std::conditional<is_nothrow::value, nothrow_default_ctor, default_ctor>::type;
	
public:
	using type = typename std::conditional<is_trivial::value, trivial_default_ctor, nothrow_conditional>::type;
};

This trait simply uses the same type traits and std::conditional which selects a type based on a condition. This can now be used in the parent construct() overload:

template <typename T>
void construct(T *begin, T *end)
{
	construct(typename default_ctor_information<T>::type,
			  begin, end);
}

Another advantage of this technique is that you can choose your own names for the tags which makes the implementation a lot clearer.

Tag dispatching with priority

If you look a the three tag types above, you’ll notice that there is a relationship between them. A trivial_ctor implies a nothrow_ctor which implies a default_ctor. Such kind of a relationship is represented in C++ through inheritance, so hose tag types can inherit from each other:

struct default_ctor {};
struct nothrow_default_ctor : default_ctor {};
struct trivial_default_ctor : nothrow_default_ctor {};

This has an interesting consequence: An argument of type trivial_default_ctor can now be implicitly converted to nothrow_default_ctor and default_ctor, which affects overload resolution: There is a priority chain on the overloads. As specified by the ranking of the implicit conversion sequence, the compiler will first match the type itself, then its direct baseclass, then the baseclass of the baseclass and so on.

This allows you to remove for example the no-op overload for trivial types and everything still works, overload resolution selects the overload with the direct base class - nothrow_default_ctor. Likewise for the nothrowing special case.

I’ve intentionally avoided the iterator category example, since it is the one everybody uses. All iterators have an iterator_category which is one of different tag types in a class hierachy modelled after the concept, for example a RandomAccessIterator is also a BidirectionalIterator and thus the std::random_access_iterator_tag is inherited from the std::bidirectional_iterator_tag. This allows an efficient implementation of the iterator algorithms like std::advance(). The implementation taking a std::random_access_iterator_tag will call operator+=, the one taking std::forward_iterator_tag loop calling operator++ and so on. Look at this stackoverflow answer for an example.

Conclusion

Tag dispatching is a very powerful technique that allows selecting a different implementation based on certain properties of a type. One use case is optimization if a certain set of types can do things more efficiently than a generic type.

To use tag dispatching create a set of tag types (or use predefined ones like std::true_type/std::false_type) often related through a class hierachy which is similar to the concept refinement hierachy. Each implementation takes one of the tag types as first argument. A parent overload without the tag argument selects the appropriate tag type, for example through a trait class which maps types to tags, and passes it to the implementation overloads. The magic of overload resolution will select the implementation with the right (or best fitting in case of a hierachy) tag.

In the next post of the series, I will cover an alternative to tag dispatching with different use cases: SFINAE.

This blog post was written for my old blog design and ported over. If there are any issues, please let me know.