Tutorial: the CRTP Interface Technique

12 Oct 2021 by Jonathan

Generic code expects that your types model certain concepts. Sometimes, the concept requires many redundant member functions in your type. A big culprit here are iterators: they require many operator overloads, most of which are trivially implemented in terms of other overloads.

CRTP, the curiously recurring template pattern, can help here and automate the boilerplate away. Let’s look at the CRTP interface technique and explore how it works.

Motivation

As motivation, consider this stable_iterator implementation. It accesses the elements of a container such as std::vector<T> via indices, instead of pointers. That way, the stable_iterator remains valid even if the container does a reallocation and moves the elements around.

template <typename Container>
class stable_iterator
{
    const Container* _container;
    std::size_t _index;

public:
    //=== Typedefs ===//
    using value_type     = typename Container::value_type;
    // for simplicity, no modification
    using reference_type = const value_type&;
    using pointer        = const value_type*;

    using difference_type   = std::ptrdiff_t;
    // for simplicity, no random access
    using iterator_category = std::forward_iterator_tag;

    //=== Constructors ===//
    // Create an invalid iterator.
    stable_iterator()
    : _container(nullptr), _index(0)
    {}

    stable_iterator(const Container& container, std::size_t idx)
    : _container(&container), _index(idx)
    {}

    //=== Access ===//
    reference_type operator*() const
    {
        assert(_container && _index < _container->size());
        return (*_container)[_index];
    }

    pointer operator->() const
    {
        // Address of reference returned by operator*().
        return &**this;
    }

    //=== Increment ===//
    stable_iterator& operator++()
    {
        assert(_container && _index < _container->size());
        ++_index;
        return *this;
    }

    stable_iterator operator++(int)
    {
        stable_iterator copy(*this);
        ++*this;
        return copy;
    }

    //=== Comparison ===//
    friend bool operator==(const stable_iterator& lhs,
                           const stable_iterator& rhs)
    {
        assert(lhs._container == rhs._container);
        return lhs._index == rhs._index;
    }

    // Not actually needed in C++20 due to operator rewrite rules.
    friend bool operator!=(const stable_iterator& lhs,
                           const stable_iterator& rhs)
    {
        return !(lhs == rhs);
    }
};

Note that the iterator is only stable if the _container pointer remains valid. If the container object itself moves around, it becomes invalid; it only guards against movement of the container’s elements.

This works, but it is quite a bit of code, especially when you consider that I’ve implemented only a forward iterator: bidirectional iterators require an additional operator--() (two overloads), and random access iterators operator+=(), operator-=(), operator+() (two overloads), operator-() (three overloads), operator[]() and the full comparison operators (four overloads, one in C++20). That is a lot of typing, especially if you need multiple iterators.

However, note that of the six member functions we wrote, operator->(), operator++(int) and operator!=() are entirely implemented in terms of operator*(), operator++(), and operator==(). Their implementation is pure boilerplate without any thought.

Let’s automate that.

Approach #1: `virtual` functions

The basic idea is to use inheritance and write a base class that injects the required boilerplate into our code. The only problem here is that we need to call functions defined in the derived class. Or to be more precise: we need to call functions whose signature is known, but whose implementation isn’t.

This is exactly what virtual functions are designed to do:

template <typename ReferenceType>
struct forward_iterator_interface
{
    // To be implemented by the derived class.
    virtual ReferenceType operator*() const = 0;
    virtual forward_iterator_interface& operator++() = 0;
    virtual bool operator==(const forward_iterator_interface& other) const = 0;

    // The boilerplate.
    auto operator->() const
    {
        return &**this; // virtual call
    }

    void operator++(int)
    {
        ++*this; // virtual call
    }

    bool operator!=(const forward_iterator_interface& rhs) const
    {
        return !(*this == rhs); // virtual call
    }
};

template <typename Container>
class stable_iterator
: public forward_iterator_interface<const typename Container::value_type&>
{

…

public:
    reference_type operator*() const override
    {
        assert(_container && _index < _container->size());
        return (*_container)[_index];
    }

    // Note: we can return the derived type here.
    stable_iterator& operator++() override
    {
        assert(_container && _index < _container->size());
        ++_index;
        return *this;
    }
    // Need to pull-in the other overload of operator++.
    using forward_iterator_interface<reference_type>::operator++;

    bool operator==(const forward_iterator_interface<reference_type>& _rhs) const override
    {
        auto& rhs = dynamic_cast<const stable_iterator&>(_rhs);
        assert(_container == rhs._container);
        return _index == rhs._index;
    }
};

I’ve omitted a virtual destructor as the base class isn’t meant to be used for dynamic polymorphism; it is an implementation detail of stable_iterator and not something clients should rely on.

This seems simple enough: we have added a base class forward_iterator_interface that declares the functions the derived class needs to implement as pure virtual members, and implemented the boilerplate by calling those functions. Note that we needed to templatize it, as the signature of operator*() (and thus operator->()) depends on the reference type of our iterator, and that we needed to switch to a member version of operator== as non-members can’t be virtual.

In the stable_iterator implementation, we inherit from the base class with the appropriate reference type and implement the required functions. Here we need a using declaration to prevent shadowing the operator++(int) overload of the base class, and a dynamic_cast to get the correct type in our operator==.

However, we couldn’t actually implement operator++(int) properly: it needs to return a copy of the derived object, which we can’t do. For starters, the only return type would be forward_iterator_interface, which is an abstract class, so can’t be returned. And even if we could do it, we would slice the base part of the object.

This problem can be solved using CRTP, where the base class is actually templatized on the derived type.

Approach #2: CRTP

The idea behind CRTP is that some base class takes the derived class as a template argument. That way, the static type of the derived class is known in the implementation of the base class. As such, we don’t actually need to use virtual functions anymore! Instead, we can statically downcast and call derived functions directly.

template <typename Derived>
struct forward_iterator_interface
{
    auto operator->() const
    {
        // Downcast ourselves to the derived type.
        auto& derived = static_cast<const Derived&>(*this);
        return &*derived; // delegate
    }

    Derived operator++(int)
    {
        auto& derived = static_cast<const Derived&>(*this);

        Derived copy(derived);
        ++derived; // delegate
        return copy;
    }

    friend bool operator!=(const Derived& rhs, const Derived& lhs)
    {
        return !(lhs == rhs); // delegate
    }
};

template <typename Container>
class stable_iterator
: public forward_iterator_interface<stable_iterator<Container>>
{

…

public:
    reference_type operator*() const
    {
        assert(_container && _index < _container->size());
        return (*_container)[_index];
    }

    stable_iterator& operator++()
    {
        assert(_container && _index < _container->size());
        ++_index;
        return *this;
    }
    // Need to pull-in the other overload of operator++.
    using forward_iterator_interface<stable_iterator>::operator++;

    friend bool operator==(const stable_iterator& lhs,
                           const stable_iterator& rhs)
    {
        assert(lhs._container == rhs._container);
        return lhs._index == rhs._index;
    }
};

In the CRTP base class, we don’t need to declare any virtual function. To call a function on Derived, all we need to do is downcast *this to the Derived type. This is perfectly safe: Derived is the derived type, so *this is actually a Derived object. If the user messes up and passes a wrong type to Derived, this is problematic, but only if that type also inherits from the CRTP base class, as seen here. If the user passes a type that does not inherit from it, the static_cast won’t compile.

As Derived is known in the base class we can directly use it in the interface to return the correct type from operator++(int), and accept the correct types in operator!= – no dynamic_cast necessary.

The implementation of stable_iterator is almost identical to the original one, but instead of writing all the boilerplate ourselves, we’ve inherited it from forward_iterator_interface. We still need the using declaration, however.

As an alternative approach, there is no need to continue using the names operator*(), operator++() and operator== in the derived class. We could name them, for example, dereference(), increment(), and equal() and implement all iterator operators in forward_iterator_interface by calling them. That way, we wouldn’t need the using declaration in the derived class.

Furthermore, forward_iterator_interface can also declare the iterator typedefs for us. They are then inherited as well so stable_iterator<Container>::iterator_category just works.

The CRTP Interface Technique

The general technique is as follows: We have some base class foo_interface that takes the derived class as template argument. It then implements some boilerplate methods by calling methods of the derived class using a downcast. The user class inherits from foo_interface and implements the required methods. It then gets the boilerplate for free.

// Definition.
template <typename Derived>
class foo_interface
{
public:
    using some_type = int;

    void do_sth_twice()
    {
        // Access the derived object.
        auto& derived = static_cast<Derived&>(*this);
        // Call a member function of the derived object.
        derived.do_sth();
        derived.do_sth();
    }

    static int get_static_value()
    {
        // Call a static member function of the derived type.
        return compute(Derived::get_static_value_impl(), 42);
    }

private:
    // You can also inject members as necessary.
    int special_value;
};

// Usage.
class my_foo
: public foo_interface<my_foo>
{
public:
    void do_sth() { … }

private:
    // Implementation helper only.
    static int get_static_value_impl() { … }

    // The interface class needs to be able to call the helper.
    friend class foo_interface<my_foo>;
};

Compared with traditional inheritance and virtual functions, the CRTP interface technique is more powerful, as it can also access types and static functions of the derived type. There also is no virtual function call overhead.

The derived type can also choose to override a default implementation of CRTP interface by simply implementing it itself. As other code uses the derived type only, it will call the new implementation, which shadows the inherited one. For example, our stable_iterator can choose to implement operator->() itself:

template <typename Container>
class stable_iterator
: public forward_iterator_interface<stable_iterator<Container>>
{
public:
    …

    // "Override" the inherited implementation of `operator->()`.
    auto operator->() const
    {
        // A potentially more efficient version or something.
        return _container->data() + _index;
    }
};

Note that code inside the CRTP base class will not automatically call the “overriden” version of a method, as name lookup is performed in its scope where no shadowing takes place. To anticipate overriding, the base class needs to qualify the call:

template <typename Derived>
class foo_interface
{
public:
    // Assume that derived classes may override this one.
    void some_method() { … }

    void use_some_method()
    {
        // This always calls the base version of `some_method()`.
        some_method();

        // This might call a derived version of `some_method()`.
        static_cast<Derived&>(*this).some_method();
    }
};

For a static method, it is enough to call it as Derived::some_method(). Then name lookup starts in Derived and might find a new implementation there.

A common issue with this technique is that the type Derived is incomplete while the class body of the base class is being parsed: accessing Derived outside of member function definitions will not compile.

template <typename Derived>
class forward_iterator_interface
{
public:
    // Error: use of incomplete type `Derived`.
    using reference = const typename Derived::value_type&;

    // Error: use of incomplete type `Derived`.
    typename Derived::pointer operator->() const
    {
        auto& derived = static_cast<const Derived&>(*this);
        // OK, inside the body `Derived` is complete.
        typename Derived::pointer ptr = &*derived;
        return ptr;
    }
};

As such, member functions of the CRTP base class might need the auto return type because the actual type simply can’t be named at that point. To access typedefs of Derived, such as value_type in the example above, an extra template parameter is needed.

template <typename Derived, typename ValueType>
class forward_iterator_interface
{
public:
    using reference = const ValueType&; // OK
};

template <typename Container>
class stable_iterator
: public forward_iterator_interface<stable_iterator<Container>,
            typename Container::value_type>
{
    …
};

Conclusion

Whenever you need to write multiple types sharing interface boilerplate, consider the CRTP interface technique instead. It allows you to implement the boilerplate once, and automatically add it to all the types via simple inheritance.

Real-world applications of this technique include:

lexy’s implementation of forward_iterator_interface (called _base there).
Boost.STLInterfaces, which contains a generic iterator_interface like we’ve implemented here (among other things).
C++20’s std::ranges::view_interface, which eliminates boilerplates for view types.
type_safe’s strong typedef implementation.

Motivation

Approach #1: virtual functions

Approach #2: CRTP

The CRTP Interface Technique

Conclusion

Approach #1: `virtual` functions