Tutorial: Emulating strong/opaque typedefs in C++

19 Oct 2016 by Jonathan

Last week, I released my type_safe library. I described its features in the corresponding blog post, but because that post got rather long, I couldn’t cover one feature: strong typedefs.

Strong or opaque typedefs are a very powerful feature if you want to prevent errors with the type system – and as I’ve been advocating for, you want that. Unlike “normal” typedefs, they are a true type definition: they create a new type and allow stuff like overloading on them and/or prevent implicit conversions.

Sadly, C++ doesn’t provide a native way to create them, so you have to resort to a library-based emulation.

BTW, type_safe received a couple of requested features: There are improvements to the monadic optional functions (bind(), map(), unwrap() as well as a new transform()), multi-visitation of optionals and ArithmeticPolicy to control over/underflow behavior of the ts::integer<T>.

Motivation

Suppose your code has to deal with some units. Now you could employ the same technique as the excellent std::chrono library, but maybe you just need meters and kilograms and it would be overkill. To make it more clear which variables store which unit, you define some type aliases:

using meter = int;
using kilogram = int;

Instead of declaring your heights as int height, you write meter height. Everything is wonderful until you want to write a function to calculate the body mass index:

int bmi(meter height, kilogram weight);

Hours pass by, the deadline approaches and late at night you quickly need to call that function somewhere:

auto result = bmi(w, h);

You forgot the correct order of arguments, call the function incorrectly and waste a lot of time debugging.

Now, clearly a meter is not a kilogram, so it should be an error to convert one to the other. But the compiler does not know that; the type alias is just that: a different name for the same type. Strong typedefs can help here: They create a new type with the same properties as the original one, but there are no implicit conversions from one strong typedef type to another.
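To make the problem concrete, here is a minimal sketch. The body of bmi() is a toy integer formula I made up so the example can be called, and alias_demo() is just a hypothetical helper:

```cpp
#include <cassert>
#include <type_traits>

using meter    = int;
using kilogram = int;

// Both aliases name the very same type, so the compiler cannot tell them apart.
static_assert(std::is_same<meter, kilogram>::value,
              "type aliases do not create new types");

// Assumption: a toy integer BMI, only here to make the example callable.
int bmi(meter height, kilogram weight)
{
    return weight / (height * height);
}

// Hypothetical helper showing the intended and the swapped call.
int alias_demo()
{
    meter    h = 2;
    kilogram w = 80;
    assert(bmi(h, w) == 20); // intended call
    return bmi(w, h);        // swapped call, accepted by the compiler
}
```

The swapped call compiles just like the correct one; it only yields a different (wrong) number at runtime.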

Let’s write them.

Doing everything manually

We can of course get strong typedefs very easily: Just write a user-defined type:

class meter
{
public:
    explicit meter(int val)
    : value_(val) {}

    explicit operator int() const noexcept
    {
        return value_;
    }

private:
    int value_;
};

We’ve created our new type meter, which is explicitly convertible to and from int. The explicit conversion from int is useful to prevent errors like:

bmi(70, 180);

Once again we messed up the parameter order but if the new types were implicitly convertible, it would work just fine. The explicit conversion to int on the other hand could be implicit. This would allow:

void func(int);
…
func(meter(5));

But I find it cleaner if you need a cast there to show your intent. Making the conversion to int explicit also prevents a lot of other things, however:

auto m1 = meter(4);
m1 += 3; // error
auto m2 = m1 - meter(3); // error
if (m2 < m1) // error
    …

Yes, that code is nonsense, just realized that as well. I can’t write examples.

meter is not an int, so you can’t do anything with it. You’d have to overload every operator you want to use. This is a lot of work, so nobody does that.
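To get a feeling for the amount of boilerplate, here is a sketch of what the manual approach could look like for just the three operations used above. Note that I add a meter to a meter instead of an int, and manual_overloads_demo() is a hypothetical helper:

```cpp
#include <cassert>

class meter
{
public:
    explicit meter(int val) : value_(val) {}

    explicit operator int() const noexcept
    {
        return value_;
    }

    // Every operation has to be spelled out by hand.
    meter& operator+=(const meter& other) noexcept
    {
        value_ += other.value_;
        return *this;
    }

    friend meter operator-(const meter& lhs, const meter& rhs) noexcept
    {
        return meter(lhs.value_ - rhs.value_);
    }

    friend bool operator<(const meter& lhs, const meter& rhs) noexcept
    {
        return lhs.value_ < rhs.value_;
    }

private:
    int value_;
};

// Hypothetical helper mirroring the snippet above.
int manual_overloads_demo()
{
    auto m1 = meter(4);
    m1 += meter(3);          // m1 holds 7
    auto m2 = m1 - meter(3); // m2 holds 4
    assert(m2 < m1);
    return static_cast<int>(m2);
}
```

And that is only three operators for a single type; kilogram would need the same code all over again.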

Luckily, C++ gives us at least a way to write that work in a library.

Modular library

The basic idea is the following: Write many “modules” that each implement some functionality. Then you can write your strong typedef by defining a new class type and inheriting from all the modules you want.

The basic module defines the conversion and stores the value:

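#include <type_traits> // std::is_nothrow_move_constructible
#include <utility>     // std::move, std::swap

template <class Tag, typename T>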
class strong_typedef
{
public:
    strong_typedef() : value_()
    {
    }

    explicit strong_typedef(const T& value) : value_(value)
    {
    }

    explicit strong_typedef(T&& value)
        noexcept(std::is_nothrow_move_constructible<T>::value)
    : value_(std::move(value))
    {
    }

    explicit operator T&() noexcept
    {
        return value_;
    }

    explicit operator const T&() const noexcept
    {
        return value_;
    }

    friend void swap(strong_typedef& a, strong_typedef& b) noexcept
    {
        using std::swap;
        swap(static_cast<T&>(a), static_cast<T&>(b));
    }

private:
    T value_;
};

It provides explicit conversion to and from the underlying type as well as swap(). Copy/move ctor/assignment are implicit and the default constructor does value-initialization.

For those who don’t know the difference: If you have a strong typedef to int, the automatically generated constructor would not initialize it, while this one will initialize it with 0.
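A small sketch of that difference, using two hypothetical structs that are not part of the library:

```cpp
#include <cassert>

// Implicitly generated default constructor: `implicit_ctor x;` leaves
// value_ with an indeterminate value, so reading it would be undefined behavior.
struct implicit_ctor
{
    int value_;
};

// Same constructor as in strong_typedef: value_() value-initializes the
// member, which means zero for an int.
struct value_initialized
{
    value_initialized() : value_() {}

    int value_;
};

// Hypothetical helper: only the value-initialized member is safe to read.
int value_initialized_demo()
{
    value_initialized v;
    return v.value_;
}
```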

The Tag is used to differentiate between strong typedefs to the same underlying type; it can be just the new type itself.

Note that it does not provide any other public member, so it does not bloat the interface in any way. It also does not provide assignment from the underlying type.

With this module we can create our meter type now like so:

struct meter : strong_typedef<meter, int>
{
    // make constructors available
    using strong_typedef::strong_typedef;

    // overload required operators...
};

This module takes care of creating and storing the value, but you still need to write the interface. That’s where other modules come in. But first we need a way to get the underlying type - the interface is so minimal, it does not provide a way to get it!

But no worries, it can be made a non-member very easily. A first approach can be partial template specializations:

template <typename T>
struct underlying_type_impl;

template <typename Tag, typename T>
struct underlying_type_impl<strong_typedef<Tag, T>>
{
    using type = T;
};

template <typename T>
using underlying_type = typename underlying_type_impl<T>::type;

With partial template specializations you can decompose a type and extract its template arguments. But this approach does not work here because we create a new strong typedef by inheriting from the basic module. underlying_type<meter> would be ill-formed because meter inherits from strong_typedef and is not the class itself. So we need a way that allows a derived-to-base conversion - a function:

template <typename Tag, typename T>
T underlying_type_impl(strong_typedef<Tag, T>);

template <typename T>
using underlying_type
  = decltype(underlying_type_impl(std::declval<T>()));

Like with partial specializations we can get the template arguments but this time it allows for implicit conversions.
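Here is a self-contained sketch to check that the function-based trait really sees through the inheritance. The strong_typedef is a trimmed-down copy of the basic module, and underlying_value() is a hypothetical helper:

```cpp
#include <cassert>
#include <type_traits>
#include <utility>

// Trimmed-down version of the basic module, just enough for the trait.
template <class Tag, typename T>
class strong_typedef
{
public:
    strong_typedef() : value_() {}
    explicit strong_typedef(const T& value) : value_(value) {}

    explicit operator T&() noexcept { return value_; }
    explicit operator const T&() const noexcept { return value_; }

private:
    T value_;
};

// Declaration only: it is never called, just inspected inside decltype.
template <typename Tag, typename T>
T underlying_type_impl(strong_typedef<Tag, T>);

template <typename T>
using underlying_type
  = decltype(underlying_type_impl(std::declval<T>()));

struct meter : strong_typedef<meter, int>
{
    using strong_typedef::strong_typedef;
};

// Template argument deduction accepts the derived class meter here.
static_assert(std::is_same<underlying_type<meter>, int>::value,
              "meter is a strong typedef to int");

// Hypothetical helper: read the stored value through the trait.
int underlying_value(const meter& m)
{
    return static_cast<const underlying_type<meter>&>(m);
}
```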

Now we can write a module to implement addition for a strong typedef:

template <class StrongTypedef>
struct addition
{
    friend StrongTypedef& operator+=(StrongTypedef& lhs,
                                     const StrongTypedef& rhs)
    {
        using type = underlying_type<StrongTypedef>;
        static_cast<type&>(lhs) += static_cast<const type&>(rhs);
        return lhs;
    }

    friend StrongTypedef operator+(const StrongTypedef& lhs,
                                   const StrongTypedef& rhs)
    {
        using type = underlying_type<StrongTypedef>;
        return StrongTypedef(static_cast<const type&>(lhs)
                             + static_cast<const type&>(rhs));
    }
};

This is just a tiny class that only creates some friend functions. The problem is that we want to conditionally provide operators for our strong typedef type. An elegant way to do this is to use those friend functions. In case you didn’t know: if you write a friend function definition inside the class, the function name is not visible to ordinary lookup in the enclosing namespace; it is only found via argument-dependent lookup (ADL).

This is perfect here. We simply create friend functions in our module that overload the operator for our strong typedef type. When we inherit from the module, the friend functions are available for the derived class, but not for anything else.
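If you want to see that lookup rule in isolation, here is a sketch with a named function instead of an operator. All names in it (lib, doubling, with_module, doubled) are made up for illustration:

```cpp
#include <cassert>

namespace lib
{
    template <class Derived>
    struct doubling
    {
        // Defined inside the class: the name is only found through ADL.
        friend int doubled(const Derived& d)
        {
            return 2 * d.value;
        }
    };

    // Inheriting from the module makes doubled() available for this type.
    struct with_module : doubling<with_module>
    {
        int value;
    };
}

// Hypothetical helper calling the hidden friend.
int hidden_friend_demo()
{
    lib::with_module w;
    w.value = 21;
    // Found via ADL, because doubling<with_module> is a base class of w's type.
    // Writing lib::doubled(w) instead would not compile: qualified lookup
    // does not see the friend.
    return doubled(w);
}
```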

Did you know that operator+= can be a non-member as well? While we could simply make it a member function of addition, this can lead to problems because it requires a conversion of this. If we combine it with the mixed_addition defined later, this will lead to an ambiguous conversion.

The approach in the module is simple: we convert both arguments to the underlying type, which should provide the operator, do the operation and convert the result back. This return type conversion is very important, otherwise we would be losing our abstraction!

Then we can use our module like so:

struct meter
: strong_typedef<meter, int>, addition<meter>
{
    using strong_typedef::strong_typedef;
};

And the following code is already well-formed:

meter a(4);
meter b(5);
b += meter(1);
meter c = a + b;

But maybe we want addition with the underlying type and/or some other type? Simple, create a mixed_addition<StrongTypedef, OtherType> module and inherit from it as well.
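A possible sketch of such a module follows. This is my guess at a minimal version for illustration, not necessarily how type_safe implements it; the strong_typedef is again trimmed down and mixed_addition_demo() is a hypothetical helper:

```cpp
#include <cassert>
#include <utility>

// Trimmed-down basic module and trait, repeated to keep this self-contained.
template <class Tag, typename T>
class strong_typedef
{
public:
    strong_typedef() : value_() {}
    explicit strong_typedef(const T& value) : value_(value) {}
    explicit operator T&() noexcept { return value_; }
    explicit operator const T&() const noexcept { return value_; }

private:
    T value_;
};

template <typename Tag, typename T>
T underlying_type_impl(strong_typedef<Tag, T>);

template <typename T>
using underlying_type
  = decltype(underlying_type_impl(std::declval<T>()));

// Sketch: addition between a strong typedef and some other type.
template <class StrongTypedef, typename Other>
struct mixed_addition
{
    friend StrongTypedef& operator+=(StrongTypedef& lhs, const Other& rhs)
    {
        using type = underlying_type<StrongTypedef>;
        static_cast<type&>(lhs) += rhs;
        return lhs;
    }

    friend StrongTypedef operator+(const StrongTypedef& lhs, const Other& rhs)
    {
        using type = underlying_type<StrongTypedef>;
        return StrongTypedef(static_cast<const type&>(lhs) + rhs);
    }

    friend StrongTypedef operator+(const Other& lhs, const StrongTypedef& rhs)
    {
        using type = underlying_type<StrongTypedef>;
        return StrongTypedef(lhs + static_cast<const type&>(rhs));
    }
};

struct meter : strong_typedef<meter, int>, mixed_addition<meter, int>
{
    using strong_typedef::strong_typedef;
};

// Hypothetical helper exercising all three overloads.
int mixed_addition_demo()
{
    meter a(4);
    a += 2;          // a holds 6
    meter b = a + 3; // b holds 9
    meter c = 1 + b; // c holds 10
    return static_cast<int>(c);
}
```

The result of each operation is still a meter, so the abstraction is kept even when one operand is a plain int.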

With this approach we can create modules for all other common operator overloads. We can even create multi-modules:

template <class StrongTypedef>
struct integer_arithmetic : unary_plus<StrongTypedef>,
                            unary_minus<StrongTypedef>,
                            addition<StrongTypedef>,
                            subtraction<StrongTypedef>,
                            multiplication<StrongTypedef>,
                            division<StrongTypedef>,
                            modulo<StrongTypedef>,
                            increment<StrongTypedef>,
                            decrement<StrongTypedef>
{
};

But why not overload every operator directly?

But why are we using this modular design? Why not provide everything in the strong_typedef directly, screw the entire inheritance and write:

struct meter_tag {};

using meter = strong_typedef<meter_tag, int>;

Well, because type safety. That’s why.

The built-in types are quite general. They provide a lot of operations. But often when creating a strong typedef you add some level of semantics on top of them. And sometimes, some operations just don’t make sense!

For example, suppose you’re dealing with integer handles, like those used in APIs such as OpenGL. To prevent implicitly passing regular integers as a handle, you create a strong typedef, and imagine it would generate all the operator overloads:

struct my_handle_tag {};

using my_handle = strong_typedef<my_handle_tag, unsigned>;

Now you are able to write nonsense code like:

my_handle h;
++h; // increment a handle
h *= my_handle(5); // multiply a handle by 5
auto h2 = h / my_handle(2); // sure, divide by 2
…

You get the point.

For a handle type you do not want arithmetic! You only want equality and maybe relational comparison, but not much more.

For that reason, the basic strong_typedef module I’ve described does not create any operations, so it can be used as basis in all situations. If you want some overloads, inherit from the module or overload the operators yourself.
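With the modular design, a handle type could be sketched like this. The equality_comparison module below is written out by hand for illustration and handle_demo() is a hypothetical helper; the strong_typedef is the trimmed-down version again:

```cpp
#include <cassert>
#include <utility>

// Trimmed-down basic module and trait, repeated to keep this self-contained.
template <class Tag, typename T>
class strong_typedef
{
public:
    strong_typedef() : value_() {}
    explicit strong_typedef(const T& value) : value_(value) {}
    explicit operator T&() noexcept { return value_; }
    explicit operator const T&() const noexcept { return value_; }

private:
    T value_;
};

template <typename Tag, typename T>
T underlying_type_impl(strong_typedef<Tag, T>);

template <typename T>
using underlying_type
  = decltype(underlying_type_impl(std::declval<T>()));

// Sketch of a module that only provides == and !=.
template <class StrongTypedef>
struct equality_comparison
{
    friend bool operator==(const StrongTypedef& lhs, const StrongTypedef& rhs)
    {
        using type = underlying_type<StrongTypedef>;
        return static_cast<const type&>(lhs) == static_cast<const type&>(rhs);
    }

    friend bool operator!=(const StrongTypedef& lhs, const StrongTypedef& rhs)
    {
        return !(lhs == rhs);
    }
};

struct my_handle : strong_typedef<my_handle, unsigned>,
                   equality_comparison<my_handle>
{
    using strong_typedef::strong_typedef;
};

// Hypothetical helper: comparison works, arithmetic does not exist.
bool handle_demo()
{
    my_handle a(1u), b(1u), c(2u);
    // ++a; a * b; a / b; // none of these compile, no such module was inherited
    return a == b && a != c;
}
```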

What about user-defined types?

Okay, now we’ve written modules for all the common operator overloads and can create strong typedefs to integers and even iterators:

struct my_random_access_iterator
: strong_typedef<my_random_access_iterator, int*>,
  random_access_iterator<my_random_access_iterator, int>
{};

But the interfaces of some types do not consist solely of operators (citation needed). To be precise: user-defined types also have named member functions.

And this is where strong typedef emulation fails. While the operators have (reasonable) semantics and a well-defined interface, arbitrary member functions do not.

So you can’t write generic modules (usually), you’d have to bite the bullet:

struct my_new_udt
: strong_typedef<my_new_udt, udt>
{
    void foo(my_new_udt& u)
    {
        static_cast<udt&>(*this).foo(static_cast<udt&>(u));
    }

    my_new_udt bar(int i) const
    {
        return my_new_udt(static_cast<const udt&>(*this).bar(i));
    }

    my_new_udt& foobar()
    {
        auto& result = static_cast<udt&>(*this).foobar();
        // Uhm, how am I supposed to convert it back, exactly?
    }
};

This is verbose. There is no real solution to that problem either.

There is the operator.() proposal which would allow calling functions on the underlying type without knowing them, but it does not convert arguments or return types to the strong typedef type instead of the underlying.

This is exactly why we need strong typedefs as a language feature or at least some form of reflection to do this kind of work automagically. To be fair, the situation isn’t so bad, because more often than not you need a strong typedef to a built-in type and/or can add a phantom type like the Tag used in the strong_typedef here to differentiate between otherwise identical types.

But for the situations where you can’t do that, you’re screwed.

Conclusion

Strong typedefs are a great way to add more semantics to your types and catch even more errors at compile time. But they are rarely used in C++ because C++ lacks a native way to create one. While you can emulate them quite well for built-in types, using them for user-defined types is very verbose, so the language really needs native support for them.

The strong typedef facility shown here is provided by type_safe. I’ve already written many modules for you, they are available in the sub-namespace strong_typedef_op. If you haven’t already, you can also check out my previous post that outlines the other features of this library.