How do I implement assertions?

16 Sep 2016 by Jonathan

In part 1 of the series I talked about various error handling strategies and when to use which one. In particular, I said that function preconditions should only be checked with debug assertions, i.e. only in debug mode.

The C library provides the macro assert() for checking a condition only if NDEBUG is not defined. But as with most things coming from C, it is a simple but sometimes insufficient solution. The biggest problem I have with it is that it is global: you either have assertions everywhere or nowhere. This is bad, because you might not want to have assertions enabled in a library, only in your own code. For that reason, many library programmers write an assertion macro themselves, over and over again.

Instead, let’s write the same thing ourselves, only better, and as something we can easily reuse.

Source code available on GitHub.

The problems with `assert()`

While assert() does the job well, it has a couple of problems:

1. There is no way to specify an additional message giving more information about the failed condition; it only shows the stringified expression. This leads to hacks like assert(cond && "my message"), which work because a string literal is always true. An additional message is useful if the condition alone cannot give much information, as with assert(false). Furthermore, sometimes you need to pass other additional parameters.
2. It is global: Either all assertions are active or none. You cannot control assertions for a single module.
3. It prints an implementation-defined message in an implementation-defined way. You might want to control that, maybe integrating it into your logging code.
4. It does not support levels of assertions. Some assertions are more expensive than others, so you might want more gradual control.
5. It uses a macro, a lower-case one even! Macros aren’t nice and their use should be minimized.

So let’s try to write a better assert(), in a generic way.

The first approach

This is what a first take would look like. It is probably how you would write your own assertion macro:

#include <cstdlib> // std::abort

struct source_location
{
    const char* file_name;
    unsigned line_number;
    const char* function_name;
};

#define CUR_SOURCE_LOCATION source_location{__FILE__, __LINE__, __func__}

void do_assert(bool expr, const source_location& loc, const char* expression)
{
    if (!expr)
    {
        // handle failed assertion
        std::abort();
    }
}

#if DEBUG_ASSERT_ENABLED
    #define DEBUG_ASSERT(Expr) \
        do_assert(Expr, CUR_SOURCE_LOCATION, #Expr)
#else
    #define DEBUG_ASSERT(Expr)
#endif

I’ve defined a helper struct that contains information about the source location. The function do_assert() does the actual work; the macro just forwards to it.

This avoids any do ... while(0) trickery. Macros should be as small as possible.

Then we have the macro that just obtains the current source location, which is used in the actual assertion macro. Assertions can be enabled or disabled by setting the DEBUG_ASSERT_ENABLED macro.
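To see the pieces working together, here is a minimal, self-contained sketch of the first approach in use. The function checked_divide(), the output format and the default value of DEBUG_ASSERT_ENABLED are assumptions made for illustration only.

```cpp
#include <cstdio>
#include <cstdlib>

struct source_location
{
    const char* file_name;
    unsigned    line_number;
    const char* function_name;
};

#define CUR_SOURCE_LOCATION source_location{__FILE__, __LINE__, __func__}

inline void do_assert(bool expr, const source_location& loc, const char* expression)
{
    if (!expr)
    {
        // assumed output format: report where and what failed, then terminate
        std::fprintf(stderr, "%s:%u: %s: assertion '%s' failed\n",
                     loc.file_name, loc.line_number, loc.function_name, expression);
        std::abort();
    }
}

// assumption: assertions are on unless the build system defines the macro as 0
#ifndef DEBUG_ASSERT_ENABLED
    #define DEBUG_ASSERT_ENABLED 1
#endif

#if DEBUG_ASSERT_ENABLED
    #define DEBUG_ASSERT(Expr) do_assert(Expr, CUR_SOURCE_LOCATION, #Expr)
#else
    #define DEBUG_ASSERT(Expr)
#endif

// hypothetical function with a precondition: the divisor must not be zero
inline int checked_divide(int a, int b)
{
    DEBUG_ASSERT(b != 0);
    return a / b;
}
```

Calling checked_divide(10, 2) simply returns the quotient; calling it with a zero divisor while assertions are enabled prints the location and aborts.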

Possible pitfall: unused variable warning

If you’ve ever compiled a release build with warnings enabled, you know that any variable that is only used in an assertion will trigger an “unused variable” warning.

You might decide to prevent that by writing the disabled version of the macro like so:

#define DEBUG_ASSERT(Expr) (void)Expr

Don’t do this!

I’ve made that mistake too, and it is horrible. Now the expression will be evaluated even if assertions are disabled. If the expression is sufficiently expensive, this has a big performance cost. Consider the following code:

iterator binary_search(iterator begin, iterator end, int value)
{
    assert(is_sorted(begin, end));
    // binary search
}

is_sorted() is a linear operation, while binary_search() is O(log n). Even if assertions are disabled, is_sorted() might still be evaluated, because the compiler cannot prove that it has no side effects!

When I made that mistake, I was in a very similar situation. The performance was really bad.
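The difference is easy to observe with a check that counts its own calls. The sketch below compares the (void) cast with one commonly used alternative that is not from the code above: wrapping the expression in sizeof, whose operand is never evaluated but still mentions the variables, which is usually enough to silence the warning. All names here are hypothetical.

```cpp
// counts how often the (pretend) expensive check actually runs
inline int& evaluation_count()
{
    static int count = 0;
    return count;
}

inline bool expensive_check()
{
    ++evaluation_count();
    return true;
}

// the version warned against above: silences the warning but still evaluates
#define ASSERT_OFF_EVALUATING(Expr) (void)(Expr)

// assumed alternative: the operand of sizeof is an unevaluated context
#define ASSERT_OFF_UNEVALUATED(Expr) (void)sizeof(Expr)

inline int count_evaluating()
{
    evaluation_count() = 0;
    ASSERT_OFF_EVALUATING(expensive_check());
    return evaluation_count();
}

inline int count_unevaluated()
{
    evaluation_count() = 0;
    ASSERT_OFF_UNEVALUATED(expensive_check());
    return evaluation_count();
}
```

With the cast the check still runs once; with sizeof it does not run at all.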

Anyway, this DEBUG_ASSERT() isn’t much better than assert(), so let’s tackle that.

Making it customizable and modular

We can actually solve both problems 2 and 3 (global control and implementation-defined output) with a simple addition: a policy. This is an additional template parameter that controls whether the assertion is active and how to print the message. You’d define your own Handler for each module where you want separate control over the assertions.

template <class Handler>
void do_assert(bool expr, const source_location& loc, const char* expression) noexcept
{
    if (Handler::value && !expr)
    {
        // handle failed assertion
        Handler::handle(loc, expression);
        std::abort();
    }
}

#define DEBUG_ASSERT(Expr, Handler) \
    do_assert<Handler>(Expr, CUR_SOURCE_LOCATION, #Expr)

Instead of hard coding how to handle a failed assertion, we call a static handle() function on the given Handler.

I’ve made do_assert() noexcept to prevent exceptions thrown by the Handler from leaving the function and kept the std::abort() call in case the handler function returns.

The Handler also controls whether the expression will be checked, through a member constant value (like std::true_type/std::false_type). The assertion macro now unconditionally forwards to do_assert().
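A handler for this version only needs a value constant and a static handle() function. The following sketch defines two hypothetical module handlers, parser_asserts (enabled) and network_asserts (disabled); the module names, helper functions and message format are assumptions for illustration.

```cpp
#include <cstdio>
#include <cstdlib>
#include <type_traits>

struct source_location
{
    const char* file_name;
    unsigned    line_number;
    const char* function_name;
};

#define CUR_SOURCE_LOCATION source_location{__FILE__, __LINE__, __func__}

template <class Handler>
void do_assert(bool expr, const source_location& loc, const char* expression) noexcept
{
    if (Handler::value && !expr)
    {
        Handler::handle(loc, expression);
        std::abort();
    }
}

#define DEBUG_ASSERT(Expr, Handler) \
    do_assert<Handler>(Expr, CUR_SOURCE_LOCATION, #Expr)

// hypothetical handler: assertions enabled, failures reported on stderr
struct parser_asserts : std::true_type
{
    static void handle(const source_location& loc, const char* expression) noexcept
    {
        std::fprintf(stderr, "[parser] %s:%u: assertion '%s' failed\n",
                     loc.file_name, loc.line_number, expression);
    }
};

// hypothetical handler: assertions disabled for this module
struct network_asserts : std::false_type
{
    static void handle(const source_location&, const char*) noexcept {}
};

inline int parse_digit(char c)
{
    DEBUG_ASSERT(c >= '0' && c <= '9', parser_asserts);
    return c - '0';
}

inline int clamp_port(int port)
{
    DEBUG_ASSERT(port > 0, network_asserts); // never fires: handler is disabled
    return port > 0 ? port : 0;
}
```

Each module picks its own handler at the call site, so turning assertions off for one module leaves the other untouched.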

But this code has the same problem as described in the pitfall: it will always evaluate the expression, and it even branches on Handler::value!

The second problem can be solved easily: Handler::value is a constant, so we can use an emulation of constexpr if. But how do we prevent evaluation of the expression?

We use a neat trick, a lambda:

template <class Handler, class Expr>
void do_assert(std::true_type, const Expr& e, const source_location& loc, const char* expression) noexcept
{
    if (!e())
    {
        Handler::handle(loc, expression);
        std::abort();
    }
}

template <class Handler, class Expr>
void do_assert(std::false_type, const Expr&, const source_location&, const char*) noexcept {}

template <class Handler, class Expr>
void do_assert(const Expr& e, const source_location& loc, const char* expression)
{
    do_assert<Handler>(Handler{}, e, loc, expression);
}

#define DEBUG_ASSERT(Expr, Handler) \
    do_assert<Handler>([&] { return Expr; }, CUR_SOURCE_LOCATION, #Expr)

This code now assumes that Handler inherits from std::true_type or std::false_type.

We do a “classical” tag dispatching to do a static dispatch. The more important part is the change of the expression handling: Instead of passing a bool value directly - this would mean evaluating the expression - the macro creates a lambda that returns the expression. Now the expression will be evaluated only if the lambda is called, and that happens only if assertions are enabled.

The trick to wrap something in a lambda for deferred evaluation is useful in all kinds of situations, such as optional checks where you don’t want a macro. In my memory library I use it for my double deallocation checks, for example.
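Whether the expression is really skipped can be checked with a condition that counts its own evaluations. This is a sketch built on the tag dispatching version above; the handlers enabled_handler and disabled_handler and the counting helpers are hypothetical names.

```cpp
#include <cstdlib>
#include <type_traits>

struct source_location
{
    const char* file_name;
    unsigned    line_number;
    const char* function_name;
};

#define CUR_SOURCE_LOCATION source_location{__FILE__, __LINE__, __func__}

// enabled: call the lambda, which evaluates the wrapped expression
template <class Handler, class Expr>
void do_assert(std::true_type, const Expr& e, const source_location& loc,
               const char* expression) noexcept
{
    if (!e())
    {
        Handler::handle(loc, expression);
        std::abort();
    }
}

// disabled: the lambda is never called
template <class Handler, class Expr>
void do_assert(std::false_type, const Expr&, const source_location&, const char*) noexcept
{
}

// trampoline: converts the handler to its true_type/false_type base
template <class Handler, class Expr>
void do_assert(const Expr& e, const source_location& loc, const char* expression) noexcept
{
    do_assert<Handler>(Handler{}, e, loc, expression);
}

#define DEBUG_ASSERT(Expr, Handler) \
    do_assert<Handler>([&] { return Expr; }, CUR_SOURCE_LOCATION, #Expr)

struct enabled_handler : std::true_type
{
    static void handle(const source_location&, const char*) noexcept {}
};

struct disabled_handler : std::false_type
{
    static void handle(const source_location&, const char*) noexcept {}
};

inline int& evaluation_count()
{
    static int count = 0;
    return count;
}

inline bool counted_check()
{
    ++evaluation_count();
    return true;
}

inline int evaluations_when_enabled()
{
    evaluation_count() = 0;
    DEBUG_ASSERT(counted_check(), enabled_handler);
    return evaluation_count();
}

inline int evaluations_when_disabled()
{
    evaluation_count() = 0;
    DEBUG_ASSERT(counted_check(), disabled_handler);
    return evaluation_count();
}
```

The condition runs once with the enabled handler and not at all with the disabled one, even though the macro expands to the same code in both cases.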

But does it have overhead?

The macro is always active, so it will always call the do_assert() function. This is different from conditional compilation where the macro expands to nothing. So is there some overhead?

I’ve cherry-picked some compilers from <gcc.godbolt.org>. When compiling without optimizations there is only a call to do_assert() that forwards to the no-op version. The expression will not be touched, and already at the first level of optimizations the call is eliminated completely.

I wanted to improve the code generation in the case where optimizations are disabled, so I’ve switched to SFINAE to select the overload instead of tag dispatching. This removes the need for the trampoline function that inserts the tag. The macro will now call the no-op version directly. I further marked it as force-inline, so that the compiler will inline it even without optimizations. Then the only thing it does is create the source_location object.

But as before: with any optimizations enabled, it is as if the macro expanded to nothing.
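A rough sketch of that variant could look like the following. The FORCE_INLINE macro is a hypothetical name; it relies on compiler-specific extensions (the always_inline attribute on GCC and Clang, __forceinline on MSVC) and falls back to plain inline elsewhere. The handlers and the half() helpers are likewise made up for illustration.

```cpp
#include <cstdlib>
#include <type_traits>

#if defined(__GNUC__) // GCC and Clang
    #define FORCE_INLINE inline __attribute__((always_inline))
#elif defined(_MSC_VER)
    #define FORCE_INLINE __forceinline
#else
    #define FORCE_INLINE inline
#endif

struct source_location
{
    const char* file_name;
    unsigned    line_number;
    const char* function_name;
};

#define CUR_SOURCE_LOCATION source_location{__FILE__, __LINE__, __func__}

// selected when Handler::value is true
template <class Handler, class Expr>
auto do_assert(const Expr& e, const source_location& loc, const char* expression) noexcept
    -> typename std::enable_if<Handler::value>::type
{
    if (!e())
    {
        Handler::handle(loc, expression);
        std::abort();
    }
}

// selected when Handler::value is false: a no-op the macro calls directly
template <class Handler, class Expr>
FORCE_INLINE auto do_assert(const Expr&, const source_location&, const char*) noexcept
    -> typename std::enable_if<!Handler::value>::type
{
}

#define DEBUG_ASSERT(Expr, Handler) \
    do_assert<Handler>([&] { return Expr; }, CUR_SOURCE_LOCATION, #Expr)

struct checks_on : std::true_type
{
    static void handle(const source_location&, const char*) noexcept {}
};

struct checks_off : std::false_type
{
    static void handle(const source_location&, const char*) noexcept {}
};

// precondition: n is even, checked only when the handler is enabled
inline int half_checked(int n)
{
    DEBUG_ASSERT(n % 2 == 0, checks_on);
    return n / 2;
}

inline int half_unchecked(int n)
{
    DEBUG_ASSERT(n % 2 == 0, checks_off); // violated precondition goes unnoticed
    return n / 2;
}
```

Because exactly one overload survives substitution for a given handler, no tag object and no trampoline are needed.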

Adding assertion levels

With that approach it is very easy to add different levels of assertions:

template <class Handler, unsigned Level, class Expr>
auto do_assert(const Expr& expr, const source_location& loc, const char* expression) noexcept
-> typename std::enable_if<Level <= Handler::level>::type
{
    static_assert(Level > 0, "level of an assertion must not be 0");
    if (!expr())
    {
        Handler::handle(loc, expression);
        std::abort();
    }
}

template <class Handler, unsigned Level, class Expr>
auto do_assert(const Expr&, const source_location&, const char*) noexcept
-> typename std::enable_if<(Level > Handler::level)>::type {}

#define DEBUG_ASSERT(Expr, Handler, Level) \
    do_assert<Handler, Level>([&] { return Expr; }, CUR_SOURCE_LOCATION, #Expr)

This also switches to SFINAE instead of the tag based approach.

Instead of switching on Handler::value to determine whether assertions are activated, it now switches on the condition Level <= Handler::level. The higher the Handler::level, the more assertions are activated; a Handler::level of 0 means that no assertions are executed.

Note that this means that the minimum level of a particular assertion is 1.
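As a usage sketch, consider a hypothetical handler container_asserts with level 2: an assertion of level 1 is checked, while one of level 3 is skipped without evaluating its condition. The handler and the counting helpers are assumptions for illustration.

```cpp
#include <cstdlib>
#include <type_traits>

struct source_location
{
    const char* file_name;
    unsigned    line_number;
    const char* function_name;
};

#define CUR_SOURCE_LOCATION source_location{__FILE__, __LINE__, __func__}

template <class Handler, unsigned Level, class Expr>
auto do_assert(const Expr& expr, const source_location& loc, const char* expression) noexcept
    -> typename std::enable_if<(Level <= Handler::level)>::type
{
    static_assert(Level > 0, "level of an assertion must not be 0");
    if (!expr())
    {
        Handler::handle(loc, expression);
        std::abort();
    }
}

template <class Handler, unsigned Level, class Expr>
auto do_assert(const Expr&, const source_location&, const char*) noexcept
    -> typename std::enable_if<(Level > Handler::level)>::type
{
}

#define DEBUG_ASSERT(Expr, Handler, Level) \
    do_assert<Handler, Level>([&] { return Expr; }, CUR_SOURCE_LOCATION, #Expr)

// hypothetical module handler: assertions up to and including level 2 are active
struct container_asserts
{
    static constexpr unsigned level = 2;

    static void handle(const source_location&, const char*) noexcept {}
};

inline int& evaluation_count()
{
    static int count = 0;
    return count;
}

inline bool counted_check()
{
    ++evaluation_count();
    return true;
}

inline int evaluations_at_level_1()
{
    evaluation_count() = 0;
    DEBUG_ASSERT(counted_check(), container_asserts, 1); // 1 <= 2: checked
    return evaluation_count();
}

inline int evaluations_at_level_3()
{
    evaluation_count() = 0;
    DEBUG_ASSERT(counted_check(), container_asserts, 3); // 3 > 2: skipped
    return evaluation_count();
}
```

Cheap checks would get a low level and expensive ones a high level, so raising the handler's level gradually turns on more of them.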

The final step: Adding a message

It is trivial to add a message to the assertion: just add an additional parameter that will be passed to the handler. But sometimes you do not want assertions with a message, because the condition gives enough information. It would be nice to be able to overload the macro, but you cannot do that. The same goes for the level: we might not want to specify it every time either. Furthermore, because the handler is generic, it can take additional arguments.

So we need an assertion macro that should handle any number of arguments - a variadic macro:

template <unsigned Level>
using level = std::integral_constant<unsigned, Level>;

// overload 1, with level, enabled
template <class Expr, class Handler, unsigned Level, typename ... Args>
auto do_assert(const Expr& expr, const source_location& loc, const char* expression,
               Handler, level<Level>,
               Args&&... args) noexcept
-> typename std::enable_if<Level <= Handler::level>::type
{
    static_assert(Level > 0, "level of an assertion must not be 0");
    if (!expr())
    {
        Handler::handle(loc, expression, std::forward<Args>(args)...);
        std::abort();
    }
}

// overload 1, with level, disabled
template <class Expr, class Handler, unsigned Level, typename ... Args>
auto do_assert(const Expr&, const source_location&, const char*,
               Handler, level<Level>,
               Args&&...) noexcept
-> typename std::enable_if<(Level > Handler::level)>::type {}

// overload 2, without level, enabled
template <class Expr, class Handler, typename ... Args>
auto do_assert(const Expr& expr, const source_location& loc, const char* expression,
               Handler,
               Args&&... args) noexcept
-> typename std::enable_if<Handler::level != 0>::type
{
    if (!expr())
    {
        Handler::handle(loc, expression, std::forward<Args>(args)...);
        std::abort();
    }
}

// overload 2, without level, disabled
template <class Expr, class Handler, typename ... Args>
auto do_assert(const Expr&, const source_location&, const char*,
               Handler,
               Args&&...) noexcept
-> typename std::enable_if<Handler::level == 0>::type {}

#define DEBUG_ASSERT(Expr, ...) \
    do_assert([&] { return Expr; }, CUR_SOURCE_LOCATION, #Expr, __VA_ARGS__)

We have two parameters that must be given: the expression and the handler. Because the variadic part of a macro cannot be empty, we only name the first required parameter; the handler becomes the first variadic argument. All variadic arguments are passed as arguments to the function call.

This changes the usage: Whereas before Handler could be the type name and Level a constant, they are now regular function parameters and need to be adjusted. Handler must be an object of the handler type and Level an object of the type level<N>. This allows argument deduction to figure out the appropriate template parameters.

The above code also supports any number of additional arguments that are just forwarded to the handler function. I want to allow the following calling variants:

1. DEBUG_ASSERT(expr, handler{}) - no level, no additional arguments
2. DEBUG_ASSERT(expr, handler{}, level<4>{}) - level but no additional arguments
3. DEBUG_ASSERT(expr, handler{}, msg) - no level but additional argument (a message)
4. DEBUG_ASSERT(expr, handler{}, level<4>{}, msg) - level and additional argument (a message)

To support this we need two overloads of do_assert(). The first one handles all calls where we have a level (variants 2 and 4), the second one handles the two other cases without a level (variants 1 and 3).
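The four calling variants can be exercised together in one sketch. The handler module_asserts, its optional message parameter and the counting helpers are hypothetical; the do_assert() overloads follow the listing above. When a level is passed, the overload with the level<Level> parameter is preferred because it is the more specialized template.

```cpp
#include <cstdio>
#include <cstdlib>
#include <type_traits>
#include <utility>

struct source_location { const char* file_name; unsigned line_number; const char* function_name; };
#define CUR_SOURCE_LOCATION source_location{__FILE__, __LINE__, __func__}

template <unsigned Level>
using level = std::integral_constant<unsigned, Level>;

// overload 1, with level, enabled
template <class Expr, class Handler, unsigned Level, typename... Args>
auto do_assert(const Expr& expr, const source_location& loc, const char* expression,
               Handler, level<Level>, Args&&... args) noexcept
    -> typename std::enable_if<(Level <= Handler::level)>::type
{
    static_assert(Level > 0, "level of an assertion must not be 0");
    if (!expr())
    {
        Handler::handle(loc, expression, std::forward<Args>(args)...);
        std::abort();
    }
}

// overload 1, with level, disabled
template <class Expr, class Handler, unsigned Level, typename... Args>
auto do_assert(const Expr&, const source_location&, const char*,
               Handler, level<Level>, Args&&...) noexcept
    -> typename std::enable_if<(Level > Handler::level)>::type {}

// overload 2, without level, enabled
template <class Expr, class Handler, typename... Args>
auto do_assert(const Expr& expr, const source_location& loc, const char* expression,
               Handler, Args&&... args) noexcept
    -> typename std::enable_if<Handler::level != 0>::type
{
    if (!expr())
    {
        Handler::handle(loc, expression, std::forward<Args>(args)...);
        std::abort();
    }
}

// overload 2, without level, disabled
template <class Expr, class Handler, typename... Args>
auto do_assert(const Expr&, const source_location&, const char*,
               Handler, Args&&...) noexcept
    -> typename std::enable_if<Handler::level == 0>::type {}

#define DEBUG_ASSERT(Expr, ...) \
    do_assert([&] { return Expr; }, CUR_SOURCE_LOCATION, #Expr, __VA_ARGS__)

// hypothetical handler: level 2, accepts an optional message
struct module_asserts
{
    static constexpr unsigned level = 2;

    static void handle(const source_location& loc, const char* expression,
                       const char* message = nullptr) noexcept
    {
        std::fprintf(stderr, "%s:%u: assertion '%s' failed%s%s\n", loc.file_name,
                     loc.line_number, expression, message ? ": " : "", message ? message : "");
    }
};

inline int& evaluation_count() { static int count = 0; return count; }
inline bool counted_check() { ++evaluation_count(); return true; }

// runs all four variants and returns how many conditions were evaluated
inline int count_evaluated_variants()
{
    evaluation_count() = 0;
    DEBUG_ASSERT(counted_check(), module_asserts{});                            // checked
    DEBUG_ASSERT(counted_check(), module_asserts{}, level<1>{});                // checked
    DEBUG_ASSERT(counted_check(), module_asserts{}, "with a message");          // checked
    DEBUG_ASSERT(counted_check(), module_asserts{}, level<3>{}, "too costly");  // skipped
    return evaluation_count();
}
```

Only the level 3 assertion is skipped here, since the handler's level is 2.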

But it is still a macro!

One of the problems I had with assert() was that it is a macro. Yet, this is still a macro!

But it is a massive improvement: We do not need the macro to disable the assertion anymore, only for three things:

1. Get the current source location.
2. Stringify the expression.
3. Convert the expression to a lambda to enable delayed evaluation.

There is hope for 1.: The Library Fundamentals TS v2 contains std::experimental::source_location. This class represents a location in the source code, like the struct I’ve written. But its static member function current() does compiler magic to obtain it instead of using macros. Furthermore, if you use it like so:

void foo(std::experimental::source_location loc = std::experimental::source_location::current());

loc will have the source location of the caller, not of the parameter declaration! This is exactly what is needed for stuff like assertion macros.
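The same default-argument behaviour can be sketched without the library class by using the compiler builtins __builtin_FILE(), __builtin_LINE() and __builtin_FUNCTION(). This assumes GCC or Clang, which provide these builtins as extensions; it is an illustration of the mechanism, not the definition of std::experimental::source_location. The helper functions are hypothetical.

```cpp
struct source_location
{
    const char* file_name;
    unsigned    line_number;
    const char* function_name;

    // the builtins in the default arguments are evaluated where current() is called
    static source_location current(const char* file     = __builtin_FILE(),
                                   unsigned    line     = __builtin_LINE(),
                                   const char* function = __builtin_FUNCTION()) noexcept
    {
        return {file, line, function};
    }
};

// the default argument is evaluated at each call site of line_of_caller()
inline unsigned line_of_caller(source_location loc = source_location::current())
{
    return loc.line_number;
}

inline unsigned first_call()
{
    return line_of_caller();
}

inline unsigned second_call()
{
    return line_of_caller();
}
```

The two helper functions report different line numbers, namely those of their own calls to line_of_caller(), not the line where the default argument is declared.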

Sadly, we cannot replace the macro with something for 2. and 3., this must be done manually by the caller. So there is no way to get rid of the macro while keeping the flexibility.

Conclusion

We’ve written a simple assertion utility that is flexible, generic and supports per-module levels of assertions. While I was writing the post, I decided to publish the code in the form of a header-only library: debug-assert.

It provides some additional code like easily generating module handlers:

struct my_module
: debug_assert::set_level<2>, // set the level, normally done via buildsystem macro
  debug_assert::default_handler // use the default handler
{};

Simply copy the header into your project to start using a new and improved assertion macro. Hopefully it can prevent you from writing an assertion macro for every single project where you need to separately control assertions. It is currently just a very small and quickly written library. If you have any ideas to improve it, let me know!

This blog post was written for my old blog design and ported over. If there are any issues, please let me know.
