Ultimate Overloads

Generic algorithms in C++ operate by substituting specific types into templates that use features of the underlying types. Optimized implementations of algorithms can be selected when the type parameters satisfy certain constraints. A standard technique for this is overload selection via tag dispatching, but maybe you want a more abstract solution.

As usual, code examples are available in a github gist.

As shown previously, std::enable_if can be used to select an override if types are assignable. The condition can be arbitrarily complex; taking advantage of decltype and std::declval you could even check whether a particular member function can be called and will return a particular type:

The problem is that you need something to distinguish and prioritize the different overloads, so when multiple candidates are possible the best one is selected. In this case we used int and long. The blog post Remastered enable_if covers the options well, but the final word is in a followup post Beating overload resolution into submission which presents a solution with unbounded extensibility. It’s the one at the bottom of that post, if you don’t want to read the whole thing (though you should).

All the heavy lifting was done in those posts. I want to add a little value because some things about the solution leave me discomforted:

  1. There’s a magic number 10 used to ground the template recursion, based on an assumption that no more than 10 overloads are necessary;
  2. There’s a distinct type that’s used for the “none-of-the-above” situation, which in my view is non-orthogonal;
  3. The example doesn’t work with clang: it compiles without warning, but prints “fizzbuzz” for every N.

Let’s deal with the last one first. Each overload is of the form:

but it’s clear the underlying std::enable_if hidden inside the EnableIf isn’t doing its job.

I reduced this to LLVM bug 18677, which was promptly marked as a duplicate of LLVM bug 11723, showing that the bug has been present for about two years, so it’s clearly not a priority. In fact, it’s mentioned in the fine print of Remastered. The upshot is: using a variadic parameter pack to make signatures distinct without requiring an instance is a neat trick, but if you want to be portable to clang you can’t use it.

Fortunately, the solution in Beating shows us how to get an unbounded number of unique types that form an overload hierarchy without caring whether the template parameter is unique, so all we need do is change the canonical form back to the more traditional default template parameter:

This has the added benefit (IMO) of removing the custom template alias and directly using only constructs for which the meaning is well-defined and in the standard namespace.

The other two concerns are dealt with simply by inverting the priority indicated by the unsigned template parameter: let 0 be the lowest priority, which becomes what you use for the default case. To start the recursion, just keep track of the maximum overload count required for the particular function you need to support. Here’s the whole solution:

A trivial change, but this allows us to put the overload_weight template into a library and use it for multiple functions without having to change it if somebody adds more alternatives than were anticipated.

Diagnostics for Template Meta-Programming in C++

At C++Now 2012 Marshall Clow presented Generic Programming in C++: A Real World Example which addressed the addition of a hex/unhex pair of functions to Boost.Utility. A future post may address why I think the design for this specific feature took a wrong turn right at the start, but as a pedagogical example of intermediate C++ generic programming it’s worth viewing.

The design includes an algorithm which expects a template parameter to provide certain capabilities. The original solution used std::enable_if to disable the definition when those requirements were not met. Around 00:45:00, Stephan T. Lavavej pointed out that disabling unacceptable overloads with std::enable_if produces obscure errors uninterpretable by mortal users because the compiler won’t find a match, and that a cleaner solution is an outer function with a static_assert that invokes an inner function that implements the algorithm. After a very inconvenient interruption, comment from somebody I didn’t recognize at 00:47:40 pointed out that not all compilers terminate template expansion on the static_assert failure, so using this approach you get the static assert diagnostic followed by the no-matching-function diagnostics. The commenter went on to propose a workaround where the inner function takes a bool argument, constructed in the outer function from the std::enable_if calculation, which bypasses the body if the expansion is not valid. Unfortunately the audio is unintelligible and I can’t figure out what technique was being recommended (did he say “ mpl:bool_”, “ template bool”; is the flag a template parameter or a function parameter; …).

All that’s the topic of this post. You can get the full source for the examples at this github gist.

So let’s start with a simple example. Here’s a generic algorithm that assigns one value to another:

Here’s code that invokes it, but with types that don’t satisfy the expectations of the algorithm:

And here’s the noise that GCC 4.9.0 produces in response:

That’s not something I want my users to have to cope with. Sure, it says what the problem is, but there’s a lot of detail that’s just distracting, and it’d be a lot worse with more complex types in a more complex algorithm.

So: Assume we take the original approach from the talk and disable the generic algorithm when the types are not assignable:

What that produces is not really better:

The diagnostic is shorter, and somewhat helpful because the conditional is so simple, but still obscure and indirect.

What STL appeared to propose was to add a static assert which verifies the expectations of the parameter and emits a diagnostic when they aren’t satisfied, then delegates to the original version:

This is the same technique addressed in this blog post. And, just as the anonymous commenter in the video warned, the static assert failure didn’t prevent gcc from going on to produce the non-helpful cascading SFINAE errors:

Not better.

I don’t know what the unrecognized commenter intended as the solution, but my reconstruction is the following: put the static assert in the user-called function, then delegate to a hidden overloaded implementation that provides the working algorithm only when the constraints are met, and provides a stub with no errors when they aren’t:

Walking through, in line 21 we alias template_types_ok to a type that’s equivalent to either std::true_type or std::false_type depending on whether or not the algorithm requirements are satisfied by the type parameters. In line 22 we check the satisfiability at compile-time and provide a user-level description of any failed expectation. Then line 23 we use the type that represents the satisfiability to select an implementation that won’t have compile-time errors. That one of the implementations wouldn’t work at runtime is irrelevant because it’s selected only when the static assert prevents compilation from succeeding.

Here’s what this tells the user:

Now that’s what I want my users to see if they misuse my algorithms: a clear description of what they did wrong so they can fix things.

C++ Unit Test Frameworks

One of the first decisions in a new project is which unit testing framework to use. Traditionally I’ve used CppUnit, so I pulled down the current release and started working.

This left me unhappy as the first test produced this compile-time error:

For a couple days I worked around this by casting the integer literal to a type that satisfied the calls, but eventually I got fed up.

So I looked for alternatives. I found fault with the first two choices, but joy with the third. Herein are some examples with discussion of what they reveal about the choices. The files are available as a github gist.

The Test Criteria

Three specific assertions were found to cause trouble with various solutions, so the examples used below show all of them:

  • Comparing a std::string size() with an integer literal;
  • Pointer-equality testing for char * values;
  • Comparing a floating point result to a specific absolute accuracy

In addition, these criteria are relevant:

  • Verbosity: how much boilerplate do you have to add that isn’t really part of your test?
  • Installation overhead: is it easy to build the library for specific compiler flags or is the assumption that you build it once and share it? This matters when playing with advanced language feature flags such as -std=c++1y, which can affect linking test cases together.
  • Assertion levels: when a test fails can you control whether the test keeps going or aborts (e.g., when following assertions would be invalid if the first fails).
  • Assertion comparisons: can you express specific relations (not equal, greater than) or is it mostly a true/false capability?


Originally on SourceForge, this project has developed new life at freedesktop.org.

CppUnit comes with a standard configure/make/make install build process which installs the headers and the support library into the proper directories within a toolchain prefix. You need to provide a main routine to invoke the test driver.

CppUnit provides only one level of assertion: the test case aborts when it fails. It also has limited ability to express specific requirements (for example, there is CPPUNIT_ASSERT_EQUAL(x,y) but no CPPUNIT_ASSERT_NOT_EQUAL(x,y).

Here’s what the tests looks like with CppUnit:

There’s a lot of overhead, what with the need to define and register the suites, though it didn’t really bother me until I saw what other frameworks require. And I did have to do that irritating explicit cast to get the size comparison to compile.

The output is terse and all tests pass:


Boost is a federated collection of highly-coupled but independently maintained C++ libraries covering a wide range of capabilities. It includes Boost.Test, the unit test framework used by boost developers themselves.

Boost.Test can be used as a header-only solution, but I happened to install it in library form. This gave me a default main routine for invocation, though I did have to have a separate object file with preprocessor defines which incorporated it into the executable.

Boost.Test also supports three levels of assertion. WARN is a diagnostic only; CHECK marks the test as failing but continues; and REQUIRE marks the test as failing and stops the test. There are also a wide variety of conditions ( EQUAL, NE, GT, …), each of which is supported for each level.

Here’s what the tests look like with Boost.Test:

This is much more terse than CppUnit, and seems promising. Here’s what happens when it runs:

Um. Whoops?

Boost.Test silently treats the char* pointers as though they were strings, and does a string comparison instead of a pointer comparison. Which is not what I asked for, and not what BOOST_CHECK_NE(x,y) will do with other pointer types.

Boost.Test also does not provide a mechanism for absolute difference in floating point comparison. Instead, it provides two relative solutions: BOOST_CHECK_CLOSE(v1,v2,pct) checks that v1 and v2 are no more than pct percent different (e.g. 10 would be 10% different), while BOOST_CHECK_CLOSE_FRACTION(v1,v2,frac) does the same thing but using fractions of a unit (e.g. 0.1 would be 10% different). Now, you can argue that there’s value in a relative error calculation. But to have two of them, and not have an absolute error check—that doesn’t work for me.

Boost.Test also has a few other issues. The released version has not been updated for four years, but the development version used internal to the Boost project has many changes, which are expected to be released at some point in the future. From comments on the boost developers mailing list the documentation is generally agreed to be difficult to use, and has produced a rewritten version (which, honestly, is what I had to use to try it out).

All in all, I don’t feel comfortable depending on Boost.Test.

Google Test

Google Test is another cross-platform unit test framework, which supports a companion mocking framework to support unit testing of capabilities that are not stand-alone.

The code comes with configure/make/install support, but also provides a single-file interface allowing it to be built easily within the project being tested with the same compiler and options as the code being tested. You do need a separate main routine, but it’s a two-liner to initialize the tests and run them all.

Google Test supports two levels of assertion: failure of an ASSERT aborts the test, while failure of EXPECT fails the test but continues to check additional conditions. It also provides a wide variety of conditions.

Here’s what the tests look like with Google Test:

Even more terse than Boost.Test, because it doesn’t use something like GTEST_TEST or GTEST_ASSERT_EQ. To avoid conflict with user code I normally expect framework tools to provide their interfaces within a namespace (literally for C++, or by using a standard identifier prefix where that wouldn’t work). Both CppUnit and Boost.Test do this for their macros, but for unit test code that doesn’t get incorporated into an application I think it’s ok that this isn’t done.

And here’s what you get when running it:

A little more verbose than I’m accustomed to from CppUnit, but it’s tolerable. The most important bit is the last line tells you the overall success, so you only need to scroll up if something didn’t work.


Summarizing the individual tests for each criterion, with a bold answer being preferable from my subjective perspective:

FeatureCppUnitBoost.TestGoogle Test
Handles size_t/int comparesnoyesyes
Handles char* comparesyesnoyes
Handles absolute float deltayesnoyes
Installationtoolchainheader-only or toolchainproject
Assertion Levelsonethreetwo
Assertion Conditionsfeweverymany

So I’m now happily using Google Test as the unit test framework for new C++ projects.

In fact, I’ve also started to use Google Mock, which turns out to be even more cool and eliminates the biggest limitation on unit testing: what to do if the routine being tested normally needs a heavy-weight and uncontrollable supporting infrastructure to satisfy its API needs. But I can’t really add anything beyond what you’ll can find on their wiki, so will leave it at that.

C++11 and integer rotate

About two months ago when I was starting to catch up on modern C++, I ran across John Regehr’s discussion of portable C rotate. From the initial code:

he evolves the solution to:

which generates optimal code on x86 and avoids all undefined behavior. See the original post for full details.

In C++ I’d like to generalize this to any type that supports shift operations. To do this requires understanding exactly where the original version risked undefined behavior, and where the final version does once it’s been generalized beyond uint32_t.

So here are the gotchas, with reference to the ISO/IEC 14882:2011(E) section and paragraph that discusses them.

  • Integral promotion (4.5) is performed on both shift operands (5.8#1)
  • Shift operations greater than or equal to the number of bits in the promoted left operand produce undefined behavior (section 5.8#1).  Hence the assert in the final version, and the trickery of -n&31, about which more later.
  • Shifts on signed types with negative values are undefined (5.8#2,3). Left shifts on signed types with non-negative values are undefined if the shifted value exceeds the maximum representable value in the unsigned version of the result type (colloquially, if a 1 bit is shifted out of the sign bit).
  • Integral promotion is performed on the operand to unary minus, and the result of the operation is different depending on whether the operand is unsigned (5.3.2#1).
  • Integral numbers might use a representation other than 2’s complement (3.9.1#7).

After all this is taken into account, one ends up with the following (see complete code in a test harness at this gist):

Some commentary:

  • Line 5 is a compile-time verification that the type is not a user-defined type, for which some of the other assumptions might not be valid.
  • Line 6 protects against rotation of signed values, which are known to risk undefined behavior.
  • Line 7 uses a standard-defined trait to find the number of bits in the representation of T.
  • Line 8 makes sure we’re not dealing with some weird type where an upcoming mask operation won’t produce the right answer (e.g., the MSPGCC uint20_t type).
  • Lines 9 and 10 use a bit mask to reduce the shift value to something for which it’s known the operation is defined; i.e. this function provides defined rotate behavior beyond what is mandated by C++ for shift.
  • Lines 11 and 12 deal with the possibility that the result of integral promotion of the (verified unsigned) type T might produce a signed type for which shift operations could produce undefined behavior.
  • Lines 13 and 14 implement the rotate now that all the preconditions have been validated.

And, of course, the template when instantiated for uint32_t produces the same optimal code as the original.

In meta-commentary, the addition of static_assert in C++11 is an awesome enhancement, which can be combined with std::enable_if for some neat template metaprogramming techniques that still produce comprehensible user diagnostics. The traits that provide implementation information on standard types are also a great enhancement for portable code. And the new using type alias capability makes things more readable than the equivalent typedef approach.

BTW: Somebody might suggest that the second argument be unsigned char b, since it’s reasonable to assume the shift count will be less than 256 for any integral type (though not necessarily for user-defined types). One reason not to do this is the classic argument that int is the native word size and there’s unlikely to be any benefit in using a smaller type. A second is more subtle and interesting:

  • Per 4.5#1, a prvalue of type unsigned char can promote to a prvalue of type int if representation preconditions are satisfied.
  • Per 5.3.1#8 the negation of an unsigned quantity is computed by subtracting its value from 2n where n is the number of bits in the promoted operand. The implication is that the negation of a signed quantity is computed by subtracting its value from zero.
  • While the representation of -1 in (for example) 16-bit 2’s complement is 0xFFFF, its representation in 16-bit 1’s complement is 0xFFFE and its representation in 16-bit sign-magnitude is 0x8001.

What this means is -mb&count_mask will not give you the right answer in a non-2’s-complement implementation if mb isn’t at least the same rank (4.13) as int. It also means that -mb does not produce the same value as 0-mb for all built-in integral types and processing environments.

Interesting stuff, IMO.