C++11 and integer rotate

About two months ago, when I was starting to catch up on modern C++, I ran across John Regehr’s discussion of portable C rotate. Starting from a naive implementation, he evolves a solution that generates optimal code on x86 and avoids all undefined behavior. See the original post for full details.
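
For readers without that post open, the two ends of that evolution look roughly like the following. This is a sketch reconstructed from the description here (the assert and the -n&31 mask), assuming a 32-bit uint32_t; the names rotl32_naive and rotl32_safe are placeholders, not necessarily the names used in the original post.

```cpp
#include <cassert>
#include <cstdint>

// Naive form: when n == 0 the right shift count is 32, which is
// undefined behavior for a 32-bit left operand.
inline std::uint32_t rotl32_naive(std::uint32_t x, std::uint32_t n)
{
  return (x << n) | (x >> (32 - n));
}

// Masked form: for unsigned n, (-n & 31) equals (32 - n) % 32, so both
// shift counts stay within [0, 31] whenever n < 32.
inline std::uint32_t rotl32_safe(std::uint32_t x, std::uint32_t n)
{
  assert(n < 32);
  return (x << n) | (x >> (-n & 31));
}
```

With the masked form a rotate by zero is well defined and simply returns x, which is the case the naive form gets wrong.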

In C++ I’d like to generalize this to any type that supports shift operations. Doing so requires understanding exactly where the original version risked undefined behavior, and where the final version does once it’s been generalized beyond uint32_t.

So here are the gotchas, with reference to the ISO/IEC 14882:2011(E) section and paragraph that discusses them.

  • Integral promotion (4.5) is performed on both shift operands (5.8#1)
  • Shift counts that are negative, or greater than or equal to the number of bits in the promoted left operand, produce undefined behavior (5.8#1). Hence the assert in the final version, and the trickery of -n&31, about which more later.
  • Left shifts of negative signed values are undefined (5.8#2), while right shifts of negative signed values produce an implementation-defined result (5.8#3). Left shifts on signed types with non-negative values are undefined if the shifted value exceeds the maximum representable value in the unsigned version of the result type (colloquially, if a 1 bit is shifted out of the sign bit).
  • Integral promotion is performed on the operand of unary minus, and the result of the operation differs depending on whether the promoted operand is unsigned (5.3.1#8).
  • Integral numbers might use a representation other than 2’s complement (3.9.1#7).
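
The promotion rules in this list can be checked at compile time. The following is a minimal sketch, assuming an implementation that provides uint8_t (in which case it necessarily promotes to int); none of it comes from the original post.

```cpp
#include <cstdint>
#include <limits>
#include <type_traits>

// Both shift operands are promoted, so shifting an 8-bit unsigned value
// is really a shift on int, and the result type is int.
static_assert(std::is_same<decltype(std::uint8_t{1} << 1), int>::value,
              "uint8_t is promoted to int before the shift");

// Unary minus promotes first as well: negating a uint8_t yields a signed int...
static_assert(std::is_same<decltype(-std::uint8_t{1}), int>::value,
              "negated uint8_t is an int");

// ...whereas negating an unsigned int stays unsigned and wraps modulo 2^n.
static_assert(std::is_same<decltype(-1u), unsigned int>::value,
              "negated unsigned int is still unsigned");
static_assert(-1u == std::numeric_limits<unsigned int>::max(),
              "-1u is 2^n - 1");
```

The practical consequence is that a rotate written for a narrow unsigned type silently operates on a signed int unless the code forces an unsigned type after promotion.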

After all this is taken into account, one ends up with the following (see complete code in a test harness at this gist):

Some commentary:

  • Line 5 is a compile-time verification that the type is not a user-defined type, for which some of the other assumptions might not be valid.
  • Line 6 protects against rotation of signed values, which are known to risk undefined behavior.
  • Line 7 uses a standard-defined trait to find the number of bits in the representation of T.
  • Line 8 makes sure we’re not dealing with some weird type where an upcoming mask operation won’t produce the right answer (e.g., the MSPGCC uint20_t type).
  • Lines 9 and 10 use a bit mask to reduce the shift value to something for which it’s known the operation is defined; i.e. this function provides defined rotate behavior beyond what is mandated by C++ for shift.
  • Lines 11 and 12 deal with the possibility that the result of integral promotion of the (verified unsigned) type T might produce a signed type for which shift operations could produce undefined behavior.
  • Lines 13 and 14 implement the rotate now that all the preconditions have been validated.
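
As a rough illustration of how those steps fit together, here is a sketch written to follow the commentary above. It is not the listing from the gist: the names rotl, num_bits, promoted_type and unsigned_promoted_type are assumptions, only mb and count_mask appear in this post, and the line numbers in the commentary refer to the original listing rather than to this one.

```cpp
#include <cstdint>   // only needed for the usage examples
#include <limits>
#include <type_traits>

template <typename T>
T rotl(T v, unsigned int b)
{
  // Built-in integral types only; user-defined types may break the assumptions.
  static_assert(std::is_integral<T>::value, "rotate requires a built-in integral type");
  // Signed operands risk undefined or implementation-defined shifts.
  static_assert(std::is_unsigned<T>::value, "rotate requires an unsigned type");
  // Number of value bits in T.
  constexpr unsigned int num_bits = std::numeric_limits<T>::digits;
  // The mask trick only works when the bit count is a power of two.
  static_assert(0 == (num_bits & (num_bits - 1)), "bit width must be a power of two");
  // Reduce the count to [0, num_bits - 1] so every shift below is defined.
  constexpr unsigned int count_mask = num_bits - 1;
  const unsigned int mb = b & count_mask;
  // Promote as a shift would, then force the promoted type to be unsigned.
  using promoted_type = decltype(+v);
  using unsigned_promoted_type = typename std::make_unsigned<promoted_type>::type;
  const unsigned_promoted_type pv = v;
  // mb is unsigned int, so -mb is well defined modulo 2^n.
  return static_cast<T>((pv << mb) | (pv >> (-mb & count_mask)));
}
```

For example, rotl<std::uint8_t>(0x81, 1) gives 0x03, and rotl<std::uint32_t>(0x80000001, 33) gives the same result as a rotate by 1 because the count is masked.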

And, of course, the template when instantiated for uint32_t produces the same optimal code as the original.

In meta-commentary, the addition of static_assert in C++11 is an awesome enhancement, which can be combined with std::enable_if for some neat template metaprogramming techniques that still produce comprehensible user diagnostics. The traits that provide implementation information on standard types are also a great enhancement for portable code. And the new using type alias capability makes things more readable than the equivalent typedef approach.
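
To make the std::enable_if remark concrete, here is a small sketch of the alternative to static_assert: removing a template from overload resolution instead of failing with a message. The names enable_if_unsigned_integral and can_rotate are hypothetical and exist only for this illustration.

```cpp
#include <type_traits>

// Alias template (the C++11 `using` form) that names the condition once.
template <typename T>
using enable_if_unsigned_integral =
  typename std::enable_if<std::is_integral<T>::value
                          && std::is_unsigned<T>::value, int>::type;

// Participates in overload resolution only for unsigned integral T.
template <typename T, enable_if_unsigned_integral<T> = 0>
bool can_rotate(T) { return true; }

// Fallback chosen for every other argument type.
inline bool can_rotate(...) { return false; }
```

Here can_rotate(1u) selects the template while can_rotate(1) falls back to the variadic overload. A static_assert is the better fit when there is no sensible fallback and a clear diagnostic is the goal, which is the situation in the rotate template.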

BTW: Somebody might suggest that the second argument be unsigned char b, since it’s reasonable to assume the shift count will be less than 256 for any integral type (though not necessarily for user-defined types). One reason not to do this is the classic argument that int is the native word size and there’s unlikely to be any benefit in using a smaller type. A second is more subtle and interesting:

  • Per 4.5#1, a prvalue of type unsigned char can promote to a prvalue of type int if representation preconditions are satisfied.
  • Per 5.3.1#8 the negation of an unsigned quantity is computed by subtracting its value from 2^n, where n is the number of bits in the promoted operand. The implication is that the negation of a signed quantity is computed by subtracting its value from zero.
  • While the representation of -1 in (for example) 16-bit 2’s complement is 0xFFFF, its representation in 16-bit 1’s complement is 0xFFFE and its representation in 16-bit sign-magnitude is 0x8001.

What this means is -mb&count_mask will not give you the right answer in a non-2’s-complement implementation if mb isn’t at least the same rank (4.13) as int. It also means that -mb does not produce the same value as 0-mb for all built-in integral types and processing environments.
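
The arithmetic can be worked through without access to a non-2’s-complement machine by building the 16-bit patterns by hand. This is a sketch under that assumption; the helper names are invented for the example, and the mask 31 corresponds to rotating a 32-bit value by 1, where the complementary shift count should be 31.

```cpp
#include <cstdint>

// 16-bit pattern that represents -m under each signed representation.
constexpr std::uint16_t twos_complement(std::uint16_t m)
{
  return static_cast<std::uint16_t>(0u - m);
}
constexpr std::uint16_t ones_complement(std::uint16_t m)
{
  return static_cast<std::uint16_t>(~m);
}
constexpr std::uint16_t sign_magnitude(std::uint16_t m)
{
  return static_cast<std::uint16_t>(0x8000u | m);
}

// The patterns for -1 quoted in the list above.
static_assert(twos_complement(1) == 0xFFFF, "2's complement -1");
static_assert(ones_complement(1) == 0xFFFE, "1's complement -1");
static_assert(sign_magnitude(1) == 0x8001, "sign-magnitude -1");

// Masking a signed -1 with 31 gives the wanted 31 only in 2's complement.
static_assert((twos_complement(1) & 31) == 31, "correct count");
static_assert((ones_complement(1) & 31) == 30, "off by one");
static_assert((sign_magnitude(1) & 31) == 1, "wrong count");

// An unsigned negation is defined as 2^n - value, whatever the representation.
static_assert((-1u & 31u) == 31u, "unsigned negation is always safe");
```

This is why keeping the count as unsigned int matters: the negation then happens in unsigned arithmetic, and the mask sees the same bits on every conforming implementation.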

Interesting stuff, IMO.
