Implicit Boolean Conversions in C++

Implicit conversion in C++, inherited from C, is a well-known source of subtle problems. I came across a place in the LibreOffice code base where a literal true was implicitly converted to an integer value, but where that silent adaptation of type apparently did not make any sense (“Flag bogus ‘true’ with a FIXME”). That made me wonder whether it would be beneficial to have a Clang plug-in that warns about such dubious implicit conversions.

Sure it would. I wrote a Clang plug-in, implicitboolconversion, that effectively warns about (almost) all implicit conversions from bool.

One case where it does not warn is if the conversion is for an argument of integer type in a call to an extern "C" function or for a return value of integer type in the definition of an extern "C" function, as int is often used as a replacement for a Boolean type in C.
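As a rough illustration of the two exempted extern "C" situations, here is a minimal sketch. The functions is_nonzero, pick, and pick_larger are made up for this example and are not LibreOffice or plug-in code.

```cpp
// Hypothetical C-style API: all names here are invented for this sketch.
extern "C" {

// A C header would declare this as `int is_nonzero(int value);`.
// The return statement converts bool to int inside the definition of an
// extern "C" function, one of the two cases described as not warned about.
int is_nonzero(int value) { return value != 0; }

// int stands in for a Boolean parameter, as is common in C.
int pick(int use_first, int first, int second) {
    return use_first ? first : second;
}

}

// C++ caller: passing a bool for the int parameter of an extern "C"
// function is the other exempted conversion.
int pick_larger(int a, int b) {
    bool const firstIsLarger = a > b;
    return pick(firstIsLarger, a, b);
}
```

The same conversions in a call to an ordinary C++ function with an int parameter would, going by the description above, be reported.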

Another, similar case is mixing bool with a handful of well-known C typedefs that represent Boolean values with integer types. In the LibreOffice code base that is most prominently sal_Bool, but there are also less frequently used ones like dbus_bool_t and gboolean. That is, while

  sal_Bool b = true;

technically involves an implicit conversion from bool to unsigned char (which sal_Bool is a typedef for), the plug-in will ignore that.

A third case the plug-in does not warn about is the common idiom of using &= or |= to accumulate a Boolean value, as in

  bool modified = false;
  for (int i = 0; i != N; ++i)
    modified |= updateElement(i);
  if (modified)
    notifyModification();

where both sides of the assignment need to be of Boolean type. (It also does not warn for such usages of ^=, but those are rare.) Similarly, it does not warn if both sides of ==, !=, <, <=, >, >= are of Boolean type, even though technically that involves implicit promotions to int and then a comparison of integer values.

It does warn about uses of bool in &, |, ^, though. An idiom that was surprisingly common across the LibreOffice code base was to check for inequality of two Boolean values via

  b1 ^ b2

which the plug-in now warns about, but which can just as well be written

  b1 != b2
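To make the difference between the two spellings concrete, here is a small sketch. The helper names differ_xor and differ_ne are invented for this illustration. Both forms give the same answer for bool operands, but ^ promotes its operands to int and yields an int that is then implicitly converted back to bool, while != yields a bool directly.

```cpp
#include <type_traits>

// Hypothetical helpers, named only for this illustration.
constexpr bool differ_xor(bool b1, bool b2) { return b1 ^ b2; }  // int result converted back to bool
constexpr bool differ_ne(bool b1, bool b2) { return b1 != b2; }  // result is already bool

// ^ promotes both bool operands to int and yields int ...
static_assert(std::is_same<decltype(true ^ false), int>::value,
              "bool ^ bool has type int");
// ... whereas != yields bool.
static_assert(std::is_same<decltype(true != false), bool>::value,
              "bool != bool has type bool");
```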

With the plug-in I found various errors like “String::Len was used in a non-bool context here” (where one use of String::Len() had erroneously been converted to !OUString::isEmpty() instead of OUString::getLength() among the myriad changes to rid us of the obsolete tools String class) and “Fix brace position” (where a trivially misplaced parenthesis,

  sal_Bool bSet = nSlot == (SID_TABLE_VERT_NONE && nAlign == text::VertOrientation::NONE) || ...

instead of

  sal_Bool bSet = (nSlot == SID_TABLE_VERT_NONE && nAlign == text::VertOrientation::NONE) || ...

had gone unnoticed thanks to the unhelpful rules of C++).
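Why the misplaced version compiles at all can be seen in a stripped-down sketch. The constants below are stand-ins with made-up values, not the real LibreOffice definitions. The parenthesized && expression has type bool, which is silently promoted to 0 or 1 and then compared against nSlot.

```cpp
// Hypothetical stand-ins for the identifiers in the snippet above;
// the values are made up for this sketch.
constexpr int SID_TABLE_VERT_NONE = 42;
constexpr int VERT_ORIENTATION_NONE = 0;

// Misplaced parenthesis: the && result (a bool) is promoted to int,
// so nSlot is compared against 0 or 1 rather than against the slot ID.
constexpr bool misplaced(int nSlot, int nAlign) {
    return nSlot == (SID_TABLE_VERT_NONE && nAlign == VERT_ORIENTATION_NONE);
}

// Intended grouping: compare the slot first, then the alignment.
constexpr bool intended(int nSlot, int nAlign) {
    return (nSlot == SID_TABLE_VERT_NONE && nAlign == VERT_ORIENTATION_NONE);
}
```

With these stand-in values, misplaced(42, 0) is false where intended(42, 0) is true, and misplaced(1, 0) is true even though slot 1 has nothing to do with the check.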

This inspired me to also write a second plug-in, literaltoboolconversion, that warns about certain implicit conversions in the other direction, namely from (integer, string, etc.) literals to bool.

This immediately turned up a great number of false positives: cases where “C macros” use literal 0 and 1 to represent Boolean values, as in

  #define X do { ... } while (0)

The heuristic I finally came up with is to not warn about uses of 0/1 and sal_False/sal_True (which are themselves macros that expand to integer literals) that are written in the body of a macro definition that appears in an include file whose name ends in “.h” (and is thus assumed to be a C include file, compared to our convention of naming C++ include files with “.hxx”). This, e.g., prevents warnings about the two literal zeros in

  #define OSL_VERIFY(c) do { if (!(c)) OSL_ASSERT(0); } while (0)

in osl/diagnose.h, but does not prevent warnings about bad macro arguments like

  OSL_VERIFY(0);

or

  OSL_VERIFY(sal_False);

that should instead be written

  OSL_VERIFY(false);

in C++ code.

(Another special case it exempts is the common idiom to use string literals in assert, like

  assert(!"this cannot happen");

to work around the shortcoming that assert does not take a message argument.)
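For readers unfamiliar with the idiom, here is a minimal sketch of how it is typically used. The function checked_divide is invented for this example.

```cpp
#include <cassert>

// Hypothetical function, used only to show the idiom.
int checked_divide(int dividend, int divisor) {
    if (divisor == 0) {
        // The string literal decays to a non-null pointer, which converts
        // to true; the negation makes the condition false, so the assertion
        // fires and the message text shows up in the diagnostic.
        assert(!"divisor must not be zero");
        return 0; // reached only when NDEBUG disables assert
    }
    return dividend / divisor;
}
```

The conversion from string literal to bool is exactly the kind of thing literaltoboolconversion looks for, which is why this pattern needs its own exemption.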

Now, there were of course many places that, for historic reasons, did use sal_False instead of false, etc., but nothing is easier than using Clang to automatically rewrite such code.

And with all the trivial (mis-)uses of 0/1 and sal_False/sal_True out of the way, the interesting cases started to stick out. Cases like “Presumably ‘eType ==’ is missing here” (where an enum value, converting to a compile-time true, was used directly as an argument to ||, instead of using it in a comparison against eType) or “Apparently broken bitmask creation” (where a logical || should rather have been a bitmask |).
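Both kinds of mistake are easy to reproduce in isolation. The following is only a sketch with made-up enum names and values, not the actual LibreOffice code from those commits.

```cpp
// Hypothetical enums, made up for this sketch.
enum Kind { KIND_NONE = 0, KIND_TEXT = 1, KIND_IMAGE = 2 };
enum Flags { FLAG_READ = 1, FLAG_WRITE = 4 };

// Missing "eType ==": KIND_IMAGE is nonzero, so the right operand of ||
// is always true and the whole expression is always true.
constexpr bool is_text_or_image_buggy(Kind eType) {
    return eType == KIND_TEXT || KIND_IMAGE;
}
constexpr bool is_text_or_image(Kind eType) {
    return eType == KIND_TEXT || eType == KIND_IMAGE;
}

// Logical || collapses the flags to true, which converts to 1;
// bitwise | combines the bits as intended.
constexpr int buggy_mask() { return FLAG_READ || FLAG_WRITE; }
constexpr int intended_mask() { return FLAG_READ | FLAG_WRITE; }
```

Here buggy_mask() yields 1 and intended_mask() yields 5, and is_text_or_image_buggy(KIND_NONE) is true although KIND_NONE is neither of the two kinds.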
