Avoid the Temptation of Bit Fields – Codango® / Codango.Com

Introduction

Every once in a while, I come across a blog post by somebody who has apparently just discovered bit fields and thinks they’re nifty because they let you pack things like Boolean flags into single bits. One example was along the lines of:

struct status {
  unsigned running : 1;
  unsigned paused  : 1;
  unsigned error   : 1;
};

That is, instead of using three whole bytes (24 bits) to store three Boolean flags, you can use just three bits!

In reality, the original 24 bits would likely have been padded up to 32 bits, and the 3 bits would be padded up to at least 8. Hence, at best, you would have saved 24 bits.
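To see what your own compiler does, a quick size check helps. The following is a minimal sketch; the struct name status_bytes and the helper print_status_sizes are made up for illustration, and the sizes it prints depend on the compiler and ABI. Many implementations allocate a whole unsigned for the bit-field version, so it can end up no smaller than the byte version.

```c
#include <assert.h>
#include <stdio.h>

/* Three flags stored as whole bytes (hypothetical name). */
struct status_bytes {
  unsigned char running;
  unsigned char paused;
  unsigned char error;
};

/* The same three flags as bit fields, as in the example above. */
struct status_bits {
  unsigned running : 1;
  unsigned paused  : 1;
  unsigned error   : 1;
};

/* Three one-byte members can never take fewer than three bytes. */
static_assert( sizeof( struct status_bytes ) >= 3,
               "three one-byte members need at least three bytes" );

/* Prints both sizes; the actual numbers are implementation-defined. */
static void print_status_sizes( void ) {
  printf( "bytes: %zu, bit fields: %zu\n",
          sizeof( struct status_bytes ), sizeof( struct status_bits ) );
}
```

Calling print_status_sizes() on the target you care about tells you how much, if anything, the bit fields actually save there.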

While that might seem like a pretty good memory savings, there’s no such thing as a free lunch. With only a very small number of exceptions, such uses of bit fields are both misguided and actually inefficient.

Performance

Since a byte is the smallest directly addressable unit on a computer, in order to access an individual bit or set of bits like:

void set_running( struct status *s ) {
  s->running = 1;
}

the compiler has to generate code equivalent to what you would have written by hand to set just one bit. For example, the ARMv8 code generated (annotated with C pseudocode) is:

ldrb    w8, [x0]     ; char w8 = *x0;
orr     w8, w8, #0x1 ; w8 |= 1;
strb    w8, [x0]     ; *x0 = w8;

That is:

Read the existing value from memory (slow).
Set the bit.
Write the updated value to memory (slow).

For a normal unsigned member, you do only step 3. Reading a bit field likewise needs extra work after the load to mask or shift out the bits. Hence the generated code for reading or writing bit fields is generally slower.
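In C terms, the three steps correspond roughly to the hand-written version below, shown next to the plain-member case for contrast. This is only a sketch: RUNNING_MASK, set_running_by_hand, and struct status_plain are hypothetical names, and the mask assumes the running flag lives in bit 0, which the compiler is not obliged to choose.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define RUNNING_MASK 0x01u  /* assumption: running is bit 0 */

/* Hand-written read-modify-write, equivalent in spirit to s->running = 1. */
static void set_running_by_hand( uint8_t *flags ) {
  assert( flags != NULL );
  uint8_t value = *flags;                   /* 1. read the existing byte */
  value = (uint8_t)(value | RUNNING_MASK);  /* 2. set the bit */
  *flags = value;                           /* 3. write the byte back */
}

/* The same flags as ordinary members: no bits shared, so no read needed. */
struct status_plain {
  unsigned running;
  unsigned paused;
  unsigned error;
};

static void set_running_plain( struct status_plain *s ) {
  assert( s != NULL );
  s->running = 1;                           /* a single store */
}
```

The read in step 1 exists only to preserve the neighboring bits; with one flag per member there are no neighbors to preserve.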

Other Caveats

In addition to the performance penalty, the following things are either unspecified or implementation-defined when it comes to bit fields:

Whether the order of the bits is left-to-right or right-to-left. For the above example, the bits could be in the order rpe where r (for running) is the most significant bit or epr where e (for error) is.
How the bytes containing the bit fields are aligned. For the above example, they could be rpeXXXXX or XXXXXrpe.
Whether a multi-bit bit field can straddle a word boundary.
Whether a plain int bit field is signed or unsigned. Ordinarily, int is always signed. As the type of a bit field, int becomes like char in that it’s implementation-defined whether it’s signed or unsigned.
Whether types other than int, signed int, unsigned int, _Bool, _BitInt(N), unsigned _BitInt(N), or _Atomic variants can be used as bit fields.

Hence, the use of bit fields is highly non-portable.
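A couple of these choices can be probed at run time. The sketch below is one way to do it, assuming only standard C; the helper names (first_set_bit, running_bit_index, plain_int_field_is_signed) and struct plain_int_field are invented for this example, and the results describe only the compiler and target you happen to run it on.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <string.h>

struct status {
  unsigned running : 1;
  unsigned paused  : 1;
  unsigned error   : 1;
};

/* Index of the first set bit in a byte array, counting from the least
   significant bit of bytes[0]; returns -1 if no bit is set. */
static int first_set_bit( unsigned char const *bytes, size_t n ) {
  assert( bytes != NULL );
  for ( size_t i = 0; i < n * CHAR_BIT; ++i )
    if ( bytes[ i / CHAR_BIT ] & (1u << (i % CHAR_BIT)) )
      return (int)i;
  return -1;
}

/* Where this implementation happens to store the running flag. */
static int running_bit_index( void ) {
  struct status s;
  unsigned char bytes[ sizeof s ];
  memset( &s, 0, sizeof s );
  s.running = 1;
  memcpy( bytes, &s, sizeof s );
  return first_set_bit( bytes, sizeof bytes );
}

struct plain_int_field { int value : 2; };

/* Returns 1 if a plain int bit field is signed here, 0 if unsigned. */
static int plain_int_field_is_signed( void ) {
  struct plain_int_field f;
  f.value = -1;        /* stays -1 if signed, wraps to 3 if unsigned */
  return f.value < 0;
}
```

Whatever these functions return on one platform tells you nothing about the next one, which is the whole problem.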

Appropriate Uses

Given that bit fields are slower and not portable, when is it a good idea to use bit fields?

If you really, really need the memory savings.
If you want code clarity and the performance is inconsequential.
If you need to deal with specific hardware that uses sub-byte fields.

For saving memory, if you have other, non-bit field members in a structure, you can also often save memory by sorting members descending by size to minimize padding.
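As an illustration of the padding effect, here is a small sketch with two hypothetical structures holding the same members in different orders. The member names and the helper bytes_saved_by_sorting are made up, and the exact sizes depend on the ABI; the static assertion states only what is expected on mainstream platforms.

```c
#include <assert.h>
#include <stddef.h>

/* Members in arbitrary order: padding is typically inserted after each char. */
struct unsorted {
  char   tag;
  double value;
  char   flag;
  int    count;
};

/* The same members sorted descending by size. */
struct sorted {
  double value;
  int    count;
  char   tag;
  char   flag;
};

/* Expected on mainstream ABIs; not guaranteed by the C standard. */
static_assert( sizeof( struct sorted ) <= sizeof( struct unsorted ),
               "sorting members by size should not increase padding" );

/* Number of bytes saved per object by reordering the members. */
static size_t bytes_saved_by_sorting( void ) {
  return sizeof( struct unsorted ) - sizeof( struct sorted );
}
```

On a typical 64-bit target the difference is several bytes per object, with no change to how the members are accessed.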

For code clarity, admittedly code like:

if ( status->error )

is simpler and thus clearer than something like:

if ( (status & ERROR_BIT) != 0 )
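If you go the mask route, small helper functions can recover most of that readability. The following is a sketch only: the bit assignments, the status_flags typedef, and the helper names are assumptions made for this example (only ERROR_BIT appears above, and its value is not given).

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical bit assignments for the three flags. */
enum {
  RUNNING_BIT     = 1 << 0,
  PAUSED_BIT      = 1 << 1,
  ERROR_BIT       = 1 << 2,
  STATUS_ALL_BITS = RUNNING_BIT | PAUSED_BIT | ERROR_BIT
};

typedef unsigned status_flags;  /* hypothetical type holding the flags */

/* Reads almost as clearly as status->error. */
static inline bool has_error( status_flags status ) {
  return (status & ERROR_BIT) != 0;
}

/* Returns status with the given flag(s) set. */
static inline status_flags set_flag( status_flags status, unsigned flag ) {
  assert( (flag & ~(unsigned)STATUS_ALL_BITS) == 0 );
  return status | flag;
}

/* Returns status with the given flag(s) cleared. */
static inline status_flags clear_flag( status_flags status, unsigned flag ) {
  assert( (flag & ~(unsigned)STATUS_ALL_BITS) == 0 );
  return status & ~flag;
}
```

With these, the test becomes if ( has_error( status ) ), and the position of every bit is stated explicitly in the source rather than left to the compiler.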

If you need to deal with specific hardware, you can use bit fields to map structures directly to the hardware, but you must ensure that the code your compiler generates is actually what you think it is. That is, you have to know exactly how your particular implementation defines its implementation-defined behavior.
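One way to catch a mismatch early is a layout check that runs once at startup. The sketch below assumes a made-up device whose status register keeps running in bit 0, paused in bit 1, and error in bit 2 of its first byte; struct status_reg and all of the function names are hypothetical. It checks the in-memory layout only, so inspecting the generated code is still up to you.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical register: bit 0 = running, bit 1 = paused, bit 2 = error. */
struct status_reg {
  unsigned running : 1;
  unsigned paused  : 1;
  unsigned error   : 1;
};

/* Returns the first byte of the register image as stored in memory. */
static uint8_t first_byte( struct status_reg const *reg ) {
  uint8_t byte;
  memcpy( &byte, reg, sizeof byte );
  return byte;
}

/* Returns 1 if this compiler places the fields where the hypothetical
   hardware expects them, 0 otherwise. */
static int layout_matches_hardware( void ) {
  struct status_reg reg;
  memset( &reg, 0, sizeof reg );
  reg.error = 1;
  if ( first_byte( &reg ) != 0x04 )
    return 0;
  reg.paused = 1;
  if ( first_byte( &reg ) != 0x06 )
    return 0;
  reg.running = 1;
  return first_byte( &reg ) == 0x07;
}

/* Call once at startup; aborts if the layout assumption does not hold. */
static void require_expected_layout( void ) {
  assert( layout_matches_hardware() );
}
```

A check like this fails loudly when the code is moved to a compiler or target that makes different choices, instead of silently poking the wrong bits.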

Conclusion

Unless you have a specific reason to use bit fields, don’t. Especially don’t just because you think they’re either efficient or nifty.
