Bitwise operations or how I learned to stop worrying and love my compiler

Whilst writing up some of the forthcoming posts on the alarm system, I found myself trying to write a simple method of setting, clearing, and getting bits in a byte buffer. I wanted this to be as transparent as possible so that the code was easy to understand.

The AVR microcontrollers are 8-bit. This means that registers, memory, and addresses are all 8-bits wide. It surprises many Arduino users when they find out that the type “boolean” is simply an entire unsigned, 8-bit integer (uint8_t). Boolean is just not a native concept to C and embedded systems. It’s incredibly wasteful to use a whole 8 bits to represent a single boolean value.

Bitwise operations

This isn’t to say that you don’t need to set individual bits in a byte though – this is actually a very common operation. Normally to do this, you use a couple of simple constructs.

To set the 6th bit in a register (0 is the first bit):

REGISTER |= 1 << 5;

This means "Shift 0b00000001 along 5 to 0b00010000, and perform a logical OR with whatever is in the register already". This has the effect of setting bit 5 to 1, and leaving all the other bits as are.

To clear the 5th bit in a register:

REGISTER &= ~(1 << 4);

This means "Shift 0b00000001 along 4 to 0b00001000, invert it to 0b11110111, and perform a logical AND with whatever is in the register already". This has the effect of clearing bit 4, and leaving all the other bits as are.

Bytewise operations

The above is all fine if you just want to look at a single bit in a byte. What happens if I want to deal with more than 8-bits? I need more bytes!

By far the easiest way to do this is to use an array:

uint8_t array_of_bytes[32];

Which just means "create an array of 32 unsigned 8-bit integers".

I can access the first byte using:

array_of_bytes[0];

And the tenth by using:

array_of_bytes[9];

Really simple! I can combine this with the bitwise operations:

array_of_bytes[13] |= 1 << 3;
array_of_bytes[4] &= ~(1 << 6);

Now I have an easy way of accessing any bit in an array of bytes.

Bringing it together

The problem comes if I decide I want to set the bit 27 in the array. How do I work out which byte this is in?

This is just some simple maths. To work out the byte, integer divide 27 by 8 - we get 3. This means it will be in the 4th byte i.e.

array_of_bytes[3]; // indexing starts at 0!

Then take the remainder of 27 by 8 (the modulus) - we get 3 again. We need to set the 4th bit.

array_of_bytes[3] |= (1 << 3);

So now we have a method to set an individual bit in an array of bytes.

Making it easier

Rather than remember all of this maths and shifting each time I want to set a bit, why not use a macro?

#define bit_set(array,i)   array[i/8] |=  1 << (i % 8)

This is just want we have said above. % is the modulus operator in C. This looks very clean and efficient.

That is until I use it in a project and find it is incredibly slow to set each bit in a buffer one by one. Why could this be? (I'll go into benchmarking code another time)

If you look through the ATmega328P or ATmega2560 datasheets, you can see two key things:

  1. There is no hardware division unit on the microcontroller - division is expensive!
  2. There is only a single bit shift in either direction. Shifts of more than one need to be made up of multiple single bit shifts.

There's something clever we could try here though. Division by 8 is exactly the same as shifting right by 3 bits:

0b00011011 (dec 27)
shift 3 bits right
0b00000011 (dec 3)

So now we can change our macro a bit:

#define bit_set(array,i)   array[i >> 3] |=  1  (i % 8)

We can take this further though... modulus 8 is the same as only keeping the first 3 bits of a number i.e. masking.

0b00011011 (dec 27)
mask with 0b00000111
0b00000011 (dec 3)

So the macro can be changed even further:

#define bit_set(array,i)   array[i >> 3] |=  1 << (i & 0x07)

And now we recompile and try again...

And get exactly the same result as before.

It turns out avrgcc is clever enough to spot that the divide by 8 and modulus 8 can be implemented by simple shifts and masks.

What is the moral of this story? Generally these days you will not make many gains in micro-optimisation. The compiler handles all of this, and normally much better than you can. Think about the big picture to make gains!

4 thoughts on “Bitwise operations or how I learned to stop worrying and love my compiler

  1. Permalink  ⋅ Reply

    @ndy

    April 29, 2013 at 2:56pm

    It’s also dangerous to blindly “optimise” “Divide-by-a-power-of-two is a right shift”. It’s not always true. There are semantic issues around which way signed numbers are rounded. The compiler is in the best positon to know when it’s a valid optimisation! 🙂

    • Permalink  ⋅ Reply

      cybergibbons

      April 29, 2013 at 3:54pm

      Yep, indeed… I’ve actually seen code that has been optimised in such a way that the condition to decide if the optimisation is valid completely undoes any benefits the optimisation could bring.

      You also get the one person who once found an actual compiler bug. That bug is burned into their memory and now compiler errors cause 75% of bugs in their opinion… and they make sure their whole team goes by this rule.

      • Permalink  ⋅ Reply

        @ndy

        April 29, 2013 at 6:28pm

        Good, thanks. How are you? Do you still live in the Norf? I’m still in London.

Leave a Reply to cybergibbons Cancel reply

Your email will not be published. Name and Email fields are required.

This site uses Akismet to reduce spam. Learn how your comment data is processed.