Wednesday 3 December 2014

When C++ templates outperform C

A colleague recently came to me with a problem. He's writing some interesting stuff on an Arduino board, which ships with an AVR microcontroller. On this sort of platform the header files provide #define-s along the lines of:

    #define PORTA (*(volatile uint8_t*)0x1234)
    #define PORTB (*(volatile uint8_t*)0x1235)

They're meant to be used to read and write device ports with simple code like this:

    PORTA = 0; /* clear device register */

Also, several devices may be connected to the same port if each needs only a few bits rather than a whole word. So an LED can be seen as a device on bit 7 of PORTA and turned on with something like:

    PORTA |= 0x80;
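
To try the same read-modify-write operations away from the board, here is a minimal sketch that uses an ordinary variable as a stand-in for the memory-mapped register. The names fake_porta, LED_MASK, led_on, led_off, led_is_on and led_demo are assumptions made up for illustration; they are not part of any AVR header.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for a memory-mapped register, so the sketch runs on a PC.
static volatile std::uint8_t fake_porta = 0;

constexpr std::uint8_t LED_MASK = 0x80;  // bit 7

// Set bit 7 and leave the other seven bits untouched.
inline void led_on() { fake_porta = static_cast<std::uint8_t>(fake_porta | LED_MASK); }

// Clear bit 7 and leave the other seven bits untouched.
inline void led_off() { fake_porta = static_cast<std::uint8_t>(fake_porta & ~LED_MASK); }

// Report whether bit 7 is currently set.
inline bool led_is_on() { return (fake_porta & LED_MASK) != 0; }

// Usage: a second device on bit 0 is not disturbed by the LED helpers.
inline void led_demo() {
    fake_porta = 0x01;
    led_on();
    assert(fake_porta == 0x81);
    led_off();
    assert(fake_porta == 0x01);
}
```

On the real board the variable would be replaced by the PORTA macro itself.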


His problem was something like this: I'd like to have a C++ template to use as a type to declare my device on PORTx and bit N, so I can just set and clear bits without worrying about bit-shifts. Something like:

    device<PORTA, 7> led;
    led.set(); // or alternatively led |= 1

However, I know that 

    PORTA |= 0x80;

translates to a single assembly instruction that sets just one bit, along the lines of

    sbi $0x1234, 7

I want the template to translate to exactly the same assembly code, so as not to lose performance.





It was a good idea. I liked it and started scribbling something down.

The first problem is that PORTA is defined as a pointer dereference, and you cannot just pass it to a class template. So the first attempt would be a template taking a uint8_t* and an int:

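    // First attempt: pass the port address as a pointer template argument.
    template<uint8_t* ADDRESS, int bit>
    struct device {
        void set() {
            *ADDRESS |= 1 << bit;
        }
    };

    int main() {
        device<&PORTA, 7> led; // rejected by the compiler
        led.set();
        return 0;
    }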

This didn't work. A pointer template argument must be the address of an object with static storage duration, whereas &PORTA is the address of a dereferenced integer cast, which is not allowed in a constant expression (and the parameter type is missing the volatile qualifier anyway). After several attempts I came up with an idea: use a lookup table of port addresses, then declare my device by means of the port ID (which is the index into the lookup table). The only caveat is making sure that the optimiser removes any access to the lookup table at run time.
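
Another route that is sometimes used, and not the one this post ends up taking, is to pass the raw address as an integer template parameter and do the cast inside the class. The sketch below assumes the address is available as a plain number, which the macros above do not give you directly; PORTA_ADDR, device_at and led_t are hypothetical names.

```cpp
#include <cstdint>

// Hypothetical raw address; the real header only exposes the dereferencing macro.
constexpr std::uintptr_t PORTA_ADDR = 0x1234;

template <std::uintptr_t ADDRESS, int BIT>
struct device_at {
    static_assert(BIT >= 0 && BIT < 8, "BIT must be in 0..7");

    static constexpr std::uintptr_t address = ADDRESS;
    static constexpr std::uint8_t mask = static_cast<std::uint8_t>(1u << BIT);

    // The integer-to-pointer cast sits in an ordinary expression inside the
    // function body, so it never has to be a constant expression.
    static volatile std::uint8_t& reg() {
        return *reinterpret_cast<volatile std::uint8_t*>(ADDRESS);
    }

    static void set() { reg() = static_cast<std::uint8_t>(reg() | mask); }
};

// An integer is a valid template argument, so this declaration compiles.
// Calling led_t::set() only makes sense on the target, where 0x1234 is a register.
using led_t = device_at<PORTA_ADDR, 7>;

static_assert(led_t::mask == 0x80, "bit 7 maps to mask 0x80");
```

Whether this produces the same single instruction depends on the compiler and its optimisation settings, so the generated assembly would still need checking.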

    // Lookup table of port addresses; the index is the port ID.
    volatile uint8_t* const port_table[] = {&PORTA, &PORTB};

    template<int PORT_ID>
    struct port {
        // const, so the optimiser can resolve the lookup at compile time
        volatile uint8_t* const address = port_table[PORT_ID];
        uint8_t operator |= (uint8_t v) { return *address |= v; }
    };

    using port_a = port<0>;
    using port_b = port<1>;

    // A device is one bit (BIT) on a given port (PORT).
    template<typename PORT, int BIT>
    struct device {
        PORT port;
        void set() { port |= 1 << BIT; }
    };

    int main() {
        device<port_a, 7> led;
        led.set();
        return 0;
    }

That worked nicely!
The final assembly code contained no lookup table, and the sbi instruction appeared as desired.
Note that the const keywords are required so that the optimiser can remove the lookup table.


This opens up the opportunity for an awful lot of cool features but, more importantly, for safer code.
I can in fact declare a new template called read_only_device which doesn't provide write methods like set(). Any attempt to write to that particular device would then fail at compile time instead of at run time!
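
A minimal sketch of what such a read-only template might look like, runnable on a PC. It assumes a simulated register and a port type with a static reg() accessor instead of the lookup table; fake_register, fake_port, get() and has_set are hypothetical names introduced only for this example.

```cpp
#include <cassert>
#include <cstdint>
#include <type_traits>
#include <utility>

// Hypothetical stand-in for a hardware register.
static volatile std::uint8_t fake_register = 0;

struct fake_port {
    static volatile std::uint8_t& reg() { return fake_register; }
};

// Writable device: offers both get() and set().
template <typename PORT, int BIT>
struct device {
    bool get() const { return ((PORT::reg() >> BIT) & 1) != 0; }
    void set() { PORT::reg() = static_cast<std::uint8_t>(PORT::reg() | (1u << BIT)); }
};

// Read-only device: there is simply no set() to call.
template <typename PORT, int BIT>
struct read_only_device {
    bool get() const { return ((PORT::reg() >> BIT) & 1) != 0; }
};

// Compile-time check for the presence of a set() member returning void.
template <typename T, typename = void>
struct has_set : std::false_type {};

template <typename T>
struct has_set<T, decltype(std::declval<T&>().set())> : std::true_type {};

static_assert(has_set<device<fake_port, 3>>::value, "device can be written");
static_assert(!has_set<read_only_device<fake_port, 3>>::value,
              "read_only_device cannot be written");
```

With these definitions, declaring read_only_device<fake_port, 3> button; and then writing button.set(); is rejected by the compiler, because the type has no such member.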
Another reason why this kind of code is safer is that there is just no way I can ever set the wrong bit on PORTA.
This is also true if my device occupies more than one bit, say bits 4-5-6, which I can use to put the device in one of 8 different modes. I could just write code like the following to set the device to mode 5, without manually doing any bit-masking or bit-shifting:

    device<port_a, 4, 3> device_mode;
    device_mode = 5;
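
The post does not show the three-parameter template, so here is one way it could be sketched, under the assumption that the second argument is the first bit and the third is the width in bits. To keep it runnable on a PC it uses a simulated register and a port type with a static reg() accessor; fake_porta, fake_port_a, mask(), get() and device_mode_demo are hypothetical names.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for PORTA.
static volatile std::uint8_t fake_porta = 0;

struct fake_port_a {
    static volatile std::uint8_t& reg() { return fake_porta; }
};

template <typename PORT, int FIRST_BIT, int WIDTH = 1>
struct device {
    static_assert(FIRST_BIT >= 0 && WIDTH >= 1 && FIRST_BIT + WIDTH <= 8,
                  "the bit field must fit inside an 8-bit port");

    // Mask covering WIDTH bits starting at FIRST_BIT, computed at compile time.
    static constexpr std::uint8_t mask() {
        return static_cast<std::uint8_t>(((1u << WIDTH) - 1u) << FIRST_BIT);
    }

    // Read the port, replace only this device's bits, write the result back.
    // Values too large for the field are truncated by the mask.
    device& operator=(std::uint8_t value) {
        const std::uint8_t old = PORT::reg();
        const std::uint8_t field =
            static_cast<std::uint8_t>((value << FIRST_BIT) & mask());
        PORT::reg() = static_cast<std::uint8_t>((old & ~mask()) | field);
        return *this;
    }

    // Current value of the field, shifted back down to bit 0.
    std::uint8_t get() const {
        return static_cast<std::uint8_t>((PORT::reg() & mask()) >> FIRST_BIT);
    }
};

// Usage: bits 0-3 and 7 belong to other devices and must survive the write.
inline void device_mode_demo() {
    fake_porta = 0x8F;
    device<fake_port_a, 4, 3> device_mode;
    device_mode = 5;
    assert(fake_porta == 0xDF);
    assert(device_mode.get() == 5);
}
```

The shift and the mask are fixed by the type, so the call site only ever mentions the mode number.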

Doing the same thing in C is just tedious and error-prone. Every time I want to change the mode of that device I have to read the current value from the port, mask out the bits I want to replace and finally OR my new 3 bits back in. There are plenty of ways this can go wrong. I could still define a helper function, but then I have to specify which bits to set every time, like:

    set_value_with_mask(PORTA, 5, 7 << 4);

The alternative is to write a specific function for bits 4-5-6:

    set_value_bits_456(PORTA, 5);

But then I have to write several functions, one for each group of bits.
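
For comparison, the mask-based helper mentioned above could be sketched roughly as follows. The post only shows the call, so the body is an assumption: it is written as a C++ function taking the register by reference, and it derives the shift from the lowest set bit of the mask. fake_porta and helper_demo are hypothetical names that let it run on a PC.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for PORTA.
static volatile std::uint8_t fake_porta = 0;

// Replace the bits selected by mask with value, leaving all other bits alone.
inline void set_value_with_mask(volatile std::uint8_t& port,
                                std::uint8_t value,
                                std::uint8_t mask) {
    if (mask == 0) return;  // nothing to write

    // Find the position of the lowest set bit in the mask.
    int shift = 0;
    while (((mask >> shift) & 1u) == 0) ++shift;

    const std::uint8_t current = port;  // read
    port = static_cast<std::uint8_t>(   // modify and write
        (current & ~mask) | ((value << shift) & mask));
}

// Usage: set bits 4-5-6 to mode 5 while keeping bits 0-3 and 7.
inline void helper_demo() {
    fake_porta = 0x8F;
    set_value_with_mask(fake_porta, 5, 7 << 4);
    assert(fake_porta == 0xDF);
}
```

Nothing stops a caller from passing 7 << 3 by mistake: it compiles without complaint and quietly writes to the wrong bits, which is exactly the kind of error the template version rules out.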


I admit it: I didn't get it right the first time, and it actually took a while, but I believe the end result is just astonishing.
