Update: After some testing we noticed that my method wrongly assumes that all SDK bitmap rows are aligned to 32 bits (like the framebuffer), which is apparently not true. So whenever rowbytes is not a multiple of 4, it won't work as is. It just so happened that the width of Dave's example sprite IS aligned to 32 bits, so I didn't notice while putting main.c together.
I originally wrote the bitmap drawing for my hand-rolled engine and texture format, where I made sure to align all rows accordingly, so I never stumbled upon this issue earlier.
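For anyone who wants to guard against this in the meantime, here is a rough sketch of one possible workaround, not what main.c currently does: check whether rowbytes is a multiple of 4, and if not, copy the rows into a buffer with a padded stride. The helper names (`rowbytes_is_word_aligned`, `padded_rowbytes`, `copy_rows_padded`) are made up for illustration, and zero-filling the padding is an arbitrary choice. It also assumes the destination buffer itself starts on a 4-byte boundary, which the caller has to ensure.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Nonzero when each row's length is a multiple of 4 bytes (32 bits). */
static int rowbytes_is_word_aligned(int rowbytes)
{
    return (rowbytes % 4) == 0;
}

/* Round rowbytes up to the next multiple of 4. */
static int padded_rowbytes(int rowbytes)
{
    return (rowbytes + 3) & ~3;
}

/* Copy `height` rows from src (stride: rowbytes) into dst
 * (stride: padded_rowbytes(rowbytes)), zero-filling the padding bytes.
 * dst must hold at least height * padded_rowbytes(rowbytes) bytes. */
static void copy_rows_padded(const uint8_t *src, int rowbytes, int height,
                             uint8_t *dst)
{
    assert(src != NULL && dst != NULL && rowbytes > 0 && height >= 0);

    const int dst_stride = padded_rowbytes(rowbytes);
    for (int y = 0; y < height; ++y) {
        uint8_t *dst_row = dst + (size_t)y * (size_t)dst_stride;
        const uint8_t *src_row = src + (size_t)y * (size_t)rowbytes;

        memcpy(dst_row, src_row, (size_t)rowbytes);
        memset(dst_row + rowbytes, 0, (size_t)(dst_stride - rowbytes));
    }
}
```

The idea would be to call `rowbytes_is_word_aligned()` on the rowbytes value the SDK reports for a bitmap and only take the copy path when it returns 0, so sprites that are already aligned cost nothing extra.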