Aaron Ardiri - IoT Blog: Long live Assembly - cannot beat it for performance.




2014-08-30 >> LONG LIVE ASSEMBLER - CANNOT BEAT IT FOR PERFORMANCE

001100010010011110100001101101110011*

I have always wondered if programmers like myself are a dying breed, given the focus on high level languages such as Python, Java, C# and the variants that have been written within their runtime environments - but what about good old C and low level assembly programming?

Thankfully, with the surge of IoT and the use of low powered, resource-constrained microcontrollers, there is a great opportunity for a number of us to enjoy programming the good old fashioned way - to the point where even C can be considered too high level.

Consider the following piece of C to do a shift operation `>>` on a 256-bit byte stream:

    unsigned char *p, val, carry;
    int i;

    p = (unsigned char *)x;
    carry = 0;
    for (i=0; i<32; i++) // 32 * 8 = 256 bit
    {
      val = p[i];
      p[i] = ((val >> 1) & 0x7f) | carry;
      if (val & 1) carry = 0x80;
      else         carry = 0x00;
    }

Pretty standard stuff: go through every byte and shift it right by one bit, apply the carry from the previous byte if applicable, and check whether the least significant bit was set to define the carry status for the next byte.

It is completely normal to assume that when targeting an 8-bit Atmel based device such as the Arduino, the compiler would produce 8-bit code, as that would be the most optimal. Unfortunately, this isn't the case, as you can quite easily see by using the `-S` compile flag and reviewing the assembly produced. The C programming language requires that an `int` data type be at least 16 bits in size, and operands narrower than `int` are promoted to `int` before arithmetic and bitwise operations - so the shift and mask above are done as if they were on 16-bit integers. There are cases where you can force `gcc` to use an 8-bit `int` with the `-mint8` compile option, but it is documented that if you do, you are on your own and use it at your own risk.
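The promotion rule described above can be observed directly from C. The following is a minimal sketch and not part of the original code; the helper names `shifted_operand_size` and `promoted_shift_left` are made up for illustration.

```c
#include <stddef.h>

/* Hypothetical helper: size in bytes of the expression `val >> 1`.
   val is an unsigned char, but it is promoted to int before the
   shift, so the result has type int. */
size_t shifted_operand_size(void)
{
    unsigned char val = 0x81;
    return sizeof(val >> 1);
}

/* Hypothetical helper: the promotion is also visible in the value.
   An 8-bit shift would drop the top bit and give 0x02; because the
   shift is computed on an int, the result is 0x102. */
int promoted_shift_left(void)
{
    unsigned char val = 0x81;
    return val << 1;
}
```

Whether the compiler then narrows the operation back to 8 bits is an optimization decision, which is why reviewing the `-S` output is still the way to see what actually runs on the device.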
In contrast, here is the same shift operation in AVR assembly:

    movw  r26, r24            ; load x within X (r27:r26)
    ldi   r18, 32             ; i = 32
    clc                       ; clear carry bit
    BigInt_shiftRight_loop:
    ld    r23, X              ; load r23 with value at X
    ror   r23                 ; rotate right r23, with carry
    st    X+, r23             ; store r23 in X, move to next byte
    dec   r18                 ; loop until r18 is zero
    brne  BigInt_shiftRight_loop

The resulting difference? For starters, the code is much smaller and, as a result, much faster. The `ror` instruction does all the work of handling the carry logic for us - and since the `dec` instruction doesn't affect the carry flag, the carry survives from one iteration to the next and the code is kept to a pure minimum. When I updated most of my low level "BigInt" functions to assembler, I saw a speed-up of almost two-fold.

The bottom line is that when you are dealing with 8-bit 16 MHz microcontrollers and there is very limited space to actually write your applications, you simply need to consider assembly level programming or your code just won't cut it if you intend to do CPU intensive operations.

On the other hand, this applies mainly to 8-bit processors: the compiler optimization in `gcc` for larger processors is quite impressive, as it has been in active development for almost every processor made, dating back to its origin on 22 March 1987.

* Futurama fans will get the first sentence in this blog post (ref: urbandictionary.com).
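One way to check a hand-written routine like the one above is to compare it against a reference model on a desktop machine. The following is a minimal sketch under that assumption, not the original "BigInt" code: the function name `bigint_shift_right_ref` and the constant `BIGINT_BYTES` are made up for illustration, and the carry flag is modelled as an ordinary 0/1 variable.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define BIGINT_BYTES 32   /* 32 * 8 = 256 bit */

/* Hypothetical reference model of the assembly loop.
   x[0] is treated as the most significant byte, matching the order
   in which the loop walks the buffer. */
void bigint_shift_right_ref(uint8_t *x)
{
    uint8_t carry = 0;                    /* clc */
    size_t i;

    assert(x != NULL);
    for (i = 0; i < BIGINT_BYTES; i++) {
        uint8_t val = x[i];               /* ld  r23, X */
        uint8_t out = val & 1u;           /* bit that ror moves into carry */

        /* ror r23 ; st X+, r23: old carry enters at bit 7 */
        x[i] = (uint8_t)((val >> 1) | (carry << 7));
        carry = out;
    }
}
```

Feeding the same 32-byte buffers to this model and to the device routine, then comparing the results byte by byte, gives a quick sanity check before trusting the assembly in CPU intensive code.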

DISCLAIMER:
All content provided on this blog is for informational purposes only.
All comments are generated by users and moderated for inappropriateness periodically.
The owner will not be liable for any losses, injuries, or damages from the display or use of this information.