After a looong time, I was debugging some embedded C code and thought I found something freaky:
for (i = 0; i < 1000000; i++);
ARM code disassembly (as generated by GNU ARM gcc)
0x0000019c <main+196>: mov r3, #0 ; 0x0
0x000001a0 <main+200>: str r3, [r11, #-16]
0x000001a4 <main+204>: b 0x1b4 <main+220>
0x000001a8 <main+208>: ldr r3, [r11, #-16]
0x000001ac <main+212>: add r3, r3, #1 ; 0x1
0x000001b0 <main+216>: str r3, [r11, #-16]
0x000001b4 <main+220>: ldr r2, [r11, #-16]
0x000001b8 <main+224>: mov r3, #999424 ; 0xf4000
0x000001bc <main+228>: add r3, r3, #572 ; 0x23c
0x000001c0 <main+232>: add r3, r3, #3 ; 0x3
0x000001c4 <main+236>: cmp r2, r3
0x000001c8 <main+240>: bls 0x1a8 <main+208>
The three highlighted lines above in effect initialize r3 with 999999: first initializes r3 with 999424, then adds 572 to it, then adds 3 to it.
What puzzled me was why couldn’t it do that directly (mov r3, #999999)?
After some scratching my head and plowing through the ARM book: ARM instructions are 32-bit — of which Operand 2 can be only 12-bits. In addition (from the ARM book):
– Of these 12 bits, 8-bits are for data, and 4-bits are used for ROR.
– The ROR bits are in turn multiplied by 2 before being applied on the 8-bits.
The combination of ROR and shifting by 2 greatly extends the range. The assembler automatically does it for you if it sees an operand greater than 8-bits.
This can be a great (but wicked) interview question (I’d never do that to anyone ;-)).
Do verify, here’s the math…
999424 + 572 + 3 is the closest tuples you can get to add up to 999999 using the 12-bit ROR with x2 multiplier for the RoR.
Just for verification, here are the instructions from memory:
1b8: 3D39A0E3 ; 0xE3A0393D
1bc: 8F3F83E2 ; 0xE2833F8F
1c0: 033083E2 ; 0xE2833003
To get 999424 (0x0F4000):
0x0000003D ROR 18 (0x9 x 2) = 0x000F4000 (ROR 18 = LSL 6)
As confirmed by the instruction: E3A03 93D
To get 572 (0x023C):
0x0000008F ROR 30 (0xF x 2) = 0x0000023C (ROR 30 = LSL 2)
As confirmed by the instruction: E2833 F8F
To get 3 (0x0003):
0x00000003 ROR 00 (0x0 x 2) = 0x00000003 (ROR 00 = LSL 0)
As confirmed by the instruction: E2833 003
Note: the LSL is just for convenience, it’s good only if data has all zeros padded on the left (at least enough to cover the LSL).