Freaky Assembly?

After a looong time, I was debugging some embedded C code and thought I found something freaky:

C code

for (i = 0; i < 1000000; i++);

ARM code disassembly (as generated by GNU ARM gcc)

0x0000019c <main+196>: mov r3, #0 ; 0x0
0x000001a0 <main+200>: str r3, [r11, #-16]
0x000001a4 <main+204>: b 0x1b4 <main+220>
0x000001a8 <main+208>: ldr r3, [r11, #-16]
0x000001ac <main+212>: add r3, r3, #1 ; 0x1
0x000001b0 <main+216>: str r3, [r11, #-16]
0x000001b4 <main+220>: ldr r2, [r11, #-16]
0x000001b8 <main+224>: mov r3, #999424 ; 0xf4000
0x000001bc <main+228>: add r3, r3, #572 ; 0x23c
0x000001c0 <main+232>: add r3, r3, #3 ; 0x3

0x000001c4 <main+236>: cmp r2, r3
0x000001c8 <main+240>: bls 0x1a8 <main+208>

The three highlighted lines above in effect initialize r3 with 999999: first initializes r3 with 999424, then adds 572 to it, then adds 3 to it.

What puzzled me was why couldn’t it do that directly (mov r3, #999999)?

After some scratching my head and plowing through the ARM book: ARM instructions are 32-bit — of which Operand 2 can be only 12-bits. In addition (from the ARM book):

– Of these 12 bits, 8-bits are for data, and 4-bits are used for ROR.
– The ROR bits are in turn multiplied by 2 before being applied on the 8-bits.

The combination of ROR and shifting by 2 greatly extends the range. The assembler automatically does it for you if it sees an operand greater than 8-bits.

This can be a great (but wicked) interview question (I’d never do that to anyone ;-)).

Do verify, here’s the math…

999424 + 572 + 3 is the closest tuples you can get to add up to 999999 using the 12-bit ROR with x2 multiplier for the RoR.

Just for verification, here are the instructions from memory:

1b8: 3D39A0E3 ; 0xE3A0393D
1bc: 8F3F83E2 ; 0xE2833F8F
1c0: 033083E2 ; 0xE2833003

To get 999424 (0x0F4000):
0x0000003D ROR 18 (0x9 x 2) = 0x000F4000 (ROR 18 = LSL 6)
As confirmed by the instruction: E3A03 93D

To get 572 (0x023C):
0x0000008F ROR 30 (0xF x 2) = 0x0000023C (ROR 30 = LSL 2)
As confirmed by the instruction: E2833 F8F

To get 3 (0x0003):
0x00000003 ROR 00 (0x0 x 2) = 0x00000003 (ROR 00 = LSL 0)
As confirmed by the instruction: E2833 003

Note: the LSL is just for convenience, it’s good only if data has all zeros padded on the left (at least enough to cover the LSL).

Advertisements

2 thoughts on “Freaky Assembly?

  1. alex says:

    Ha! Good find and indeed a nice interview question…

    Maybe another one would be ‘what are other optimal ways of loading over N-bit numbers into N-bit register’…

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s