Balancing Robot

Botka, The Barely Standing Robot. This is one impressive balancing robot. Not even a bit of jitter. Midway through the video the thing takes some solid whacks and is still standing. For comparison: NXTway-G (the Lego Mindstorms NXT uses the Atmel AT91SAM7S ARM processor).

Botka probably uses some sophisticated PID control, perhaps fuzzy-logic enhanced, or a Kalman filter, given its amazing response even in motion. I remember doing Kalman filters way back in my graduate courses: pretty hairy mathematics, but really cool nevertheless once you got a simulation working. Never thought I’d see those again.


ARM Assembler

My ARM assembler cheat sheet.


  1. Load-and-Store Architecture
  2. Von Neumann Architecture


T – Thumb architecture extension

  • ARM Instructions are all 32 bit
  • Thumb instructions are all 16 bit
  • Two execution states to select which instruction set to execute

D – Core has debug extensions
M – Core has enhanced multiplier
I – Core has Embedded ICE Macrocell
S – Fully synthesizable

Word = 32-bits
Half-word = 16-bits

Program Counter

The program counter is two instructions ahead of the instruction being executed. An instruction is 4 bytes, so we’re talking 8 bytes ahead: reading the PC gives the address of the executing instruction + 8. The net result is that the program counter points at the instruction being fetched, not the instruction being executed. The instruction being executed is at PC-8.

Fetch – PC
Decode – PC-4
Execute – PC-8
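
To double-check the pipeline arithmetic above, here is a small Python sketch (Python is used only to check the numbers; the function names are made up for illustration). The Thumb case is an addition not covered in the notes above: in Thumb state instructions are 2 bytes, so the PC reads as the executing instruction + 4.

```python
# Size of one instruction in bytes, per execution state.
INSTR_SIZE = {"arm": 4, "thumb": 2}

def pc_as_read(executing_addr, state="arm"):
    """Value the executing instruction sees when it reads the PC.

    The PC is two instructions ahead of the one being executed.
    """
    return (executing_addr + 2 * INSTR_SIZE[state]) & 0xFFFFFFFF

def pipeline_stages(pc):
    """Addresses held in each stage of the 3-stage pipeline (ARM state)."""
    return {"fetch": pc, "decode": pc - 4, "execute": pc - 8}

print(hex(pc_as_read(0x18)))            # 0x20
print(hex(pc_as_read(0x100, "thumb")))  # 0x104
print(pipeline_stages(0x20))            # {'fetch': 32, 'decode': 28, 'execute': 24}
```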

Interrupt Vector Table

Reset – 0x00000000
Undefined Instruction – 0x00000004
Software Interrupt – 0x00000008
Prefetch Abort – 0x0000000C
Data Abort – 0x00000010
Reserved – 0x00000014
IRQ – 0x00000018
FIQ – 0x0000001C

The entries in the Interrupt Vector Table are not the addresses of the ISRs. Each entry is a single instruction that loads the PC from another table, the VSR table (Vector Service Routine), which contains the addresses of the ISRs. Why not simply branch to the ISR from the Interrupt Vector Table? Because a branch instruction is limited to a 26-bit offset (a 64MB range, i.e. ±32MB). So, instead the IVT entry has the instruction: LDR pc, [pc,#-0xFF0]. This replaces the PC with the value read from the VSR.

Example: Any IRQ causes a jump to IRQ vector (0x18)
0x18    LDR pc, [pc,#-0xFF0]  ; Loads PC with the address from VICVectAddr (0xFFFFF030) register.

In effect it does this:  LDR pc, [addr]   ; addr = address of this instruction + 8 + offset

That is, -0xFF0 =  -0x00000FF0 = 0xFFFFF00F+1 = 0xFFFFF010   (two’s complement)

addr = 0x18 + 8 + -0x0FF0   ; the 8 is because PC is always 8 bytes ahead (i.e. two instructions ahead)
= 0x20 + 0xFFFFF010
= 0xFFFFF030
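
The same calculation can be checked in a few lines of Python. This is only a sketch of the arithmetic, with illustrative function names; it models 32-bit wrap-around with a mask and nothing else about the core.

```python
MASK32 = 0xFFFFFFFF

def twos_complement32(value):
    """32-bit two's complement representation of a (possibly negative) integer."""
    return value & MASK32

def ldr_pc_relative_address(instr_addr, offset):
    """Address read by 'LDR pc, [pc, #offset]' located at instr_addr.

    The PC reads as instr_addr + 8 in ARM state, and the sum wraps at 32 bits.
    """
    return (instr_addr + 8 + offset) & MASK32

print(hex(twos_complement32(-0xFF0)))              # 0xfffff010
print(hex(ldr_pc_relative_address(0x18, -0xFF0)))  # 0xfffff030
```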

Exception Handling:

When an exception occurs, the core:

  1. Copies CPSR to SPSR_<mode>
  2. Sets the appropriate CPSR bits: sets the mode field bits (e.g. to enter IRQ mode) and sets the IRQ disable flag. On an IRQ, FIQ is kept enabled to allow for nesting of FIQ over IRQ.
  3. Maps in banked registers.
  4. Stores the return address, i.e. next instruction to be executed (PC+4) in LR_<mode>
  5. Sets the PC to vector address.
  6. The instruction at the vector address is essentially an instruction that loads the exception handler’s address into the PC. The exception handler address is itself fetched from an offset. That is, the 32 byte interrupt vector block (8 interrupt vectors * 4 bytes each) is often followed immediately by a 32 byte address lookup table.

Note: In step 6, one could have the instruction to directly branch to the exception handler’s address (instead of loading the exception handler’s address into the PC), but the branch instructions support an offset of only 26 bits (64MB address range).
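
To see the branch range limit in numbers, here is a rough Python sketch of how a B instruction's offset field would be computed. The function name is made up, and the sketch ignores address wrap-around at the ends of the 32-bit address space.

```python
def encode_branch_offset(instr_addr, target):
    """Return the 24-bit offset field for 'B target' placed at instr_addr.

    The field holds a signed word offset relative to PC (= instr_addr + 8).
    24 bits shifted left by 2 gives a 26-bit byte offset, i.e. +/-32MB.
    """
    offset = target - (instr_addr + 8)
    if offset % 4 != 0:
        raise ValueError("branch target must be word aligned")
    if not -(1 << 25) <= offset < (1 << 25):
        raise ValueError("target is outside the +/-32MB branch range")
    return (offset >> 2) & 0xFFFFFF

print(hex(encode_branch_offset(0x0, 0x0)))  # 0xfffffe (a branch to itself)

try:
    encode_branch_offset(0x18, 0x40000000)  # handler far away from the vector
except ValueError as err:
    print(err)  # target is outside the +/-32MB branch range
```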

To return, the exception handler needs to:

  1. Restore CPSR from SPSR_<mode>
  2. Restore PC from LR_<mode>

Now step 2. is tricky:

  1. In the case of FIQ or IRQ, when an exception occurs the current instruction is discarded. So, when we return from interrupt, we don’t just restore PC from LR, but PC = LR-4, so that the discarded instruction gets re-executed. This is done by:
    SUBS R15, R14, #4    ; Restores the PC from LR-4, and restores the CPSR from the SPSR (back to the interrupted mode).
  2. In the case of an SWI interrupt, the current instruction is not discarded, so we just simply restore PC from LR. This is done by:
    MOVS R15, R14        ; Restores the PC from LR
  3. In the case of DAbt interrupt (Data Abort), the exception occurs after the execution of the current instruction (which is the one that caused the exception), thus causing the next instruction to be discarded. So, when we return from the interrupt, we need to re-execute the instruction that caused the exception. Since the LR contained the PC+4 (i.e. the next instruction), we have to roll back to discarded instruction, plus roll back again to the instruction that caused the exception. This is done by:
    SUBS R15, R14, #8
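
The three return cases boil down to a small lookup table. A minimal Python sketch, with made-up names, covering only the exception types listed above:

```python
# How much to subtract from LR_<mode> when returning, per exception type.
RETURN_ADJUST = {
    "irq": 4,         # SUBS R15, R14, #4
    "fiq": 4,         # SUBS R15, R14, #4
    "swi": 0,         # MOVS R15, R14
    "data_abort": 8,  # SUBS R15, R14, #8
}

def return_pc(exception, lr):
    """PC value after returning from the given exception."""
    return (lr - RETURN_ADJUST[exception]) & 0xFFFFFFFF

# Data abort caused by the instruction at 0x8000: LR_abt holds 0x8008,
# and we want to re-execute the instruction at 0x8000.
print(hex(return_pc("data_abort", 0x8008)))  # 0x8000
# SWI at 0x8000: LR_svc holds 0x8004, the instruction after the SWI.
print(hex(return_pc("swi", 0x8004)))         # 0x8004
```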

Note that in the above, special instructions (SUBS, MOVS, … i.e. data processing instructions with the S bit set and the PC as destination) are used to restore the PC and change the mode at the same time (the CPSR gets restored from the SPSR as part of the same instruction). This has to happen in one step: if the PC were restored before the CPSR (i.e. CPSR still contains the IRQ handler’s state), the interrupted code would resume in the handler’s mode and things would go wrong. If the CPSR were restored (i.e. operating mode changed) before the PC, the banked LR that contains the return address would be inaccessible.

Exception Handling (according to Freescale)

  1. Finish current instruction
  2. LR_irq := return link
  3. SPSR_irq := CPSR
  4. CPSR[4:0] := 0b10010   ; Enter IRQ mode
  5. CPSR[5] := 0    ; Put the processor in ARM state
  6. CPSR[7] := 1    ; Disable further interrupts
  7. PC := 0x0018    ; Jump to interrupt vector
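
The register updates in these steps can be modelled with plain bit operations. A Python sketch, assuming IRQ mode is encoded as 0b10010 in CPSR[4:0]; the function and key names are made up, and only the CPSR/SPSR/LR/PC updates are modelled.

```python
MODE_MASK = 0x1F
MODE_IRQ = 0b10010   # CPSR[4:0] value for IRQ mode
T_BIT = 1 << 5       # Thumb state bit
I_BIT = 1 << 7       # IRQ disable bit
IRQ_VECTOR = 0x18

def enter_irq(cpsr, return_link):
    """Register state after the core takes an IRQ."""
    new_cpsr = (cpsr & ~MODE_MASK) | MODE_IRQ  # enter IRQ mode
    new_cpsr &= ~T_BIT                         # back to ARM state
    new_cpsr |= I_BIT                          # disable further IRQs
    return {
        "lr_irq": return_link,
        "spsr_irq": cpsr,                      # old CPSR saved unchanged
        "cpsr": new_cpsr & 0xFFFFFFFF,
        "pc": IRQ_VECTOR,
    }

state = enter_irq(0x60000010, 0x8008)  # User mode, Z and C flags set
print(hex(state["cpsr"]))              # 0x60000092
```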


  1. CPSR[31:28] – NZCV (Negative, Zero, Carry, Overflow)
  2. CPSR[7] – IRQ disable (0=enable/1=disable)
  3. CPSR[6] – FIQ disable (0=enable/1=disable)
  4. CPSR[5] – Thumb state (you should not set/unset this bit directly)
  5. CPSR[4:0] – operating mode (User, FIQ, IRQ, Supervisor, Abort, Undefined Instruction, System)
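
As a quick way to read a raw CPSR value, here is a Python sketch that pulls the fields listed above apart. The mode encodings are the standard CPSR[4:0] values; the function and key names are made up.

```python
MODES = {
    0b10000: "User",
    0b10001: "FIQ",
    0b10010: "IRQ",
    0b10011: "Supervisor",
    0b10111: "Abort",
    0b11011: "Undefined",
    0b11111: "System",
}

def decode_cpsr(cpsr):
    """Split a 32-bit CPSR value into its flag, mask and mode fields."""
    return {
        "N": bool((cpsr >> 31) & 1),
        "Z": bool((cpsr >> 30) & 1),
        "C": bool((cpsr >> 29) & 1),
        "V": bool((cpsr >> 28) & 1),
        "irq_disabled": bool((cpsr >> 7) & 1),
        "fiq_disabled": bool((cpsr >> 6) & 1),
        "thumb": bool((cpsr >> 5) & 1),
        "mode": MODES.get(cpsr & 0x1F, "unknown"),
    }

fields = decode_cpsr(0x600000D3)
print(fields["mode"], fields["Z"], fields["irq_disabled"])  # Supervisor True True
```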

R13: Stack Pointer (SP)
R14: Link Register (LR)
R15: Program Counter (PC)


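; Disable IRQ interrupts
mrs r0, cpsr        ; read the CPSR into r0
orr r0,r0,#0x80     ; set bit 7 (IRQ disable)
msr cpsr_c,r0       ; write back the control field of the CPSR
mov r0,#1           ; return value 1
bx lr               ; return to the caller

; Enable IRQ interrupts
mrs r0, cpsr        ; read the CPSR into r0
bic r0,r0,#0x80     ; clear bit 7 (IRQ enable)
msr cpsr_c,r0       ; write back the control field of the CPSR
bx lr               ; return to the caller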
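
The two routines above differ only in ORR versus BIC on bit 7 of the CPSR. The effect on the register value can be sketched in Python (function names are made up; this models the bit manipulation only, not the privilege rules for writing the CPSR):

```python
I_BIT = 0x80  # CPSR bit 7: IRQ disable

def disable_irq(cpsr):
    """Model of 'orr r0,r0,#0x80': set the IRQ disable bit."""
    return cpsr | I_BIT

def enable_irq(cpsr):
    """Model of 'bic r0,r0,#0x80': clear the IRQ disable bit."""
    return cpsr & ~I_BIT & 0xFFFFFFFF

print(hex(disable_irq(0x10)))  # 0x90
print(hex(enable_irq(0x90)))   # 0x10
```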

Subroutine Link Register

The LR (R14) stores the return address when Branch with Link (BL) operations are performed, calculated from the PC. Thus, to return from a linked branch, use either of these (they are the same instruction, since r15 is the PC and r14 is the LR):
• MOV r15,r14
• MOV pc,lr
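
A tiny Python model of a linked branch and its return, assuming ARM state so that the instruction after the BL sits 4 bytes further on. The function names are made up.

```python
def branch_with_link(instr_addr, target):
    """Model of 'BL target' at instr_addr: LR gets the next instruction's address."""
    return {"pc": target, "lr": instr_addr + 4}

def return_from_call(regs):
    """Model of 'MOV pc,lr': copy LR into PC."""
    return {**regs, "pc": regs["lr"]}

regs = branch_with_link(0x8000, 0x9000)
print(hex(regs["pc"]), hex(regs["lr"]))   # 0x9000 0x8004
print(hex(return_from_call(regs)["pc"]))  # 0x8004
```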

Stack Pointer

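On ARM, the caller does not push the return address onto the stack: BL puts it in LR and then jumps to the function.
A function that calls other functions pushes LR onto the stack on entry, since its own BL instructions will overwrite it.
On exit, the function pops the saved return address from the stack (typically straight into the PC).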

APCS - ARM Procedure Call Standard
    Name    Register    APCS Role

    a1      0           argument 1 / integer result / scratch register
    a2      1           argument 2 / scratch register
    a3      2           argument 3 / scratch register
    a4      3           argument 4 / scratch register

    v1      4           register variable
    v2      5           register variable
    v3      6           register variable
    v4      7           register variable
    v5      8           register variable

    sb/v6   9           static base / register variable
    sl/v7   10          stack limit / stack chunk handle / reg. variable
    fp      11          frame pointer
    ip      12          scratch register / new-sb in inter-link-unit calls
    sp      13          lower end of current stack frame
    lr      14          link address / scratch register
    pc      15          program counter

Types of Stacks

In an Empty stack, the stack pointer points to the next free (empty) location on the stack, i.e. the place where the next item to be pushed onto the stack will be stored.

In a Full stack, the stack pointer points to the topmost item in the stack, i.e. the location of the last item to be pushed onto the stack.
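
The difference comes down to whether the stack pointer moves before or after the store. A Python sketch, assuming a descending stack (one that grows towards lower addresses) with 4-byte items; memory is modelled as a plain dict and the function names are made up.

```python
def push_full_descending(memory, sp, value):
    """Full stack: move SP first, then store, so SP points at the last item."""
    sp -= 4
    memory[sp] = value
    return sp

def push_empty_descending(memory, sp, value):
    """Empty stack: store at SP, then move it, so SP points at the next free slot."""
    memory[sp] = value
    sp -= 4
    return sp

full, empty = {}, {}
print(hex(push_full_descending(full, 0x1000, 42)), full)
# 0xffc {4092: 42}
print(hex(push_empty_descending(empty, 0x1000, 42)), empty)
# 0xffc {4096: 42}
```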

ARM compiler: push    {fp, ip, lr, pc}
is the same as:  STMFD sp!, {fp, ip, lr, pc}

This pushes in the order: pc, lr, ip, fp (i.e. PC is pushed first, and FP last), so FP ends up at the lowest address, which is where SP points afterwards.
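
The resulting memory layout can be checked with a small Python model of STMFD. This is only a sketch: memory is a dict, the register values are made-up placeholders, and the special value that a real core stores for the PC is not modelled.

```python
REG_NUM = {"fp": 11, "ip": 12, "sp": 13, "lr": 14, "pc": 15}

def stmfd(memory, sp, regs, values):
    """Model of 'STMFD sp!, {regs}' (full descending push).

    The lowest-numbered register lands at the lowest address.
    Returns the updated stack pointer.
    """
    ordered = sorted(regs, key=lambda name: REG_NUM[name])
    sp -= 4 * len(ordered)
    for i, name in enumerate(ordered):
        memory[sp + 4 * i] = values[name]
    return sp

memory = {}
values = {"fp": 0xF0, "ip": 0x10, "lr": 0x20, "pc": 0x30}  # placeholder contents
sp = stmfd(memory, 0x1000, ["fp", "ip", "lr", "pc"], values)
print(hex(sp))                                # 0xff0
print(hex(memory[0xFF0]), hex(memory[0xFFC])) # 0xf0 0x30  (fp lowest, pc highest)
```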