Raspberry Pi CCTV

After my dad passed away last year, I needed to keep an eye on my mom, who lives alone in our native home in India. So on this trip, in between hospital visits and other things, I spent some time setting up a remote CCTV for the house using two Raspberry Pis. This post is a note of all the steps for anyone else attempting the same.

  • rpi-cctv1 – is a Raspberry Pi 2 with 3 webcams + 1 wireless adapter.
  • rpi-cctv2 – is a Raspberry Pi B+ with 1 webcam + 1 wireless adapter.

The two Raspberry Pis sit in a DMZ behind an HTTPS proxy to MotionEye (via Nginx). Since I had an extra router, I use a true DMZ, i.e. a physically isolated network, rather than the DMZ feature offered by most home routers. With the router feature, a compromised DMZ host leaves your whole internal LAN vulnerable; here the damage is contained within the DMZ, as it should be.

Only rpi-cctv1 is exposed to the outside via the DMZ router; rpi-cctv2 is not accessible from outside. MotionEye on rpi-cctv1 also pulls the webcam feeds from MotionEye on rpi-cctv2 and provides a consolidated view of all webcams, so you only need to expose rpi-cctv1 (in my case via the Nginx HTTPS proxy). Likewise, to ssh into rpi-cctv2, you have to go through rpi-cctv1.

Home CCTV Network

Raspbian Image Installation

Image the SD Card

Obtain the image from https://www.raspberrypi.org/downloads/.

Assuming your SD card shows up as /dev/sdb, unmount any /dev/sdb partitions and run:

sudo dd bs=4M if=2015-11-21-raspbian-jessie-lite.img of=/dev/sdb
sudo sync

Resize Partitions

  1. Unplug and plug the SD card on your workstation
  2. gparted /dev/sdb
  3. Change /dev/sdb2 volume label to “/”
  4. Resize /dev/sdb2 to 4GB
  5. Create /dev/sdb3, label /var, size 1GB, format to ext4
  6. Create /dev/sdb4, label /home, size all free space, format to ext4
  7. Unplug and plug the SD card on your workstation

Copy Files

Since we now have separate /var and /home partitions, move the files from the old file system into the new /var and /home file systems.

Automount by unplugging and plugging the SD card, then run:

mv /media/spari/_/var/* /media/spari/_var/.
mv /media/spari/_/home/* /media/spari/_home/.

Reduce filesystem writes

Edit fstab to mount /var and /home, and mount /var/log, /tmp and /var/tmp on tmpfs. Also add noatime to all SD card partitions to reduce wear and the chance of corruption in case of a power failure.

cd /media/spari/_/etc

cat >> fstab << EOF
/dev/mmcblk0p3  /var            ext4    defaults,noatime  0       1
/dev/mmcblk0p4  /home           ext4    defaults,noatime  0       1
none            /var/log        tmpfs   defaults,nosuid,noexec,nodev,mode=0755,size=100m 0 0
none            /tmp            tmpfs   defaults,nosuid,noexec,nodev,mode=1777,size=100m 0 0
/tmp            /var/tmp        none    defaults,nosuid,noexec,nodev,bind 0 0
EOF

Turn off swap

dphys-swapfile swapoff
dphys-swapfile uninstall
update-rc.d dphys-swapfile remove

Increase USB Power Limit

The specs for a Logitech VT3 webcam rate it at around 500mA. So for the Raspberry Pi to handle more than one webcam, you’ll need to raise the USB max current from the default 600mA to 1.2A. This implies upgrading the power adapter to a 5V 2.5A adapter. Anything less than 2.5A causes frequent brownouts.

cat >> /boot/config.txt << EOF

# Change current limit of USB ports from 0.6A to 1.2A
max_usb_current=1
EOF

Set Locales

Use raspi-config to set the locales, except for the keyboard layout (don’t set that via raspi-config).

Also set LC_ALL and LC_CTYPE in the environment:

cat >> ~/.bashrc << EOF
export LANGUAGE=en_US.UTF-8
export LC_ALL=en_US.UTF-8
export LC_CTYPE=en_US.UTF-8
EOF

sed -i 's/XKBLAYOUT="gb"/XKBLAYOUT="us"/' /etc/default/keyboard

Additional Packages

# Non-Daemon / Non-Service Packages
apt-get install -y bc bmon bzip2 curl dstat ethtool gawk git htop ipcalc iptraf lsof lynx memstat nmap ntp openssh-server procinfo psmisc pwgen sg3-utils sysstat tcpdump tmux tree unzip usb-modeswitch vim vim-runtime zip zsh

Firewall

apt-get install -y ufw 
ufw allow ssh/tcp
ufw enable

Watchdog

It is very useful to install the watchdog service, which makes use of the Broadcom chip’s hardware watchdog. It automatically reboots the Pi when certain conditions are met (non-responsive pings, high temperature, max load average, etc).

apt-get install -y watchdog

# For some reason watchdog installs its systemd unit file with no target.
# Add the target.
sed -ri '/^\[Install\]/a\WantedBy=multi-user.target' /lib/systemd/system/watchdog.service

sed -ri 's/watchdog_module="none"/watchdog_module="bcm2708_wdog"/' /etc/default/watchdog

sed -ri 's/^#(max-load-)/\1/' /etc/watchdog.conf 
sed -ri 's/^#(watchdog-device)/\1/' /etc/watchdog.conf 

Kernel Panic

# Reboot itself 60s after a kernel panic.
printf '\nkernel.panic = 60\n' >> /etc/sysctl.conf

Setup Network

Wireless Adapter

cat >> /etc/network/interfaces << EOF
auto wlan0
allow-hotplug wlan0
iface wlan0 inet dhcp
   wpa-scan-ssid 1
   wpa-ap-scan 1
   wpa-key-mgmt WPA-PSK
   wpa-proto RSN WPA
   wpa-pairwise CCMP TKIP
   wpa-group CCMP TKIP
   wpa-ssid "wireless_ssid"
   wpa-psk "wireless_password"
EOF

Huawei 3G Dongle

If you don’t have internet access, you can use a 3G Dongle. The Huawei E8231 works just great. You’ll just have to install usb-modeswitch, to switch the mode from USB storage device (default) to Ethernet device. Installing usb-modeswitch should automatically do the job of putting the dongle in Ethernet mode. If it doesn’t (as was the case on Raspbian), then use sg3-utils to switch out of storage mode to Ethernet mode (thanks to Richard White’s post here for that). Both usb-modeswitch and sg3-utils were installed in previous section (Additional Packages).

#---------------------------------------------------------------------------
# Switch the Huawei E8231 3G dongle to Ethernet mode using sg3-utils.
#---------------------------------------------------------------------------
cat > /etc/udev/rules.d/10-HuaweiE8231.rules << EOF
SUBSYSTEMS=="usb", ATTRS{modalias}=="usb:v12D1p1F01*", SYMLINK+="hwcdrom", RUN+="/usr/bin/sg_raw /dev/hwcdrom 11 06 20 00 00 00 00 00 01 00"
EOF

Compile Decoders

Compile and install decoders required for MotionEye.

Compiling libvpx and x264 takes about 5 min each, but ffmpeg takes about 1.5 hours (as I let it compile all codecs).

Preparation

### Cleanup
apt-get remove --purge libmp3lame-dev libtool libssl-dev libaacplus-* libx264 libvpx librtmp ffmpeg

### Development Tools / Libs
apt-get install autoconf libtool checkinstall
apt-get install libmp3lame-dev libssl-dev

mkdir ~/src

libvpx

cd ~/src
git clone https://chromium.googlesource.com/webm/libvpx
cd libvpx/
./configure --enable-static --disable-examples --disable-unit-tests
make
sudo make install

x264

cd ~/src
git clone git://git.videolan.org/x264
cd x264/
./configure --host=arm-unknown-linux-gnueabi --enable-static --disable-opencl
make
sudo make install

ffmpeg

cd ~/src
git clone git://source.ffmpeg.org/ffmpeg.git
cd ffmpeg/
./configure --arch=armel --target-os=linux --enable-gpl --enable-libx264 --enable-nonfree
make
sudo make install

ffmpeg -codecs | grep libx264
ffmpeg -codecs | grep -i vp

Setup MotionEye

The CCTV application I’m using is MotionEye, a fantastic, very professionally done UI for motion. Refer to the MotionEye project page for the most current version and installation instructions.

#---------------------------------------------------------------------------
# Install motioneye 
#---------------------------------------------------------------------------
apt-get install -y python-pip python-dev libssl-dev libcurl4-openssl-dev libjpeg-dev
pip install motioneye

#---------------------------------------------------------------------------
# Change motioneye paths 
#---------------------------------------------------------------------------
sed -ri 's/(port) [0-9]+/\1 8000/' /etc/motioneye/motioneye.conf
sed -ri 's|(conf_path) .*|\1 /var/lib/motioneye/conf|' /etc/motioneye/motioneye.conf
sed -ri 's|(run_path) .*|\1 /var/run/motioneye|' /etc/motioneye/motioneye.conf
sed -ri 's|(log_path) .*|\1 /var/log/motioneye|' /etc/motioneye/motioneye.conf

mkdir -p /var/lib/motioneye/conf
chown -R motion:motion /var/lib/motioneye

cat >> /etc/tmpfiles.d/motioneye.conf  << EOF
D /var/log/motioneye 0755 motion motion
D /var/run/motioneye 0755 motion motion
EOF

#---------------------------------------------------------------------------
# Run motioneye as user motion
#---------------------------------------------------------------------------
sed -ri '/^\[Service\]/a\User=motion\nGroup=motion' /etc/systemd/system/motioneye.service
systemctl daemon-reload
systemctl enable motioneye.service

#---------------------------------------------------------------------------
# Firewall
#---------------------------------------------------------------------------
ufw allow 8000/tcp

Setup Nginx

I didn’t like the idea of exposing my home (even if in a DMZ) to the world with basic authentication over plain HTTP, so I fronted MotionEye with an Nginx HTTPS proxy.

#---------------------------------------------------------------------------
# Install and configure nginx as https proxy for motioneye 
#---------------------------------------------------------------------------
apt-get update
apt-get install nginx
cat >> /etc/nginx/sites-available/motioneye-nginx.conf << EOF
server {
    listen 443 ssl default_server;
    listen [::]:443 ssl default_server;

    root /home/motioneye;

    server_name gsfamily.duckdns.org;
    ssl_certificate /etc/nginx/ssl/nginx.crt;
    ssl_certificate_key /etc/nginx/ssl/nginx.key;

    location / {
        proxy_pass http://localhost:8000/;
        proxy_read_timeout 120s;
        access_log off;

        auth_basic "Restricted";
        auth_basic_user_file /etc/nginx/.htpasswd;
    }
}
EOF

# Enable motioneye site
rm /etc/nginx/sites-enabled/default
ln -s /etc/nginx/sites-available/motioneye-nginx.conf /etc/nginx/sites-enabled/motioneye-nginx.conf

# Generate SSL certs
mkdir /etc/nginx/ssl
openssl req -x509 -nodes -days 365 -newkey rsa:2048 -keyout /etc/nginx/ssl/nginx.key -out /etc/nginx/ssl/nginx.crt

# Firewall
ufw delete allow 8000/tcp
ufw allow 443/tcp
ufw status

# Setup nginx log dir
cat >> /etc/tmpfiles.d/nginx.conf << EOF
D /var/log/nginx 0755 root root
EOF

Setup Nginx Users

htpasswd -c /etc/nginx/.htpasswd user1
htpasswd /etc/nginx/.htpasswd user2
htpasswd /etc/nginx/.htpasswd user3

Fan Control

If you don’t want your Raspberry Pi case fan whirring 24×7 all year round, then for under $1 (the cost of a 2N2222A plus a 4.7k resistor) and a bit of scripting you can put together a fan control based on the CPU temperature reading.

On the right is a simple circuit to turn a Raspberry Pi 5V case fan on/off using a GPIO pin.

I’m using a 2N2222A. You can use any other general purpose transistor, like a 2N2053, which was my default go-to transistor long back and is easily available locally. The 2N2222A has an 800mA collector current limit, which should be more than enough to handle a case fan.

fan_control.py

[this section updated 09/2016]

A simple Python script to read the on-chip CPU temperature sensor and turn the fan on/off accordingly. I’m using GPIO 2 as the fan control pin, via the wiringpi module.

Note the code uses the WiringPi pin numbering (not the Raspberry Pi header or Broadcom numbering).

#!/usr/bin/python -u

import wiringpi
import time
# WPI PIN 8 = RPI Header Pin 3 = BCM GPIO 2 
FAN_PIN = 8

# RPI 2: 36-46C
# RPI 3: 47-51C (no cameras), 51-63C (3 cameras, idle)
HIGH_TEMP = 46 
LOW_TEMP = 36 

POLL_INTERVAL = 30

cpu_temp_dev = '/sys/class/thermal/thermal_zone0/temp'

def cpu_temp():
    with open(cpu_temp_dev,'r') as f:
        return float(f.readline())/1000

io = wiringpi.GPIO(wiringpi.GPIO.WPI_MODE_PINS)
io.pinMode(FAN_PIN, io.OUTPUT)
io.digitalWrite(FAN_PIN, io.LOW)

while True:
    temp = cpu_temp()
    print('CPU_TEMP: {:0.2f}C'.format(temp))

    # Between LOW_TEMP and HIGH_TEMP the fan keeps its current state.
    if temp > HIGH_TEMP:
        print('Fan: On')
        io.digitalWrite(FAN_PIN, io.HIGH)
    elif temp < LOW_TEMP:
        print('Fan: Off')
        io.digitalWrite(FAN_PIN, io.LOW)

    time.sleep(POLL_INTERVAL)
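The on/off logic in the script is a simple hysteresis loop: the fan switches on above HIGH_TEMP, switches off below LOW_TEMP, and keeps its current state in between. As a minimal sketch, that decision can be pulled out into a pure function and tried without any hardware attached. The names next_fan_state and simulate, and the sample readings, are illustrative assumptions and not part of the original script.

```python
HIGH_TEMP = 46  # same thresholds as fan_control.py (Celsius)
LOW_TEMP = 36


def next_fan_state(temp, fan_on, high=HIGH_TEMP, low=LOW_TEMP):
    """Return the new fan state (True = on) for one temperature reading."""
    if temp > high:
        return True
    if temp < low:
        return False
    # Inside the band: keep whatever state the fan is already in.
    return fan_on


def simulate(readings, fan_on=False):
    """Feed a list of readings through the hysteresis and collect the states."""
    states = []
    for temp in readings:
        fan_on = next_fan_state(temp, fan_on)
        states.append(fan_on)
    return states


if __name__ == '__main__':
    readings = [35.0, 40.0, 47.0, 40.0, 35.5]  # made-up sample temperatures
    for temp, state in zip(readings, simulate(readings)):
        print('{:0.1f}C -> fan {}'.format(temp, 'on' if state else 'off'))
```

The gap between the two thresholds is what stops the fan from rapidly toggling when the temperature hovers around a single cut-off.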

Install WiringPi

# Wiringpi
apt-get install wiringpi
apt-get install -y python-pip python-dev
pip install wiringpi2

fan_control service

#---------------------------------------------------------------------------
# Create systemd service for fan_control
#---------------------------------------------------------------------------
cat >> /etc/systemd/system/fan_control.service << EOF
[Unit]
Description=Fan Control
After=syslog.target

[Service]
ExecStart=/home/sysadmin/bin/fan_control.py

[Install]
WantedBy=multi-user.target
EOF

systemctl enable fan_control.service

Setup Dynamic DNS

If you don’t have a static IP, then you can use a dynamic DNS. I use http://duckdns.org.

Most routers should have a feature for providing your dynamic DNS provider URL, but the D-Link router I’m using allows for only 3 possible providers — and all 3 are hard-coded, none of which is duckdns.org.

So, I’m putting the periodic update of my IP to DuckDNS in the crontab of the rpi-cctv1.

# /home/sysadmin/duckdns/duck.sh
crontab -e
*/5 * * * * ~/duckdns/duck.sh >/dev/null 2>&1
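The contents of duck.sh are not shown here. As a rough sketch only, the same periodic update could be done from Python. This assumes the DuckDNS update endpoint (https://www.duckdns.org/update with domains, token and ip query parameters, where an empty ip lets DuckDNS use the address the request came from). The environment variable names and placeholder values are hypothetical.

```python
import os
from urllib.parse import urlencode
from urllib.request import urlopen

DUCKDNS_UPDATE_URL = 'https://www.duckdns.org/update'


def build_update_url(subdomain, token, ip=''):
    """Build the update URL; an empty ip leaves address detection to DuckDNS."""
    query = urlencode([('domains', subdomain), ('token', token), ('ip', ip)])
    return DUCKDNS_UPDATE_URL + '?' + query


def update(subdomain, token):
    """Send the update request and return the response body as text."""
    with urlopen(build_update_url(subdomain, token), timeout=30) as response:
        return response.read().decode('utf-8').strip()


if __name__ == '__main__':
    # DUCKDNS_SUBDOMAIN and DUCKDNS_TOKEN are hypothetical variable names.
    subdomain = os.environ.get('DUCKDNS_SUBDOMAIN')
    token = os.environ.get('DUCKDNS_TOKEN')
    if subdomain and token:
        print(update(subdomain, token))
    else:
        # Without credentials, just show what the request would look like.
        print(build_update_url('your-subdomain', 'your-token'))
```

Keeping the token in the environment rather than in the script avoids leaving it in a world-readable file on a box that is exposed to the internet.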

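#---------------------------------------------------------------------------
# Misc
#---------------------------------------------------------------------------
apt-get install -y gpm
systemctl disable gpm

#---------------------------------------------------------------------------
# Cleanup
#---------------------------------------------------------------------------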
apt-get autoremove
apt-get clean
apt-get autoclean

Future / Todo List

  • GPU streaming: Try out https://github.com/ccrisan/motioneyeos/wiki/Fast-Network-Camera. This may give higher quality video, but as the author mentions, you lose the ability to detect motion, timestamps, labeling, etc.
  • Motion Detection: Try out the motion detection feature. Check CPU load, might be CPU intensive for a RPI?
  • SMS alert: Setup and test SMS based alert service.
  • Indicator: Repurpose the webcam LED or Raspberry Pi’s SD card read/write access LED, so that the LED flashes when in use (i.e. whenever someone is connected via the browser).
  • Load Test: load test to see just how many concurrent streams the Raspberry Pi can handle (CPU load, bandwidth usage, and heat).
  • Statistics: Display usage by users, time, temperature, load. Feed it to an ELK stack.
  • Video Capture: Capture video triggered on motion detection and save it to disk remotely, with log rotation for say 24 hours worth of video.
  • RPI Alternatives: Try out upcoming cheaper Raspberry Pi ($40) alternatives: Pine64 ($15) with 64-bit ARM CPU and MALI 400 GPU. CHIP ($10) with WiFi and Bluetooth built-in.
  • Relay Control: attach relay module (with 4 relays) to control 240v devices. Will require housing it in a power-outlet/gang-box, and can use the Raspberry Pi Zero ($5) for this purpose, since USB and Ethernet will not be used. Potentially can use it to turn on/off our bore well water pump motor, but requires pulling actuating wires from DOL Controller because the contactor switch is integrated in the DOL starter (MU-G6 with MK1 contactor).

Other Hardware

Cables

The USB cables for the webcams run along the wall. I recommend getting good quality shielded USB cables with EMI/RF filters for anything 3 meters or more. Even better if you have one on each end. Not sure just how much EMI/RF filters help, especially in image transmission (as opposed to more sensitive data, like that headed for a USB hard drive or a printer), but always worth not having the headache of troubleshooting only to find it is due to cheap cables.

I have one webcam on a 5m cable and the other two on 3m cables, each having two EMI filters, all working fine.

Clamps

For mounting the webcams, I used pipe clamps that you can get in a plumbing store.

For mounting the wires, I just could not find any 3M self-adhesive cable clips locally (you should find it in electrical shops in cities). But what I did find was self-adhesive cable tie clamps, which worked out very well. Comes in different sizes, got the 10mm one, enough to tie two USB cables.

Self-Adhesive-Cable-Tie-Clamp

I found that the sticky backing (this was a generic brand shipped from China) held well to the painted/cement wall, but did not hold on to a varnished/wood surface (or maybe those two clips were duds). If you don’t have clips, good old duct tape works just as well to hold the wires to any surface.

 

Engineering Ethics

“Lund’s first response was to repeat his objections. But then Mason said something that made him think again. Mason asked him to think like a manager rather than an engineer.”

That fatal decision resulted in the Challenger Disaster that cost the lives of all 7 crew members. Was it because Lund was thinking like a manager? Or should Lund even have been thinking like a manager? Full article: Thinking Like an Engineer.

Little bit of thermal compound…

My 3 yr old Thinkpad T60 overheating… no more :). Streaming video at 360p full screen caused my CPU (fixed at 1 GHz) temperature to shoot up to 93C in just 10min. By using new thermal compound, even after 30min of running the temperature is down to 60C!

The idle temperature is now 42C (before it was 58C). The GPU idle also remained below 58C (before it used to be 75-80C). I used this thermal compound Arctic Silver 5 (the thing had over 4,000 reviews!). Data sheet: Arctic Silver 5. Here’s a cool-down plot (cooling down after 10 minutes of streaming the same video, 360p, full screen, CPU at 1 GHz) produced just by command-line thanks to gnuplot.

Freaky Assembly?

After a looong time, I was debugging some embedded C code and thought I found something freaky:

C code

for (i = 0; i < 1000000; i++);

ARM code disassembly (as generated by GNU ARM gcc)

0x0000019c <main+196>: mov r3, #0 ; 0x0
0x000001a0 <main+200>: str r3, [r11, #-16]
0x000001a4 <main+204>: b 0x1b4 <main+220>
0x000001a8 <main+208>: ldr r3, [r11, #-16]
0x000001ac <main+212>: add r3, r3, #1 ; 0x1
0x000001b0 <main+216>: str r3, [r11, #-16]
0x000001b4 <main+220>: ldr r2, [r11, #-16]
0x000001b8 <main+224>: mov r3, #999424 ; 0xf4000
0x000001bc <main+228>: add r3, r3, #572 ; 0x23c
0x000001c0 <main+232>: add r3, r3, #3 ; 0x3

0x000001c4 <main+236>: cmp r2, r3
0x000001c8 <main+240>: bls 0x1a8 <main+208>

The three instructions at 0x1b8 through 0x1c0 above in effect initialize r3 with 999999: the first loads r3 with 999424, the second adds 572 to it, and the third adds 3.

What puzzled me was why couldn’t it do that directly (mov r3, #999999)?

After some scratching my head and plowing through the ARM book: ARM instructions are 32-bit — of which Operand 2 can be only 12-bits. In addition (from the ARM book):

– Of these 12 bits, 8-bits are for data, and 4-bits are used for ROR.
– The ROR bits are in turn multiplied by 2 before being applied on the 8-bits.

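The combination of ROR and the x2 multiplier greatly extends the range beyond 8 bits, but only to values that are an 8-bit pattern rotated by an even amount. The assembler works out the encoding automatically if it sees an operand wider than 8 bits. 999999 (0xF423F) does not fit in a single such pattern, hence the three instructions.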

This can be a great (but wicked) interview question (I’d never do that to anyone ;-)).

Do verify, here’s the math…

999424 + 572 + 3 is a set of three values that add up to 999999, each of which can be encoded in the 12-bit operand (8-bit data plus ROR with the x2 multiplier).

Just for verification, here are the instructions from memory:

1b8: 3D39A0E3 ; 0xE3A0393D
1bc: 8F3F83E2 ; 0xE2833F8F
1c0: 033083E2 ; 0xE2833003

To get 999424 (0x0F4000):
0x0000003D ROR 18 (0x9 x 2) = 0x000F4000 (ROR 18 = LSL 14)
As confirmed by the instruction: E3A03 93D

To get 572 (0x023C):
0x0000008F ROR 30 (0xF x 2) = 0x0000023C (ROR 30 = LSL 2)
As confirmed by the instruction: E2833 F8F

To get 3 (0x0003):
0x00000003 ROR 00 (0x0 x 2) = 0x00000003 (ROR 00 = LSL 0)
As confirmed by the instruction: E2833 003

Note: the LSL is just for convenience, it’s good only if data has all zeros padded on the left (at least enough to cover the LSL).
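The decoding above can be checked mechanically. The following is a small sketch that takes the three instruction words from the listing, extracts the low 12 bits (Operand 2), and applies the rotate. The helper names ror32, decode_operand2 and is_single_immediate are my own and not from any ARM tool.

```python
def ror32(value, amount):
    """Rotate a 32-bit value right by amount bits."""
    amount %= 32
    value &= 0xFFFFFFFF
    return ((value >> amount) | (value << (32 - amount))) & 0xFFFFFFFF


def decode_operand2(instruction):
    """Decode the immediate Operand 2: 4-bit rotate field and 8-bit data."""
    rotate = (instruction >> 8) & 0xF   # bits 11..8
    imm8 = instruction & 0xFF           # bits 7..0
    return ror32(imm8, 2 * rotate)      # rotate field is multiplied by 2


def is_single_immediate(value):
    """True if value is an 8-bit pattern rotated right by an even amount."""
    # Undo each possible rotation and see if the result fits in 8 bits.
    return any(ror32(value, (32 - 2 * r) % 32) <= 0xFF for r in range(16))


if __name__ == '__main__':
    words = [0xE3A0393D, 0xE2833F8F, 0xE2833003]  # mov, add, add from the listing
    parts = [decode_operand2(w) for w in words]
    print(parts, sum(parts))
    print(is_single_immediate(999999))
```

The last check is the reason the compiler needed three instructions: 999999 itself cannot be written as one rotated 8-bit value, while each of the three parts can.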

Balancing Robot

Botka, The Barely Standing Robot. This is one impressive balancing robot. Not even a bit of jitter. Midway through the video the thing takes some solid whacks and is still standing. For comparison: NXTway-G (the Lego Mindstorms NXT uses the Atmel AT91SAM7S ARM processor).

Botka probably uses some sophisticated PID control, perhaps fuzzy-logic enhanced, or a Kalman filter, given its amazing response even in motion. I remember doing Kalman filters way back in my graduate courses, pretty hairy level of mathematics, but real cool nevertheless once you got a simulation working. Never thought I’d see the daylight of that again.

ARM Assembler

My ARM assembler cheat sheet.

ARM7:

  1. Load-and-Store Architecture
  2. Von Neumann Architecture

ARM7TDMI-S:

T – Thumb architecture extension

  • ARM Instructions are all 32 bit
  • Thumb instructions are all 16 bit
  • Two execution states to select which instruction set to execute

D – Core has debug extensions
M – Core has enhanced multiplier
I – Core has Embedded ICE Macrocell
S – Fully synthesizable

Word = 32-bits
Half-word = 16-bits

Program Counter

The program counter is two instructions ahead. An instruction is 4 bytes, so we’re talking 8 bytes ahead: when an instruction executes, reading the PC gives that instruction’s address + 8. The net result is that the program counter points to the instruction being fetched, not the instruction being executed. The instruction being executed is at PC-8.

Fetch – PC
Decode – PC-4
Execute – PC-8

Interrupt Vector Table

Reset – 0x00000000
Undefined Instruction – 0x00000004
Software Interrupt – 0x00000008
Prefetch Abort – 0x0000000C
Data Abort – 0x00000010
Reserved – 0x00000014
IRQ – 0x00000018
FIQ – 0x0000001C

The entries in the Interrupt Vector Table are not the addresses of the ISRs, but pointers to another table, the VSR (Vector Service Routine) table, which contains the addresses of the ISRs. Why not branch to the ISR directly from the Interrupt Vector Table? Because a branch instruction is limited in range to 26 bits (64MB). So instead the IVT entry holds the instruction LDR pc, [pc,#-0xFF0], which essentially replaces the PC with the value from the VSR.

Example: Any IRQ causes a jump to IRQ vector (0x18)
0x18    LDR pc, [pc,#-0xFF0]  ; Loads PC with the address from VICVectAddr (0xFFFFF030) register.

In effect it does this:  LDR pc, [addr]   ; addr = instruction address + 8 + offset

In 32-bit two’s complement, -0xFF0 = -0x00000FF0 = 0xFFFFF00F + 1 = 0xFFFFF010

addr = 0x18 + 8 + -0x0FF0   ; the 8 is because PC is always 8 bytes ahead (i.e. two instructions ahead)
= 0x20 + 0xFFFFF010
= 0xFFFFF030
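The same arithmetic can be reproduced in a few lines, which is handy for checking other vector offsets. This is only a sketch of the address calculation with 32-bit wrap-around; the function name is made up for illustration.

```python
MASK32 = 0xFFFFFFFF


def ldr_pc_relative_address(instr_addr, offset):
    """Address read by 'LDR pc, [pc, #offset]' located at instr_addr.

    The PC reads as instr_addr + 8, and the sum wraps around at 32 bits.
    """
    return (instr_addr + 8 + offset) & MASK32


if __name__ == '__main__':
    # Two's complement form of the offset, then the IRQ vector at 0x18.
    print(hex(-0xFF0 & MASK32))
    print(hex(ldr_pc_relative_address(0x18, -0xFF0)))
```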

Exception Handling:

When an exception occurs, the core:

  1. Copies CPSR to SPSR_<mode>
  2. Sets the appropriate CPSR bits: Mode field bits (to enter IRQ mode). Set IRQ disable flag. FIQ is kept enabled to allow for nesting of FIQ over IRQ.
  3. Maps in banked registers.
  4. Stores the return address, i.e. next instruction to be executed (PC+4) in LR_<mode>
  5. Sets the PC to vector address.
  6. The instruction at the vector address is essentially an instruction that loads the exception handler’s address into the PC. The exception handler address is itself fetched from an offset. That is, the 32 byte interrupt vector block (8 interrupt vectors * 4 bytes each) is often followed immediately by a 32 byte address lookup table.

Note: In step 6, one could have the instruction to directly branch to the exception handler’s address (instead of loading the exception handler’s address into the PC), but the branch instructions support an offset of only 26 bits (64MB address range).

To return, the exception handler needs to:

  1. Restore CPSR from SPSR_<mode>
  2. Restore PC from LR_<mode>

Now step 2. is tricky:

  1. In the case of FIQ or IRQ, when an exception occurs the current instruction is discarded. So, when we return from interrupt, we don’t just restore PC from LR, but PC = LR-4, so that the discarded instruction gets re-executed. This is done by:
    SUBS R15, R14, #4    ; PC = LR-4; the S bit also restores the CPSR from SPSR_<mode>, returning to the interrupted mode.
  2. In the case of an SWI interrupt, the current instruction is not discarded, so we just simply restore PC from LR. This is done by:
    MOVS R15, R14        ; Restores the PC from LR
  3. In the case of DAbt interrupt (Data Abort), the exception occurs after the execution of the current instruction (which is the one that caused the exception), thus causing the next instruction to be discarded. So, when we return from the interrupt, we need to re-execute the instruction that caused the exception. Since the LR contained the PC+4 (i.e. the next instruction), we have to roll back to discarded instruction, plus roll back again to the instruction that caused the exception. This is done by:
    SUBS R15, R14, #8

Note that in the above, special instructions (SUBS, MOVS, etc., i.e. data processing instructions with the S-bit set and the PC as destination) are used to restore the PC and change the mode at the same time (the CPSR gets restored from the SPSR). This is because if the PC is restored before the CPSR (i.e. the CPSR still contains the IRQ handler’s state), it will screw things up. If the CPSR is restored (i.e. the operating mode is changed) before the PC, then the banked LR which contains the return address will be inaccessible.
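As a compact summary of the three return cases, here is a tiny sketch that maps each exception type to the amount subtracted from LR on return. Only the cases discussed above are included, and the table and function names are illustrative.

```python
# Bytes subtracted from LR_<mode> to get the return address.
RETURN_OFFSET = {
    'IRQ': 4,   # SUBS R15, R14, #4
    'FIQ': 4,   # SUBS R15, R14, #4
    'SWI': 0,   # MOVS R15, R14
    'DABT': 8,  # SUBS R15, R14, #8
}


def return_address(exception, lr):
    """Return the address execution resumes at for a given exception and LR."""
    return (lr - RETURN_OFFSET[exception]) & 0xFFFFFFFF


if __name__ == '__main__':
    lr = 0x00008010  # made-up link register value
    for name in ('IRQ', 'FIQ', 'SWI', 'DABT'):
        print('{:4s} -> 0x{:08X}'.format(name, return_address(name, lr)))
```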

Exception Handling (according to Freescale)

  1. Finish current instruction
  2. LR_irq := return link
  3. SPSR_irq := CPSR
  4. CPSR[4:0] := 0b10010   ; Enter IRQ mode
  5. CPSR[5] := 0    ; Put the processor in ARM state
  6. CPSR[7] := 1    ; Disable further interrupts
  7. PC := 0x0018    ; Jump to interrupt vector

CPSR:

  1. CPSR[31:28] – NZCV (Negative, Zero, Carry-over, Overflow)
  2. CPSR[7] – IRQ disable (0=enable/1=disable)
  3. CPSR[6] – FIQ disable (0=enable/1=disable)
  4. CPSR[5] – Thumb Mode (you should not set/unset this bit directly)
  5. CPSR[4:0] – operating mode (FIQ, IRQ, System, User, Undefined Instruction)
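To make the bit layout concrete, the following sketch decodes a CPSR value into the fields listed above. The mode encodings are the standard ARM ones (including Supervisor and Abort, which the list above does not mention); the function name and the sample value are my own.

```python
MODES = {
    0b10000: 'User',
    0b10001: 'FIQ',
    0b10010: 'IRQ',
    0b10011: 'Supervisor',
    0b10111: 'Abort',
    0b11011: 'Undefined',
    0b11111: 'System',
}


def decode_cpsr(cpsr):
    """Split a 32-bit CPSR value into its flag, control and mode fields."""
    return {
        'N': bool(cpsr & (1 << 31)),
        'Z': bool(cpsr & (1 << 30)),
        'C': bool(cpsr & (1 << 29)),
        'V': bool(cpsr & (1 << 28)),
        'irq_disabled': bool(cpsr & 0x80),  # CPSR[7]
        'fiq_disabled': bool(cpsr & 0x40),  # CPSR[6]
        'thumb': bool(cpsr & 0x20),         # CPSR[5]
        'mode': MODES.get(cpsr & 0x1F, 'unknown'),
    }


if __name__ == '__main__':
    # Made-up value: Z and C set, IRQs disabled, IRQ mode.
    print(decode_cpsr(0x60000092))
```

The 0x80 mask is the same one used by the Disable/Enable routines: ORR sets CPSR[7] to mask IRQs and BIC clears it.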

R13: Stack Pointer (SP)
R14: Link Register (LR)
R15: Program Counter (PC)

-------------

Disable:
mrs r0, cpsr
orr r0,r0,#0x80
msr cpsr_c,r0
mov r0,#1
bx lr

Enable:
mrs r0, cpsr
bic r0,r0,#0x80
msr cpsr_c,r0
bx lr

Subroutine Link Register

The LR (R14) stores the return address when Branch with Link operations are performed, calculated from the PC. Thus, to return from a linked branch:
• MOV r15,r14
or
• MOV pc,lr

Stack Pointer

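On ARM, the call itself (BL) places the return address in LR rather than on the stack.
A function that makes further calls pushes LR onto the stack on entry.
On return, it pops the saved return address from the stack (typically straight into PC).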

APCS - ARM Procedure Call Standard
    Name    Register    APCS Role

    a1      0           argument 1 / integer result / scratch register
    a2      1           argument 2 / scratch register
    a3      2           argument 3 / scratch register
    a4      3           argument 4 / scratch register

    v1      4           register variable
    v2      5           register variable
    v3      6           register variable
    v4      7           register variable
    v5      8           register variable

    sb/v6   9           static base / register variable
    sl/v7   10          stack limit / stack chunk handle / reg. variable
    fp      11          frame pointer
    ip      12          scratch register / new-sb in inter-link-unit calls
    sp      13          lower end of current stack frame
    lr      14          link address / scratch register
    pc      15          program counter

Types of Stacks

In an Empty stack, the stack pointer points to the next free (empty) location on the stack, i.e. the place where the next item to be pushed onto the stack will be stored.

In a Full stack, the stack pointer points to the topmost item in the stack, i.e. the location of the last item to be pushed onto the stack.

ARM compiler: push    {fp, ip, lr, pc}
is the same as:  STMFD sp!, {fp, ip, lr, pc}

This pushes in the order: pc, lr, ip, fp (i.e. PC is pushed first and FP last), so the lowest-numbered register ends up at the lowest address.
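A small simulation makes the resulting memory layout easier to see. This is only a sketch of a full descending store multiple using a dictionary as memory; the function name, the starting stack pointer and the stored values are made up for illustration.

```python
REG_NUM = {'fp': 11, 'ip': 12, 'sp': 13, 'lr': 14, 'pc': 15}


def stmfd(sp, regs, values, memory):
    """Simulate 'STMFD sp!, {regs}' and return the updated stack pointer.

    Full descending: sp ends up pointing at the last item stored, and
    lower-numbered registers land at lower addresses.
    """
    ordered = sorted(regs, key=REG_NUM.get)
    new_sp = sp - 4 * len(ordered)
    for i, reg in enumerate(ordered):
        memory[new_sp + 4 * i] = values[reg]
    return new_sp


if __name__ == '__main__':
    memory = {}
    values = {'fp': 'FP', 'ip': 'IP', 'lr': 'LR', 'pc': 'PC'}
    sp = stmfd(0x8000, ['fp', 'ip', 'lr', 'pc'], values, memory)
    for addr in sorted(memory, reverse=True):
        marker = '  <- sp' if addr == sp else ''
        print('0x{:04X}: {}{}'.format(addr, memory[addr], marker))
```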

ARM Toolchain – Crosstool

Was able to get an arm-elf toolchain built and working fine, but not so much luck in building an arm-elf-linux toolchain. It cross-compiled programs without errors, but the compiled executable crapped out at runtime. So googling for answers… I came across Dan Kegel’s crosstool – a really cool GNU toolchain builder. It downloads all the correct gcc, glibc, binutils, etc. and builds your toolchain. I built two toolchains, arm-unknown-linux and arm-xscale-linux. The toolchain built with it works great.

Note:
The default Ubuntu /bin/sh is not bash! Instead it is linked to something called dash. Just make sure you relink /bin/sh to bash instead of dash. No idea when they made this change, but I found out after encountering this maddening error, pointing to some header files during the build:

missing terminating " character.

Using it:

Example (for kernel compilation makefile):

export ARM_TOOLCHAIN=/opt2/crosstool/arm-unknown-linux-gnu/bin
export PATH=$ARM_TOOLCHAIN:$PATH

make ARCH=arm CROSS_COMPILE=arm-unknown-linux-gnu-