Blink on bare metal Raspberry pi 4 with rust
After some time banging my head with old tutorials with raspberry pi 3 or other complexes with support for multiple boards I decided to create my own tutorial, the simplest possible program, the “hello word” of embedded: blink a led.
Already has a lot of tutorials for embedded systems on Rust, but Raspberry uses an Application-profile processor which allows us to have fun exploring OS stuff. Also is one common hardware that people already have.
The smallest #![no_std]
program
(This is an adaption of The Embedonomicon first chapter with the same for Raspberry pi code)
What does #![no_std]
mean?
#![no_std]
is a crate-level attribute that indicates that the crate will link to the [core](https://doc.rust-lang.org/core/)
crate instead of the [std](https://doc.rust-lang.org/std/)
crate, but what does this mean for applications?
The std
crate is Rust’s standard library. It contains functionality that assumes that the program will run on an operating system rather than directly on the metal. std
also assumes that the operating system is a general-purpose operating system, like the ones one would find in servers and desktops. For this reason, std
provides a standard API over functionality one usually finds in such operating systems: Threads, files, sockets, a filesystem, processes, etc.
On the other hand, the core
crate is a subset of the std
crate that makes zero assumptions about the system the program will run on. As such, it provides APIs for language primitives like floats, strings, and slices, as well as APIs that expose processor features like atomic operations and SIMD instructions. However, it lacks APIs for anything that involves heap memory allocations and I/O.
For an application, std
does more than just provide a way to access OS abstractions. std
also takes care of, among other things, setting up stack overflow protection, processing command line arguments, and spawning the main thread before a program’s main
function is invoked. A #![no_std]
application lacks all that standard runtime, so it must initialize its own runtime if any is required.
Because of these properties, a #![no_std]
application can be the first and/or the only code that runs on a system. It can be many things that a standard Rust application can never be, for example:
- The kernel of an OS.
- Firmware.
- A bootloader.
The code
With that out of the way, we can move on to the smallest #![no_std]
program that compiles:
$ cargo new --bin kernel
$ cd kernel
$ # modify main.rs so it has these contents
$ cat src/main.rs
#![no_main]
#![no_std]
use core::panic::PanicInfo;
#[panic_handler]
fn panic(_panic: &PanicInfo<'_>) -> ! {
loop {}
}
This program contains some things that you won’t see in standard Rust programs:
The #![no_std]
attribute which we have already extensively covered.
The #![no_main]
attribute which means that the program won’t use the standard main
function as its entry point. At the time of writing, Rust’s main
interface makes some assumptions about the environment the program executes in: For example, it assumes the existence of command line arguments, so in general, it’s not appropriate for #![no_std]
programs.
The #[panic_handler]
attribute. The function marked with this attribute defines the behavior of panics, both library-level panics (core::panic!
) and language-level panics (out-of-bounds indexing).
This program doesn’t produce anything useful. In fact, it will produce an empty binary.
For now, we also need to disable test
and brench
environments, they don’t work with your panic implementation. So let’s do this on Cargo.toml
[package]
name = "kernel"
version = "0.1.0"
edition = "2021"
[dependencies]
[[bin]]
name = "kernel8"
path = "src/main.rs
test = false
bench = false
To compile you will need cross-compile, your machine (probably x86-64 will generate) armv8 binary.
$ rustup target add aarch64-unknown-none-softfloat
The aarch64-unknown-none-softfloat
is a target triple, that represents the <architecture>-<vendor>-<sys>-<env>
. So we are generating binary for ARM64, without an operation system and using emulate float by software, without FPU.
Now let’s build the binary
$ cargo build --target aarch64-unknown-none-softfloat --release
Let’s see the final binary:
$ # equivalent to `size target/aarch64-unknown-none-softfloat/release/kernel`
$ cargo size --target aarch64-unknown-none-softfloat --release --bin kernel
Finished release [optimized] target(s) in 0.01s
text data bss dec hex filename
0 0 0 0 0 kernel1
Before linking, the crate contains the panicking symbol.
$ cargo rustc --target aarch64-unknown-none-softfloat -- --emit=obj
$ cargo nm -- target/thumbv7m-none-eabi/debug/deps/app-*.o | grep '[0-9]* [^N] '
00000000 T rust_begin_unwind
However, it’s our starting point. In the next section, we’ll build something useful. But before continuing, let’s set a default build target to avoid having to pass the --target
flag to every Cargo invocation.
$ mkdir .cargo
$ # modify .cargo/config so it has these contents
$ cat .cargo/config
[build]
target = "aarch64-unknown-none-softfloat"
Memory layout
(This is an adaption of The Embedonomicon second chapter, using stirnemann and rust-raspberrypi-OS code)
The Raspberry bootloader is located on EEPROM and I don’t wanna it’s hard to change, so we need to locate the entry point of the application in the same spot them the Linux (0x80000).
So we need to config the final memory layout of programs, we can use linker scripts to have some control over it. The control granularity that linker scripts give us over the layout is at the level of sections. A section is a collection of symbols laid out in contiguous memory. Symbols, in turn, can be data (a static variable), or instructions (a Rust function).
Every symbol has a name assigned by the compiler. As of Rust 1.28 , the names that the Rust compiler assigns to symbols are of the form: _ZN5krate6module8function17he1dfc17c86fe16daE
, which demangles to krate::module::function::he1dfc17c86fe16da
where krate::module::function
is the path of the function or variable and he1dfc17c86fe16da
is some sort of hash. The Rust compiler will place each symbol into its own unique section; for example the symbol mentioned before will be placed in a section named .text._ZN5krate6module8function17he1dfc17c86fe16daE
.
These compiler-generated symbols and section names are not guaranteed to remain constant across different releases of the Rust compiler. However, the language lets us control symbol names and section placement via these attributes:
#[export_name = "foo"]
sets the symbol name tofoo
.#[no_mangle]
means: use the function or variable name (not its full path) as its symbol name.#[no_mangle] fn bar()
will produce a symbol namedbar
.#[link_section = ".bar"]
places the symbol in a section named.bar
.
With these attributes, we can expose a stable ABI of the program and use it in the linker script.
Rust side
So let’s write our kernel entry point, I’m calling this _start
as the C runtime does. This is the function that the device will execute after a system reset, or after it powers up for the first time. The reset handler is always the first stack frame in the hardware call stack; returning from it is undefined
behavior as there’s no other stack frame to return to. We can enforce that the reset handler never
returns by making it a divergent function, which is a function with signature fn(/* .. */) -> !
. and using an infinite loop.
#[no_mangle]
#[link_section = ".text._start"]
pub extern "C" fn _start() -> ! {
loop {}
}
The hardware expects a certain format here, to which we adhere by using extern "C"
to tell the
compiler to lower the function using the C ABI, instead of the Rust ABI, which is unstable.
To refer to the reset handler from the linker script, we need them to have a stable symbol name so we use #[no_mangle]
. We need fine control over the location of _start
, so we
place it in a known section, .text._start
. The exact location of the rest of the code isn’t important so we just stick to the default compiler generated section.
The linker will ignore symbols with internal linkage (also known as internal symbols) while traversing the list of input object files, so we need our symbol to have external linkage. The only way to make a symbol external in Rust is to make its corresponding item public (pub
) and reachable (no private module between the item and the root of the crate).
The linker script side
A minimal linker script that places the vector table in the correct location is shown below. Let’s walk through it.
$ cat linker.ld
SECTIONS
{
. = 0x80000;
.text :
{
*(.text._start)
*(.text*)
}
/DISCARD/ :
{
*(.ARM.exidx .ARM.exidx.*);
}
}
Putting it all together
Now we can link the application.
We have to tweak the linker process to make it use our linker script. This is done by passing the -C link-arg
flag to rustc
. This can be done with cargo-rustc
or cargo-build
.
IMPORTANT: Make sure you have the .cargo/config
file that was added at the
end of the last section before running this command.
Using the cargo-rustc
sub-command:
$ cargo rustc -- -C link-arg=-Tlinker.ld
Or you can set the rust flags in .cargo/config.toml
and continue using the cargo-build
sub-command.
# modify .cargo/config so it has these contents
$ cat .cargo/config.toml
[target.'cfg(all(target_arch = "aarch64", target_os = "none"))']
rustflags = ["-C", "link-arg=-Tlinker.ld"]
[build]
target = "thumbv7m-none-eabi"
The [target.'cfg(all(target_arch = "aarch64", target_os = "none"))']
part says that these flags will only be used when cross compiling to any target with with aarch64
architecture and no OS.
Inspecting it
Now let’s use cargo-binutils to inspect the output binary and confirm the memory layout looks the way we want. For reference, here’s the complete Rust program:
#![no_main]
#![no_std]
use core::panic::PanicInfo;
#[panic_handler]
fn panic(_panic: &PanicInfo<'_>) -> ! {
loop {}
}
#[no_mangle]
#[link_section = ".text._start"]
pub extern "C" fn _start() -> ! {
loop {}
}
First, install cargo-binutils
with cargo.
$ cargo install cargo-binutils
$ rustup component add llvm-tools-previewrustup component add llvm-tools-preview
Then, use the object dump to read the binary. The cargo-binutils
is smart and looks to rebuild your code if necessary and also sees the target on .cargo/config.toml
.
$ cargo objdump -- -d --no-show-raw-insn
$ cargo objdump --bin kernel8 -- -d --no-show-raw-insn
Compiling kernel v0.1.0 (/home/guiss/p/labproc/tutorial/kernel)
Finished dev [unoptimized + debuginfo] target(s) in 0.04s
kernel8: file format elf64-littleaarch64
Disassembly of section .text:
0000000000080000 <_start>:
80000: b 0x80004 <_start+0x4>
80004: b 0x80004 <_start+0x4>
We can see that your _start
section is located on 0x0000000000080000
and implements the loop on position 0x80004
(if you run with --release
it’s only one instruction)
LED blinking
Addresses
The arm peripherals are mapped in memory, all I/O configs are made by writing and reading a specific address on memory, we can find this information on BCM2711 peripherals datasheet.
The datasheets use the Legacy Master addresses, but the Raspberry use the “Low Peripheral” mode by default. So we need to translate the addresses 0x73nn_nnnn
to 0xFEnn_nnnn
.
GPIO
We want to blink the green led built-in on board, that is pin 42 (this could be found on Linux device tree), so we need to control the GPIO. Let’s check session 5.2 Register View on the datasheet.
Calculating the offset and replacing the low peripheral address:
0xFE20_0000
: GPIO Function Select 40xFE20_0020
: GPIO Pin Output Set 00xFE20_002c
: GPIO Pin Output Clear
Before all, we need to config what type of our pin, if is output, or input, etc. For this, we need to write 0b001
on bits 8..6
on GPFSELF4
, as we can see on the datasheet.
And writing 1
in GPSET1[15]
we can enable the GPIO42 output, and by writing 1
in GPCLR1[15]
we disable the GPIO42 output.
Show me the code!
With that spoken, finally, we could write our blink code.
#![no_main]
use core::panic::PanicInfo;
use core::arch;
#[panic_handler]
fn panic(_panic: &PanicInfo<'_>) -> ! {
loop {}
}
// https://datasheets.raspberrypi.com/bcm2711/bcm2711-peripherals.pdf
const START: usize = 0xFE00_0000; // Based on section 1.2 of manual
const GPIO_OFFSET: usize = 0x0020_0000; // Based on section 5.2 of manual, also check that
// 0x7enn_nnnn is mapped to 0xFEnn_nnnn
const GPIO_START: usize = START + GPIO_OFFSET;
#[no_mangle]
#[link_section = ".text._start"]
pub extern "C" fn _start() -> ! {
unsafe {
// Set GPIO 42 with GPFSEL4[8:6] = 0b001
core::ptr::write_volatile((GPIO_START + 0x10) as *mut u32, 0b001 << 6);
loop {
// Set GPIO 42 to HIGH with GPSET1
core::ptr::write_volatile((GPIO_START + 0x20) as *mut u32, 1 << (42-32));
// Wait
for _ in 0..500000 {
arch::asm!("nop");
}
// Set GPIO 42 to LOW with GPCLR[42-32]
core::ptr::write_volatile((GPIO_START + 0x02c) as *mut u32, 1 << (42-32));
// Wait
for _ in 0..500000 {
arch::asm!("nop");
}
}
}
}
Here we need the write_volatile to assert that compiler optimization will not remove your instruction (you could read more about that in the embedded rust book).
Programming the raspberry
Before seeing some light we first need to write our code on the SD to put in the Raspberry.
Format the SD card
IMPORTANT: If you don’t know what you are doing, I deeply recommend installing Raspbian with the Raspberry tool and only replacing the kernel. Formatting the wrong partition could delete all your data.
For those who keep reading let’s start. First, you need to connect your SD card to your pc, then you need to find the device. I like to use fdisk -l
and search for new devices, in my case is /dev/sda
which already has RaspebianOS partitions. So we need to format and create the new FAT32 partitions.
$ sudo fdisk /dev/<device>
Command (m for help): d # delete existent partitions
Command (m for help): d # keep deleting ...
Command (m for help): n
Partition type
p primary (0 primary, 0 extended, 4 free)
e extended (container for logical partitions)
Select (default p): p
Partition number (1-4, default 1): 1
First sector (2048-124735487, default 2048): <enter>
Last sector, +/-sectors or +/-size{K,M,G,T,P} (2048-124735487, default 124735487): <enter>
# change partition to FAT32
Command (m for help): t
Hex code or alias (type L to list all): 0x0c
Changed type of partition 'Linux' to 'W95 FAT32 (LBA)'.
Command (m for help): w
$ sudo mkfs.vfat -F32 /dev/<device>
Setup default files
$ sudo mkdir -p /mnt/sdcard
$ sudo mount /dev/<device>1 /mnt/sdcard
$ sudo curl -L -o /mnt/sdcard/fixup4.dat https://github.com/raspberrypi/firmware/raw/master/boot/fixup4.dat
$ sudo curl -L -o /mnt/sdcard/start4.elf https://github.com/raspberrypi/firmware/raw/master/boot/start4.elf
$ sudo curl -L -o /mnt/sdcard/bcm2711-rpi-4-b.dtb https://github.com/raspberrypi/firmware/raw/master/boot/bcm2711-rpi-4-b.dtb
$ sudo touch /mnt/sdcard/config.txt
We are keeping config.txt
empty by now, as we are naming our kernel as kernel8.img
the bootloader already knows that is arm64bits, but if you are paranoid you can put arm_64bit=1
Copy our code to raspberry
The current binary is in ELF format and we need to flatten it to a raw binary. We can do that with objcopy
.
$ cargo objcopy --release -- -O binary kernel8.img
$ sudo cp ./kernel8.img /mnt/sdcard/kernel8.img
$ sudo umount /mnt/sdcard
And the SD card is ready to be plugged in the Raspi. And you could see the light!
We ❤️ cargo&crates
Let’s install some dependencies to make our life easier and make the C/C++ people jealous.
We could use the aarch64 processor crate to run specific asm instructions and read/write CPU registers (like arm co-processors).
$ cargo add aarch64-cpu
$ cargo add tock-register # we will need this later to read register value
diff --git a/src/main.rs b/src/main.rs
index db65813..9737a2a 100644
--- a/src/main.rs
+++ b/src/main.rs
@@ -2,7 +2,7 @@
#![no_std]
use core::panic::PanicInfo;
-use core::arch;
+use aarch64_cpu::asm::nop;
#[panic_handler]
fn panic(_panic: &PanicInfo<'_>) -> ! {
@@ -27,7 +27,7 @@ pub extern "C" fn _start() -> ! {
// Wait
for _ in 0..500000 {
- arch::asm!("nop");
+ nop();
}
// Set GPIO 42 to LOW with GPCLR[42-32]
@@ -35,7 +35,7 @@ pub extern "C" fn _start() -> ! {
// Wait
for _ in 0..500000 {
- arch::asm!("nop");
+ nop();
}
}
}
Also, we could use a crate to implement the panic
$ cargo add panic-halt
diff --git a/src/main.rs b/src/main.rs
index 9737a2a..82d0f84 100644
--- a/src/main.rs
+++ b/src/main.rs
@@ -1,13 +1,9 @@
#![no_main]
#![no_std]
-use core::panic::PanicInfo;
use aarch64_cpu::asm::nop;
-#[panic_handler]
-fn panic(_panic: &PanicInfo<'_>) -> ! {
- loop {}
-}
+use panic_halt as _;
A little multi-core detail
When we start the CPU, all four cores will execute the same code, so now our code is written 4 times on the GPIO in parallel.
To fix that we need to check what core we are executing (reding MPIDR_EL1 register) and half 3 cores. In this case, we choose the CPU0 to run our core, the other cores will sleep (with wfe
instruction).
diff --git a/src/main.rs b/src/main.rs
index 82d0f84..ea331ef 100644
--- a/src/main.rs
+++ b/src/main.rs
@@ -4,6 +4,7 @@
use aarch64_cpu::asm::nop;
use panic_halt as _;
+use tock_registers::interfaces::Readable;
// https://datasheets.raspberrypi.com/bcm2711/bcm2711-peripherals.pdf
const START: usize = 0xFE00_0000; // Based on section 1.2 of manual
@@ -14,6 +15,10 @@ const GPIO_START: usize = START + GPIO_OFFSET;
#[no_mangle]
#[link_section = ".text._start"]
pub extern "C" fn _start() -> ! {
+ if get_cpu_id() != 0 {
+ aarch64_cpu::asm::wfe();
+ }
+
unsafe {
// Set GPIO 42 with GPFSEL4[8:6] = 0b001
core::ptr::write_volatile((GPIO_START + 0x10) as *mut u32, 0b001 << 6);
@@ -36,3 +41,7 @@ pub extern "C" fn _start() -> ! {
}
}
}
+
+fn get_cpu_id() -> u64 {
+ aarch64_cpu::registers::MPIDR_EL1.get() & 0b11
+}
Conclusion
Thank you for the amazing class that gave me the push to get here.
The code is avaliable on this repo.
If you have some trouble or recommendation please write-me on twitter or email.
References:
- Bare metal Raspberry PI blink In RUST - Stirnemann
- rust-embedded/rust-raspberrypi-OS-tutorials
- BCM2711 ARM Peripherals Datasheet
- Bare metal LED access on RPi 4 - Stackexchange
- The Embedonomicon
- The Embedded Rust Book
- rust-embedded/cortex-m-quickstart
- Raspberry throbleshooting booting partition
- bru4bas/labproc/raspberry/led (btw my professor)