2023-06-12

let’s boot

Let’s boot some x86 code with qemu.

I rediscovered osdev.org after 13 years of abstinence¹² and started off with the plain-assembly babystep. Then I tried the C-based bare-bones kernel for a bit more convenience.

Minimal working example
Babystep - plain assembler
Bare-Bones - C
Findings
- Environment
- Memory Layout
Resources
Vision

Minimal working example

This nasm code is all you need to make an infinite blinking cursor:

[ORG 0x7c00]
hang:
    jmp hang

times 510-($-$$) db 0
db 0x55
db 0xAA

What is going on?

[ORG 0x7c00]: tells nasm that the code will run at address 0x7c00, which is where the BIOS loads the boot sector; all label addresses are calculated relative to this origin
hang: jmp hang: plain assembly instructing the CPU to loop forever by jumping to the jump instruction itself
times 510-($-$$) db 0: the boot sector must be exactly 512 bytes in size, so this pads it with zeroes
- pad to 510 bytes instead of 512 to leave two bytes free for the magic number
- $-$$ is an expression calculating <current position> - <start of section>, resulting in the size of the bootloader so far
- db 0 emits a single byte set to zero
db 0x55 and db 0xAA: the magic number in the last two bytes (0xAA55 when read as a little-endian 16-bit word), which tells the BIOS that this is a valid boot sector
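
To make the layout concrete, here is a minimal sketch in C that builds the same 512-byte image by hand. It assumes nasm encodes jmp hang as the two-byte short jump EB FE; the function name build_boot_sector is made up for this illustration.

```c
#include <stdint.h>
#include <string.h>

#define SECTOR_SIZE 512

/* Hypothetical helper: fills `sector` with the bytes the nasm listing
 * is expected to produce (assumption: short jump encoding EB FE). */
void build_boot_sector(uint8_t sector[SECTOR_SIZE])
{
    /* times 510-($-$$) db 0: everything not set below stays zero */
    memset(sector, 0, SECTOR_SIZE);

    /* hang: jmp hang -> short jump with displacement -2 */
    sector[0] = 0xEB;
    sector[1] = 0xFE;

    /* db 0x55 / db 0xAA: boot signature in the last two bytes */
    sector[510] = 0x55;
    sector[511] = 0xAA;
}
```

Writing these 512 bytes to a file should give an image equivalent to the nasm output; comparing both with a hex dump is a cheap sanity check.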

Babystep - plain assembler

After the minimal working example, I approached the plain-assembler babystep bootloader to get a better insight into what’s going on.

Learnings:
- basic memory alignment
- enter protected mode (code, tutorial)
- write to console without interrupt (code, tutorial)

Bare-Bones - C

It may be asking a lot when operating at such a low level, but some convenience would be nice. C has you covered! At least a bit. The bare-bones bootloader contains only as much assembler as necessary to hand over to C. Basically, a stack must be set up and off we go.

Learnings:
- bridge between assembler and C
- more sophisticated tty
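
The tty part boils down to writing 16-bit cells into the VGA text buffer, which in the standard 80x25 text mode is mapped at physical address 0xB8000. A minimal sketch of that idea follows. The names vga_cell and tty_write are made up here, and the buffer is passed in as a pointer so the same code can be tried on a plain array in a hosted program.

```c
#include <stddef.h>
#include <stdint.h>

#define VGA_WIDTH  80
#define VGA_HEIGHT 25

/* One text-mode cell: low byte = character, high byte = attribute
 * (low nibble foreground colour, high nibble background colour). */
uint16_t vga_cell(unsigned char ch, uint8_t fg, uint8_t bg)
{
    uint8_t attr = (uint8_t)((bg << 4) | (fg & 0x0F));
    return (uint16_t)(ch | ((uint16_t)attr << 8));
}

/* Hypothetical helper: writes `text` into row `row`, starting at column 0,
 * and returns the number of cells written. Text past the row end is dropped. */
size_t tty_write(uint16_t *buffer, size_t row, const char *text,
                 uint8_t fg, uint8_t bg)
{
    size_t col = 0;

    if (row >= VGA_HEIGHT)
        return 0;

    while (text[col] != '\0' && col < VGA_WIDTH) {
        buffer[row * VGA_WIDTH + col] =
            vga_cell((unsigned char)text[col], fg, bg);
        col++;
    }
    return col;
}
```

In a freestanding kernel the buffer would be (uint16_t *)0xB8000; in a hosted test, any array of 80 * 25 uint16_t values works.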

Findings

Environment

Bootloaders aren’t just usual³ assembler code in a hosted environment where an OS provides all kinds of neat stuff like a libc. There is no such thing as ELF or PE, just the bare-metal CPU, which we must satisfy. This includes not only valid assembler code but also things like:

width: is the code assembled for 16, 32 or 64 bit? The CPU starts in 16-bit real mode and expects instructions that match the mode it is currently in.
location: where does the code need to be placed in order to get (or not get) executed? The CPU doesn’t care what is data and what is code; it just jumps to a defined address and interprets the bytes there as instructions.
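
That bytes only become code once the CPU fetches them can be illustrated with the two bytes of the hang loop. The sketch below, with made-up names, treats a byte pair as a short jmp (opcode 0xEB followed by a signed 8-bit displacement) and computes where it would land, assuming the bytes sit at a given address. It ignores the 16-bit wraparound of the instruction pointer in real mode.

```c
#include <stdint.h>

/* Returns 1 and stores the jump target if `bytes` starts with a short jmp
 * (opcode 0xEB), otherwise returns 0 and leaves `target` untouched.
 * `addr` is where the bytes are assumed to be located in memory. */
int short_jmp_target(const uint8_t *bytes, uint32_t addr, uint32_t *target)
{
    if (bytes[0] != 0xEB)
        return 0;

    /* The displacement is relative to the address of the next instruction,
     * which is addr + 2 for this two-byte encoding. */
    int8_t rel = (int8_t)bytes[1];
    *target = addr + 2 + (uint32_t)(int32_t)rel;
    return 1;
}
```

With the bytes EB FE at 0x7c00 the target is 0x7c00 again, which is the infinite loop. The same two bytes stored somewhere the CPU never jumps to are just data.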

Memory Layout

As far as I understand, the assembler file not only contains code that needs to be aligned, but is also almost a one-to-one image of the program as it actually sits in memory. Because of this, assembler directives like times 510-($-$$) db 0 have to be placed at the end so that the image is laid out correctly when it is loaded into memory.

While researching x86 bootloaders online, I came across two approaches:

manual alignment of the sections via labels and jumps
a linker config that maps which section goes where

It looks like the latter approach is mostly used for serious projects, whereas the former is just a quick-and-dirty workaround.

Resources

In addition to osdev.org, I also found a bunch of other interesting resources along the way.

General:

Tutorials:

Special:

Linux inside handbook
first linux
seabios boot, which qemu uses by default for booting

Vision

practice assembly using exercism
implement a basic shell as entry point for various sub programs

¹ Credits to Manawyrm for the inspiration.
² I already tossed some bits (sorry, German only) 13 years ago, but I felt too gatekept and didn’t know Linux.
³ I couldn’t write that without a laugh. As if there were such a thing as usual assembler…