let’s boot
Let’s boot some x86 code with qemu.
I rediscovered osdev.org after 13 years of abstinence [1][2] and started off with the plain-assembly babystep tutorial. Then I tried the C-based bare-bones kernel for a bit more convenience.
Minimal working example
This nasm code is all you need to make an infinite blinking cursor:
[ORG 0x7c00]
hang:
jmp hang
times 510-($-$$) db 0
db 0x55
db 0xAA
What is going on?
- [ORG 0x7c00]: tells the assembler that the code will run at address 0x7c00, which is where the BIOS loads the boot sector
- hang: jmp hang: plain assembly instructing the CPU to loop forever by jumping to the jump instruction itself
- times 510-($-$$) db 0: the bootloader needs to be exactly 512 bytes in size. This pads it with zeroes.
  - it pads to 510 bytes instead of 512 to leave two bytes free for the magic number
  - $-$$ is an expression calculating <current position> - <start of section>, resulting in the size of the bootloader so far
  - db 0 is just a static byte set to zero
- db 0x55, db 0xAA: the magic number (0x55 0xAA) telling the BIOS that this is a valid bootloader
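To make the layout described above tangible, here is a minimal host-side sketch in C that fills a 512-byte buffer the same way the NASM source does. The function names are made up for illustration, and the two code bytes assume that `jmp hang` is assembled as a short jump to itself (EB FE); other encodings are possible.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SECTOR_SIZE 512

/* Fills `sector` with the layout of the minimal example.
   Returns the number of padding bytes, i.e. the value of 510-($-$$). */
size_t build_boot_sector(uint8_t sector[SECTOR_SIZE])
{
    /* Assumption: `jmp hang` encoded as a short jump to itself. */
    static const uint8_t code[] = { 0xEB, 0xFE };
    size_t padding = 510 - sizeof code;

    memcpy(sector, code, sizeof code);          /* hang: jmp hang        */
    memset(sector + sizeof code, 0, padding);   /* times 510-($-$$) db 0 */
    sector[510] = 0x55;                         /* db 0x55               */
    sector[511] = 0xAA;                         /* db 0xAA               */
    return padding;
}

/* The signature check a BIOS typically performs on the loaded sector. */
int has_boot_signature(const uint8_t sector[SECTOR_SIZE])
{
    return sector[510] == 0x55 && sector[511] == 0xAA;
}
```

With two bytes of code, the padding comes out as 508 bytes. Writing the buffer to a file should give an image with the same layout as the assembled NASM output.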
Babystep - plain assembler
After the minimal working example, I approached the plain-assembler babystep bootloader to get better insight into what’s going on.
Learnings:
- basic memory alignment
- enter protected mode (code, tutorial)
- write to console without interrupts (code, tutorial)
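Entering protected mode requires a descriptor table (GDT) describing the memory segments. As a rough sketch of what such an entry looks like, the following C function packs one 8-byte segment descriptor. The function name is hypothetical, and the flat 4 GiB segments used in the usage note are an assumption based on what tutorials commonly set up, not necessarily what the babystep code does.

```c
#include <stdint.h>

/* Packs one 8-byte segment descriptor.
   `limit` uses 20 bits, `flags` is the upper nibble (granularity, size). */
uint64_t gdt_descriptor(uint32_t base, uint32_t limit,
                        uint8_t access, uint8_t flags)
{
    uint64_t d = 0;
    d |= (uint64_t)(limit & 0xFFFF);              /* limit bits 0..15  */
    d |= (uint64_t)(base & 0xFFFFFF) << 16;       /* base bits 0..23   */
    d |= (uint64_t)access << 40;                  /* access byte       */
    d |= (uint64_t)((limit >> 16) & 0xF) << 48;   /* limit bits 16..19 */
    d |= (uint64_t)(flags & 0xF) << 52;           /* flags nibble      */
    d |= (uint64_t)((base >> 24) & 0xFF) << 56;   /* base bits 24..31  */
    return d;
}
```

For a flat code segment, `gdt_descriptor(0, 0xFFFFF, 0x9A, 0xC)` yields `0x00CF9A000000FFFF`, and the matching data segment with access byte `0x92` yields `0x00CF92000000FFFF`. The odd splitting of base and limit is why hand-written GDTs in assembly look so cryptic.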
Bare-Bones - C
It might be asking a lot when operating at such a low level, but some convenience would be nice. C has got you covered! At least a bit. The bare-bones bootloader contains only as much assembler as necessary to get into C. Basically, a stack must be set up and off we go.
Learnings:
- bridge between assembler and C
- more sophisticated tty
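Once the assembler stub has set up a stack and called into C, a slightly smarter tty is within reach. The following is a minimal sketch of such a text terminal, not the tutorial's actual code: the struct and function names are hypothetical, and it assumes the usual 80x25 VGA text mode where each cell is a 16-bit value (character in the low byte, color attribute in the high byte). On real hardware `buffer` would point to 0xB8000. Here it is a parameter, so the code can also be tried out on the host.

```c
#include <stddef.h>
#include <stdint.h>

#define VGA_WIDTH  80
#define VGA_HEIGHT 25

/* Hypothetical terminal state: cell buffer, cursor position, color. */
struct terminal {
    uint16_t *buffer;
    size_t row, column;
    uint8_t color;
};

/* One text cell: character in the low byte, attribute in the high byte. */
static uint16_t vga_entry(unsigned char c, uint8_t color)
{
    return (uint16_t)(c | ((uint16_t)color << 8));
}

/* Moves the cursor to the start of the next row, wrapping to the top. */
static void terminal_newline(struct terminal *t)
{
    t->column = 0;
    if (++t->row == VGA_HEIGHT)
        t->row = 0;
}

void terminal_putchar(struct terminal *t, char c)
{
    if (c == '\n') {
        terminal_newline(t);
        return;
    }
    t->buffer[t->row * VGA_WIDTH + t->column] =
        vga_entry((unsigned char)c, t->color);
    if (++t->column == VGA_WIDTH)
        terminal_newline(t);
}

void terminal_write(struct terminal *t, const char *s)
{
    while (*s)
        terminal_putchar(t, *s++);
}
```

This sketch wraps around to the top row instead of scrolling, which keeps it short. A real tty would probably shift the rows up instead.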
Findings
Environment
Bootloaders aren’t just the usual [3] assembler code running in a hosted environment where an OS provides all kinds of neat stuff like a libc. There is no such thing as ELF or PE, just the bare-metal CPU, which we must satisfy. This includes not only valid assembler code but also things like:
- width: 8, 16 or 32 bit instruction width?
- location: where does the code need to be placed in order to get (not) executed? The CPU itself doesn’t care what is data and what is code; it just jumps to a defined address and interprets the bytes there as instructions.
Memory Layout
In my understanding, the assembler file not only contains code that needs to be aligned, but is also almost the equivalent of the program as it actually sits in memory.
Because of this, padding like times 510-($-$$) db 0 has to come at the end so that everything is loaded into memory at the right place.
While researching x86 bootloaders online, two approaches emerged:
- manual alignment of the sections via labels and jumps
- a linker config that maps which section goes where
It looks like the latter approach is mostly used for serious projects, whereas the former is just a quick-and-dirty workaround.
Resources
In addition to osdev.org, I found a bunch of other interesting resources along the way.
General:
Tutorials:
- Babystep tutorial with pure assembler
- Example bootloader into c
- Bootloader tutorial
- NASM Tutorial
- German tutorial
Special:
- Linux inside handbook
- first linux
- SeaBIOS boot, which QEMU uses by default for booting
Vision
- practice assembly using exercism
- implement a basic shell as entry point for various sub programs
[1] Credits to Manawyrm for the inspiration
[2] I already tossed some bits around (sorry, German only) 13 years ago, but I felt too gatekept and didn’t know Linux.
[3] I couldn’t write that without a laugh. As if there were such a thing as usual assembler…