Writing a compiler backend for my programming language in 🦀

Heads up!

Work in progress. I sometimes stream the development here.

Note

Yes, I don't like LLVM. I know that Cranelift exists, but I want to write my own. You can see the development here.

TODO

Table of contents

This all started when I decided to make my own language. Again.
I wanted to make a Rust-like language with an effect system and some more features. You can see it on GitHub.
I've tried many approaches, but settled on this pipeline: tokenizing text → parsing it into an AST → building an Intermediate Representation (IR) from this AST → converting this IR into raw bytes → packaging those bytes into an executable format like ELF or PE.
The first problem I encountered with the backend itself was compiling to x86_64. That's the first architecture I tried to implement, and I already got stuck. Turns out that the x86_64 instruction set is really difficult to encode (THIS IS ALL ABOUT BACKWARDS COMPATIBILITY!!!). It has some weird things like the REX prefix (register extensions) and some shortcuts. (Why does add AL, 5 take 2 bytes (encoded as 04 05) when add BL, 5 takes 3 bytes (encoded as 80 c3 05)?)
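To make the AL/BL difference concrete, here is a minimal sketch of how an encoder could pick between the two forms. The names Reg8 and encode_add_r8_imm8 are made up for this example and are not taken from my backend. It only covers the four low 8-bit registers that need no REX prefix.

```rust
/// Low 8-bit registers that can be addressed without a REX prefix.
/// The discriminant is the 3-bit register number used in ModRM.
/// (Hypothetical type, for illustration only.)
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
#[allow(dead_code)]
enum Reg8 {
    Al = 0,
    Cl = 1,
    Dl = 2,
    Bl = 3,
}

/// Opcode extension (the `reg` field of ModRM) that selects ADD
/// within the 0x80 opcode group.
const ADD_EXT: u8 = 0;

/// Encode `add reg, imm8` into machine code bytes.
fn encode_add_r8_imm8(reg: Reg8, imm: u8) -> Vec<u8> {
    match reg {
        // The accumulator has its own short opcode: 04 ib.
        Reg8::Al => vec![0x04, imm],
        // Everything else uses the generic form: 80 /0 ib.
        // ModRM = mod (11 = register direct) | reg (opcode extension) | rm (register).
        _ => {
            let modrm = (0b11 << 6) | (ADD_EXT << 3) | (reg as u8);
            vec![0x80, modrm, imm]
        }
    }
}
```

With this sketch, encode_add_r8_imm8(Reg8::Al, 5) yields the bytes 04 05, and encode_add_r8_imm8(Reg8::Bl, 5) yields 80 c3 05, because BL is register number 3 and 0b11_000_011 is 0xc3. The generic form would also be valid for AL (80 c0 05); the short form just saves a byte.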
I found this great page as a reference for x86_64 opcodes: X86-64 Instruction Encoding (the OSDev Wiki contains the same page). This table also helped me a lot.
Once I had taken the first steps with x86_64, I had to pack the output into some sort of executable, which turned out to be a big challenge. I couldn't find any Rust crates that write ELF executables, only ones that write relocatable object files, and I want to integrate a linker into my backend. This video helped me A LOT with reading/writing ELF files. But be aware: it has some wrong types! For example (the only one I found), the sizes in the program header. Check the wiki for that!
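For reference, here is a sketch of the 64-bit program header as laid out in the ELF-64 specification, assuming the "sizes" above mean the p_filesz and p_memsz fields. In ELF64 both are 64-bit values, and p_flags comes right after p_type (in ELF32 it sits near the end). The struct name and the to_le_bytes helper are made up for this example; this is not the orecc-elf API.

```rust
/// ELF64 program header (Elf64_Phdr), 56 bytes on disk.
/// (Hypothetical Rust mirror of the spec layout, for illustration only.)
#[derive(Clone, Copy, Debug)]
struct ProgramHeader64 {
    p_type: u32,   // segment type, e.g. 1 = PT_LOAD
    p_flags: u32,  // permissions: 1 = X, 2 = W, 4 = R
    p_offset: u64, // offset of the segment in the file
    p_vaddr: u64,  // virtual address in memory
    p_paddr: u64,  // physical address (mostly unused)
    p_filesz: u64, // size of the segment in the file
    p_memsz: u64,  // size of the segment in memory
    p_align: u64,  // alignment
}

impl ProgramHeader64 {
    /// Size of one serialized ELF64 program header entry.
    const SIZE: usize = 56;

    /// Serialize the header field by field for a little-endian target.
    fn to_le_bytes(&self) -> Vec<u8> {
        let mut out = Vec::with_capacity(Self::SIZE);
        out.extend_from_slice(&self.p_type.to_le_bytes());
        out.extend_from_slice(&self.p_flags.to_le_bytes());
        out.extend_from_slice(&self.p_offset.to_le_bytes());
        out.extend_from_slice(&self.p_vaddr.to_le_bytes());
        out.extend_from_slice(&self.p_paddr.to_le_bytes());
        out.extend_from_slice(&self.p_filesz.to_le_bytes());
        out.extend_from_slice(&self.p_memsz.to_le_bytes());
        out.extend_from_slice(&self.p_align.to_le_bytes());
        out
    }
}
```

Writing the fields out one by one, instead of transmuting the struct, keeps the byte layout explicit and independent of how Rust decides to lay out the struct in memory.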
Once I had implemented minimal functionality for my ELF crate, I tried to compile a simple program and got a segmentation fault. This video helped me finally figure it out. Essentially, the entry point has some strict rules. Finally, I released this crate: orecc-elf.