Frederic Cambus

Blog · Git · Contact

Diving into toolchains


I've been wanting to learn more about compilers and toolchains in general for a while now. In June 2016, I asked about recommended readings on lexers and parsers on Twitter. However, I have to confess that I didn't go forward with reading the Dragon Book.

Instead, I got involved as a developer in the OpenBSD and NetBSD projects, and witnessing the evolution of toolchains within those systems played a big role in maintaining my interest and fascination in the topic. In retrospective, it now becomes apparent that the work I did on porting and packaging software for those systems really helped to put in perspective how the different parts of the toolchains interact together to produce binaries.

Approximately one year ago, I asked again on Twitter whether I knew anyone having worked on compilers and toolchains professionally to get real world advice on how to gain expertise in the field. I got several interesting answers and started to collect and read more resources on the topic. Some of the links I collected ended up on, a collection of toolchain resources which I put online in February.

But the answer that resonate the most with me was Howard's advice to learn by doing. Because I seem to be the kind of person which need to see some concrete results in order to keep motivated, that's exactly what I decided to do.

I started by doing some cleanups in the binutils package in NetBSD's pkgsrc, which resulted in a series of commits:

Meanwhile, I also got the opportunity to update our package and apply security fixes:

I eventually took maintainership of binutils in Pkgsrc.

Building it repeatedly with different compilers exposed different warnings, and I've also run builds through Clang's static analyzer.

All of this resulted in the opportunity to contribute to binutils itself:

Most recently, I also wrote a couple of blog posts on the topic:

And the journey continues. I'm following a different path from traditional compiler courses starting with lexers and parsers, and doing the opposite curriculum somehow, starting from binaries instead. I will be focusing on the final stages of the pipeline for now: compiling assembly to machine code and producing binaries.

My next steps are to read the full ELF specification, followed by the Linkers and Loader book, and then refresh my ASM skills. My favorite course at university was the computer architecture one and especially its MIPS assembly part, so I'm looking to revisit the subject but with ARM64 assembly this time.