Diving into toolchains

Frederic Cambus June 08, 2021 [Compilers] [Toolchains]

I've been wanting to learn more about compilers and toolchains in general for a while now. In June 2016, I asked about recommended readings on lexers and parsers on Twitter. However, I have to confess that I didn't go forward with reading the Dragon Book.

Instead, I got involved as a developer in the OpenBSD and NetBSD projects, and witnessing the evolution of toolchains within those systems played a big role in maintaining my interest and fascination in the topic. In retrospect, it is now apparent that the work I did on porting and packaging software for those systems really helped put in perspective how the different parts of a toolchain interact to produce binaries.

Approximately one year ago, I asked again on Twitter, this time looking for anyone who had worked on compilers and toolchains professionally, to get real-world advice on how to gain expertise in the field. I got several interesting answers and started to collect and read more resources on the topic. Some of the links I collected ended up on toolchains.net, a collection of toolchain resources which I put online in February.

But the answer that resonated the most with me was Howard's advice to learn by doing. Because I seem to be the kind of person who needs to see some concrete results in order to stay motivated, that's exactly what I decided to do.

I started by doing some cleanups in the binutils package in NetBSD's pkgsrc, which resulted in a series of commits:

2020-12-20  ca38479  Remove now unneeded OpenBSD specific checks in gold
2020-12-15  7263eee  Add missing TEST_DEPENDS on devel/dejagnu
2020-12-14  b1637da  Don't use hard-coded -ldl in the gold test suite.
2020-12-13  146def2  Remove apparently unneeded patch for libiberty
2020-12-12  6b347a9  Remove CFLAGS.OpenBSD+= -Wno-bounded directive
2020-12-11  f53b2d8  Remove now unneeded patch dropping hidden symbols warning
2020-12-10  b037380  Enable building gold on Linux
2020-12-03  75d00bc  Remove now unneeded workaround for binutils 2.24
2020-12-03  adfee30  Drop all Bitrig related patches

Meanwhile, I also got the opportunity to update our package and apply security fixes:

2021-02-11  761e000  Update to binutils 2.36.1
2021-01-27  ba983e5  Update to binutils 2.36
2021-01-07  7aef5c0  Add upstream fixes for CVE-2020-35448
2020-12-06  99fdf39  Update to binutils 2.35.1

I eventually took maintainership of binutils in pkgsrc.

Building it repeatedly with different compilers exposed different warnings, and I've also run builds through Clang's static analyzer.

All of this resulted in the opportunity to contribute to binutils itself:

2021-04-14  5f47741  Remove unneeded tests for definitions of NT_NETBSDCORE values
2021-04-12  0fa29e2  Remove now unneeded #ifdef check for NT_NETBSD_PAX
2021-03-12  be3b926  Add values for NetBSD .note.netbsd.ident notes (PaX)
2021-01-26  e37709f  Fix a double free in objcopy's memory freeing code

Most recently, I also wrote a couple of blog posts on the topic.

And the journey continues. I'm following a different path from traditional compiler courses, which start with lexers and parsers: I'm taking the curriculum in reverse, starting from binaries instead. I will be focusing on the final stages of the pipeline for now: assembling assembly code into machine code and producing binaries.

My next steps are to read the full ELF specification, followed by the Linkers and Loaders book, and then refresh my assembly skills. My favorite course at university was the computer architecture one, especially its MIPS assembly part, so I'm looking to revisit the subject, but with ARM64 assembly this time.
