LLVM was imported into the OpenBSD ports tree back in 2008, and happily lived there for a long while before being imported into the source tree at the g2k16 hackathon in 2016. I previously wrote about this in “The state of toolchains in OpenBSD” last year.
As mentioned in my previous article, we do not use the upstream build system to build LLVM in the base system, but hand-written BSD Makefiles. Importing CMake into the base system was not an option, because of the size of the project and the large dependency chain it requires for building. As a drawback, the build is slower than it could be, were we able to take advantage of a more modern build system.
Nowadays, Clang is the default compiler on the amd64, arm64, armv7, i386, macppc, octeon, powerpc64, and riscv64 platforms. It is also available in the sparc64 base system.
But then, why do we still need LLVM in the ports tree? As an aside, for those wondering why we need a compiler in the base system in the first place, Julio Merino wrote about this in his “Compilers in the (BSD) base system” post.
In the OpenBSD base system, we only build the LLVM backend matching the architecture being built, so on amd64 and i386 we build only LLVM’s X86 backend. The mapping we do between OpenBSD’s MACHINE_ARCH and LLVM_ARCH values can be found in gnu/usr.bin/clang/Makefile.arch.
Note that we also build the AMDGPU backend on platforms requiring it.
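To make the idea of such a mapping concrete, here is a minimal shell sketch. It is not the contents of Makefile.arch: the function name llvm_backend_for is hypothetical, and apart from the amd64 and i386 entries mentioned above, the pairings are my assumption based on LLVM's usual backend names.

```sh
#!/bin/sh
# Illustrative sketch only, NOT the logic of gnu/usr.bin/clang/Makefile.arch.
# Maps an OpenBSD MACHINE_ARCH value to the LLVM backend that would serve it.
# llvm_backend_for is a hypothetical helper name.
llvm_backend_for() {
    case "$1" in
        amd64|i386)        echo X86 ;;      # stated in the text above
        aarch64)           echo AArch64 ;;  # assumed pairing
        arm)               echo ARM ;;      # assumed pairing
        mips64|mips64el)   echo Mips ;;     # assumed pairing
        powerpc|powerpc64) echo PowerPC ;;  # assumed pairing
        riscv64)           echo RISCV ;;    # assumed pairing
        sparc64)           echo Sparc ;;    # assumed pairing
        *)
            # Unknown architecture: report on stderr and fail.
            echo "unknown MACHINE_ARCH: $1" >&2
            return 1
            ;;
    esac
}
```

For example, `llvm_backend_for amd64` prints X86, while an unlisted architecture makes the function return a non-zero status.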
On an amd64 machine, the registered targets for the base compiler are:
$ clang --print-targets
  Registered Targets:
    amdgcn - AMD GCN GPUs
    r600 - AMD GPUs HD2XXX-HD6XXX
    x86 - 32-bit X86: Pentium-Pro and above
    x86-64 - 64-bit X86: EM64T and AMD64
And the ones for Clang installed from ports are:
$ clang-13 --print-targets
  Registered Targets:
    aarch64 - AArch64 (little endian)
    aarch64_32 - AArch64 (little endian ILP32)
    aarch64_be - AArch64 (big endian)
    amdgcn - AMD GCN GPUs
    arm - ARM
    arm64 - ARM64 (little endian)
    arm64_32 - ARM64 (little endian ILP32)
    armeb - ARM (big endian)
    avr - Atmel AVR Microcontroller
    bpf - BPF (host endian)
    bpfeb - BPF (big endian)
    bpfel - BPF (little endian)
    hexagon - Hexagon
    lanai - Lanai
    mips - MIPS (32-bit big endian)
    mips64 - MIPS (64-bit big endian)
    mips64el - MIPS (64-bit little endian)
    mipsel - MIPS (32-bit little endian)
    msp430 - MSP430 [experimental]
    nvptx - NVIDIA PTX 32-bit
    nvptx64 - NVIDIA PTX 64-bit
    ppc32 - PowerPC 32
    ppc32le - PowerPC 32 LE
    ppc64 - PowerPC 64
    ppc64le - PowerPC 64 LE
    r600 - AMD GPUs HD2XXX-HD6XXX
    riscv32 - 32-bit RISC-V
    riscv64 - 64-bit RISC-V
    sparc - Sparc
    sparcel - Sparc LE
    sparcv9 - Sparc V9
    systemz - SystemZ
    thumb - Thumb
    thumbeb - Thumb (big endian)
    wasm32 - WebAssembly 32-bit
    wasm64 - WebAssembly 64-bit
    x86 - 32-bit X86: Pentium-Pro and above
    x86-64 - 64-bit X86: EM64T and AMD64
    xcore - XCore
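To see exactly which targets the ports compiler adds over the base one, the two listings can be reduced to bare target names and compared. The sketch below assumes the usual one-target-per-line output of --print-targets ("name - description"). The helper name target_names and the temporary file paths are hypothetical.

```sh
#!/bin/sh
# Reads `clang --print-targets` output on stdin and prints the sorted
# target names, one per line. Assumes each target line has the form
# "<name> - <description>"; the "Registered Targets:" header is skipped
# because its second field is not a lone dash.
target_names() {
    awk '$2 == "-" { print $1 }' | sort
}

# Hypothetical usage on a machine with both compilers installed:
#   clang --print-targets    | target_names > /tmp/base-targets.txt
#   clang-13 --print-targets | target_names > /tmp/ports-targets.txt
#   # Lines only in the second file: targets provided by the port alone.
#   comm -13 /tmp/base-targets.txt /tmp/ports-targets.txt
```

Reading from standard input keeps the helper independent of which Clang binary is installed, so the same function works for both the base and the ports compiler.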
The devel/llvm port is built using CMake and Ninja, resulting in more efficient builds. On top of building all available LLVM backends, we also build:
- The Clang Static Analyzer and its companion tool scan-build
- Clang utilities (clang-format and clang-* tools)
- LLVM binary utilities (llvm-ar, llvm-as, llvm-objcopy, llvm-objdump, etc.)
- Tools to process code coverage data (llvm-profdata and llvm-cov)
- Various other tools such as llc, lli, llvm-mc, llvm-mca, etc.
So in essence, we try to keep the base system LLVM somewhat minimal, and build additional features and tooling in the port version. This solution has worked well for us so far.
One last thing to note: we only build one version of LLVM in ports, which is kept in sync with the base version, so we do not ship packages for older (or newer) versions of LLVM.