Hello! For some reason after the last [nix post][1] I got nerdsniped by trying to understand how Nix builds work under the hood, so here’s a quick exploration I did today. There are probably some mistakes in here.
> are there any guides to nix that start from the bottom up (for example
> starting with [this bash script][3]
> and then working up the layers of abstraction) instead of from the top down?
>
> all of the guides I’ve seen start by describing the nix programming language
> or other abstractions, and I’d love to see a guide that starts with concepts I
> already understand like compiler flags, linker flags, Makefiles, environment
> variables, and bash scripts.

Ross Light wrote a great blog post in response called [Connecting Bash to Nix][4], which shows how to compile a basic C program without using most of Nix’s standard machinery.
I wanted to take this a tiny bit further and compile a slightly more
complicated C program.
#### the goal: compile a C program, without using Nix’s standard machinery
Our goal is to compile a C program called `paperjam`. This is a real C program that wasn’t in the Nix repository already. I already figured out how to
compile it in [this post][1] by copying and pasting a bunch of stuff I didn’t understand, but this time I wanted to do it in a more principled way where I actually understand more of the steps.
Everything I read about Nix talks about derivations all the time, but I was really struggling to figure out what a derivation _is_. It turns out that `derivation` is a function in the Nix language. But not just any function! The whole point of the Nix language seems to be to call this function. The [official documentation for the `derivation` function][5] is actually extremely clear. Here’s what I took away:
Let’s write a very simple build script and call the `derivation` function. These don’t work yet, but I found it pretty fun to go through all the errors, fix them one at a time, and learn a little more about how Nix works by fixing them.
- `fetchurl` (which downloads a URL and puts the path into the `SOURCE` environment variable)
- `pkgs` (which lets us depend on other Nix packages from the central repository). I don’t totally understand this but I’m already in a pretty deep rabbit hole so we’re going to leave that for now.

`SOURCE` evaluates to a string – it’s the path to the downloaded source tarball.
Nix needs you to declare all the dependencies for your builds. It enforces this by removing your `PATH` environment variable so that you have no binaries in your PATH at all.
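Here’s a tiny shell sketch of what that feels like from inside a build script. As far as I know, Nix doesn’t literally unset `PATH` but points it at a placeholder directory that doesn’t exist (`/path-not-set`). The `lookup_with_path` helper is a name I made up for illustration.

```shell
# Look up a command using only the directories in the given PATH value.
# (lookup_with_path is a hypothetical helper, not part of Nix.)
lookup_with_path() {
  ( PATH="$1"; command -v "$2" ) 2>/dev/null || echo "not found: $2"
}

# With a placeholder PATH, ordinary tools can't be found:
lookup_with_path /path-not-set tar

# Once a directory containing the tool is on PATH, the lookup succeeds:
lookup_with_path /bin sh
```

This is why every tool the build needs (`tar`, `make`, a compiler, ...) has to be added to `PATH` explicitly in the `.nix` file.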
So we need to put a compiler in our PATH. For some reason I felt like using `clang++` to compile, not `g++`. To do that I need to make 2 changes to `paperjam.nix`:
- Add the line `CXX="clang++";`
- Add `${pkgs.clang}/bin` to my `PATH`
#### problem 3: missing header files
The next error was:
```
> ./pdf-tools.h:13:10: fatal error: 'qpdf/QPDF.hh' file not found
```
On my system, the `clang++` wrapper script was at `/nix/store/d929v59l9a3iakvjccqpfqckqa0vflyc-clang-wrapper-11.1.0/bin/clang++`. I searched that file for `LDFLAGS` and found that it uses 2 environment variables:
- `NIX_LDFLAGS_aarch64_apple_darwin`
- `NIX_CFLAGS_COMPILE_aarch64_apple_darwin`

So I figured I needed to put all the compiler arguments in the `NIX_CFLAGS_COMPILE` variable and all the linker arguments in `NIX_LDFLAGS`. Great! Let’s do that.
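To picture what a wrapper like this is doing, here’s a drastically simplified sketch. It is not the real wrapper (which does a lot more, and I haven’t reproduced its logic). It just shows the idea of flags from environment variables getting mixed into the compiler command line. The function name and the `/path/to/...` directories are placeholders.

```shell
# A drastically simplified stand-in for the clang++ wrapper script.
# It prints the command line it would run instead of running a compiler.
# (fake_cxx_wrapper is a made-up name; the real wrapper does much more.)
fake_cxx_wrapper() {
  # Unquoted on purpose: each variable holds several space-separated flags.
  echo clang++ ${NIX_CFLAGS_COMPILE:-} "$@" ${NIX_LDFLAGS:-}
}

NIX_CFLAGS_COMPILE="-isystem /path/to/qpdf/include"
NIX_LDFLAGS="-L /path/to/qpdf/lib"
fake_cxx_wrapper -o paperjam paperjam.cc -lqpdf
```

The nice thing about this design is that the Makefile doesn’t need to know anything about `/nix/store` paths: it just runs `clang++`, and the extra flags come along for the ride.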
I added these 2 lines to my `paperjam.nix`, to link the `libpaper` and `qpdf` libraries:
Not sure what this means, but I searched for “abi” in the Nix packages and fixed it by adding `-L ${pkgs.libcxxabi}/lib` to my `NIX_LDFLAGS` environment variable.
#### problem 5: missing iconv
Here’s the next error:
```
> Undefined symbols for architecture arm64:
> "_iconv", referenced from: ...
```
I started by adding `-L ${pkgs.libiconv}/lib` to my `NIX_LDFLAGS` environment variable, but that didn’t fix it. Then I spent a while going around in circles and being confused.
I eventually figured out how to fix this by taking a working version of the `paperjam` build that I’d made before and editing my `clang++` wrapper file to print out all of its environment variables. The `LDFLAGS` environment variable in the working version was different from mine: it had `-liconv` in it.
I was a bit puzzled by this `-liconv` thing though: the original Makefile links in `libqpdf` and `libpaper` by passing `-lqpdf -lpaper`. So why doesn’t it link in iconv, if it requires the iconv library?
I think the reason for this is that the original Makefile assumed that you were running on Linux and using glibc, and glibc includes these iconv functions by default. But I guess Mac OS libc doesn’t include iconv, so we need to explicitly set the linker flag `-liconv` to add the iconv library.
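The difference between the two flags is what tripped me up: `-L` only adds a directory to the list of places the linker searches, while `-l` names a library to actually link. Here’s a small sketch that spells that out. `link_flags_for` is a hypothetical helper and the path is a placeholder.

```shell
# Build linker flags for one library: -L adds a search directory,
# -l names the library to actually link (libNAME.dylib / libNAME.so / libNAME.a).
# (link_flags_for is a hypothetical helper for illustration.)
link_flags_for() {
  lib_dir="$1"
  lib_name="$2"
  echo "-L $lib_dir -l$lib_name"
}

# -L alone, which is what I tried first, links nothing new:
echo "-L /path/to/libiconv/lib"
# Both together tell the linker where to look and what to pull in:
link_flags_for /path/to/libiconv/lib iconv
```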
I guess this is some kind of Mac code signing thing. I used `find /nix/store -name codesign_allocate` to find `codesign_allocate` on my system. It’s at `/nix/store/a17dwfwqj5ry734zfv3k1f5n37s4wxns-cctools-binutils-darwin-973.0.1/bin/codesign_allocate`.
But this doesn’t tell us what the package is called – we need to be able to refer to it as `${pkgs.XXXXXXX}` and `${pkgs.cctools-binutils-darwin}` doesn’t work.
I couldn’t figure out a way to go from a Nix folder to the name of the package, but I ended up poking around and finding out that it was called `pkgs.darwin.cctools`.
So I added `${pkgs.darwin.cctools}/bin` to the `PATH`.
#### problem 7: missing a2x
Easy, just add `${pkgs.asciidoc}/bin` to the `PATH`.
#### problem 8: missing install
```
make: install: No such file or directory
```
Apparently `install` is a program? This turns out to be in `coreutils`, so we add `${pkgs.coreutils}/bin` to the `PATH`. Adding `coreutils` also fixes some other warnings I was seeing about missing commands like `date`.
#### problem 9: can’t create /usr/local/bin/paperjam
This took me a little while to figure out because I’m not very familiar with make. The Makefile has a `PREFIX` of `/usr/local`, but we want it to be the program’s output directory in `/nix/store/`.
I edited the `build-paperjam.sh` shell script to say:
```
make install PREFIX="$out"
```
and everything worked! Hooray!
#### our final configuration
Here’s the final `paperjam.nix`. It’s not so different from what we started with – we just added 4 environment variables.
```
let pkgs = import (fetchTarball "https://github.com/NixOS/nixpkgs/archive/ae8bdd2de4c23b239b5a771501641d2ef5e027d0.tar.gz") {};
```
I think what this means is that `paperjam.nix` gets compiled to some intermediate representation (also called a derivation?), and then the Nix runtime takes over and is in charge of actually running the build scripts. We can look at this `.drv` intermediate representation with `nix show-derivation`.
This feels surprisingly easy to understand – you can see that there are a bunch of environment variables, our bash script, and the paths to our inputs.
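My mental model of what happens next is something like this: the builder gets started with a clean environment that contains only the variables listed in the `.drv`. Here’s a rough imitation using `env -i`. It’s only an analogy (real Nix also sandboxes the build and sets some extra variables of its own), and the `SOURCE` and `out` values are placeholders.

```shell
# Imitate "run the builder with only the derivation's environment variables".
# env -i starts from an empty environment, so nothing leaks in from my shell.
run_like_a_builder() {
  env -i \
    SOURCE="/path/to/source.tar.gz" \
    out="/path/to/output" \
    /bin/sh -c 'echo "SOURCE=$SOURCE out=$out EDITOR=${EDITOR:-<unset>}"'
}

export EDITOR=vim     # set in my own shell...
run_like_a_builder    # ...but invisible to the "builder"
```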
Normally when you build a package with Nix, you don’t do all of this stuff yourself. Instead, you use a helper called `stdenv`, which seems to have two parts:
- a function called `stdenv.mkDerivation` which takes some arguments and generates a bunch of environment variables (it seems to be [documented here][6])
- a 1600-line bash build script ([setup.sh][7]) that consumes those environment variables. This is like our `build-paperjam.sh`, but much more generalized.
Together, these two tools:
- add `LDFLAGS` automatically for each C library you depend on
- add `CFLAGS` automatically so that you can get your header files
- run `make`
- depend on clang and coreutils and bash and other core utilities so that you don’t need to add them yourself
- set `system` to your current system
- let you easily add custom bash code to run at various phases of your build
- maybe also manage versions somehow? Not sure about this one.

and probably lots more useful things I don’t know about yet.
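To get a feel for the “add flags automatically for each library” part, here’s a sketch of the general idea: loop over the dependencies and add a flag for every `include` or `lib` directory that exists. This is my guess at the spirit of it, not code from `stdenv`. The `collect_flags` name and the directory layout are assumptions.

```shell
# Collect compiler and linker flags for a list of dependency directories,
# similar in spirit to what stdenv does automatically.
# (collect_flags and the directory layout are assumptions for this sketch.)
collect_flags() {
  cflags=""
  ldflags=""
  for dep in "$@"; do
    # Only add a flag when the dependency really ships that directory.
    if [ -d "$dep/include" ]; then cflags="$cflags -isystem $dep/include"; fi
    if [ -d "$dep/lib" ]; then ldflags="$ldflags -L$dep/lib"; fi
  done
  echo "CFLAGS:$cflags"
  echo "LDFLAGS:$ldflags"
}

# Two fake dependencies in a temp directory: one with headers and
# libraries, one with libraries only.
demo=$(mktemp -d)
mkdir -p "$demo/qpdf/include" "$demo/qpdf/lib" "$demo/libpaper/lib"
collect_flags "$demo/qpdf" "$demo/libpaper"
rm -rf "$demo"
```

This is basically the same thing I did by hand earlier with `NIX_LDFLAGS`, just generated from a list of dependencies instead of typed out.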
Let’s look at one more compiled derivation, for `jq`. This is quite long but there are some interesting things in here. I wanted to look at this because I wanted to see what a more typical derivation generated by `stdenv.mkDerivation` looked like.
I thought it was interesting that some of the environment variables in here are actually bash scripts themselves – for example the `postInstallCheck` environment variable is a bash script. Those bash script environment variables are `eval`ed in the main bash script (you can [see that happening in setup.sh here][8]).
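Here’s a minimal sketch of that “environment variable containing a script” trick. It’s not the code from `setup.sh`, just the same idea in a few lines. `run_hook_sketch` is a made-up name and the `out` path is a placeholder.

```shell
# Run a hook whose shell code is stored in a variable, if that variable is set.
# This mimics the idea behind stdenv's hooks; it is not the code from setup.sh.
run_hook_sketch() {
  hook_name="$1"
  # Indirectly read the variable named by $hook_name (POSIX-friendly form).
  eval "hook_body=\${$hook_name:-}"
  if [ -n "$hook_body" ]; then
    eval "$hook_body"
  fi
}

out=/path/to/output
postInstallCheck='echo "checking $out/bin/jq"'
run_hook_sketch postInstallCheck   # runs the code stored in the variable
run_hook_sketch preInstallCheck    # unset, so nothing happens
```

Because the hook is stored as a plain string, it can travel through the `.drv` file like any other environment variable and only turns back into code when the build script `eval`s it.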
I feel like I understand Nix a bit better after going through this. I still don’t feel very motivated to learn the Nix language, but now I have some idea of what Nix programs are actually doing under the hood! My understanding is:
- First, `.nix` files get compiled into a `.drv` file, which is mostly a bunch of inputs and outputs and environment variables. This is where the Nix language stops being relevant.
- Then all the environment variables get passed to a build script, which is in charge of doing the actual build.
- In the Nix standard environment (`stdenv`), some of those environment variables are themselves bash code, which gets `eval`ed by the big build script `setup.sh`.
That’s all! I probably made some mistakes in here, but this was kind of a fun rabbit hole.