# nix-builder-autoscaler

`nix-builder-autoscaler` provides elastic Nix remote builder capacity for [Guardian Project](https://guardianproject.info) jobs using [buildbot-nix](https://github.com/nix-community/buildbot-nix).

We created it because we don't have the budget for dedicated, always-on build hardware. Buildbot waits for builder capacity before it starts a `nix-build` job, and idle builders disappear some time after jobs finish.

The autoscaler launches EC2 Nix builder instances, waits until they are reachable through Tailscale and HAProxy, hands Buildbot a reservation for a ready slot, and later drains and terminates unused capacity. It uses EC2 Spot by default and can use on-demand instances for nested virtualization workloads when configured.

The Buildbot instance uses a single-master, single-worker configuration, as upstream buildbot-nix expects. HAProxy presents a single logical Nix builder host to buildbot-nix; this design was inspired by [Garnix's yensid](https://web.archive.org/web/20260530230732/https://garnix.io/blog/yensid/).

## Pieces

The project has two main runtime pieces:

1. `agent/`: the autoscaler daemon and `autoscalerctl` CLI. The daemon owns the slot database, reservation API, scheduler, EC2 runtime, HAProxy binding, health checks, and metrics.
2. `buildbot-ext/`: the Buildbot integration. The extension patches Buildbot `*/nix-build` builders with a capacity gate step at the beginning and a reservation release step at the end. It also lets the Buildbot host send Nix distributed builds through the HAProxy-backed builder cluster.

The `nix/modules/` directory contains NixOS modules that package and wire these pieces into hosts:

- `services.nix-builder-autoscaler` runs the daemon and can generate the HAProxy slot configuration.
- `services.buildbot-nix.nix-build-autoscaler` installs the Buildbot extension and configures Nix remote builder access.
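To illustrate the gate and release steps that `buildbot-ext/` wraps around a build, here is a minimal Python sketch. It is not the extension's code: `FakeAutoscaler`, `builder_capacity`, the method names, and the `pending` phase are all hypothetical, and the fake runs in memory instead of talking to the daemon. The point is only the shape of the pattern: reserve, wait for `ready`, run the build, and always release.

```python
import itertools
from contextlib import contextmanager


class FakeAutoscaler:
    """In-memory stand-in for the reservation API (hypothetical, not the real client)."""

    def __init__(self, polls_until_ready: int = 2) -> None:
        self._ids = itertools.count(1)
        self._polls_until_ready = polls_until_ready
        self._polls: dict[int, int] = {}
        self.released: list[int] = []

    def create_reservation(self) -> int:
        rid = next(self._ids)
        self._polls[rid] = 0
        return rid

    def phase(self, rid: int) -> str:
        # Pretend a slot becomes ready after a fixed number of polls.
        # "pending" is an assumed name for the not-yet-ready phase.
        self._polls[rid] += 1
        return "ready" if self._polls[rid] >= self._polls_until_ready else "pending"

    def release(self, rid: int) -> None:
        self.released.append(rid)


@contextmanager
def builder_capacity(api, max_polls: int = 10):
    """Gate on entry, release on exit, even if the build or the wait fails."""
    rid = api.create_reservation()
    try:
        for _ in range(max_polls):
            # A real gate would sleep between polls; omitted to keep the sketch fast.
            if api.phase(rid) == "ready":
                break
        else:
            raise TimeoutError(f"reservation {rid} never became ready")
        yield rid
    finally:
        api.release(rid)


if __name__ == "__main__":
    api = FakeAutoscaler()
    with builder_capacity(api) as rid:
        print(f"running nix-build under reservation {rid}")
    print(f"released reservations: {api.released}")
```

Putting the release in a `finally` block mirrors the intent of a release step at the end of the builder: a failed or timed-out build should not leave a reservation pinned to a slot.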
## How it works

Buildbot, via the extension, creates a reservation before a Nix build starts. The autoscaler assigns that reservation to a ready slot if one exists. If no ready slot has capacity, the scheduler launches an EC2 instance into an empty slot, subject to the configured minimum, maximum, warm pool, and timeout settings.

The reconciler moves each slot through the runtime states:

1. `launching`: EC2 accepted the instance launch.
2. `booting`: the instance is running.
3. `binding`: the daemon found the instance's Tailscale IP and enabled its HAProxy backend slot.
4. `ready`: HAProxy health checks pass and Buildbot can use the slot.
5. `draining` or `terminating`: the slot is being removed after release, idle timeout, interruption, or failure.

Buildbot waits until the reservation becomes `ready`, then runs the build through the configured Nix remote builder alias. When the build finishes, the release step releases the reservation. Idle slots drain and terminate after the configured cooldowns.

## Development

Common checks:

```sh
nix flake check
nix build .#nix-builder-autoscaler
nix build .#buildbot-autoscale-ext
nix fmt
```

Useful local CLI commands against a running daemon:

```sh
autoscalerctl status
autoscalerctl slots
autoscalerctl reservations
autoscalerctl drain
autoscalerctl reconcile-now
```

The daemon listens on `/run/nix-builder-autoscaler/daemon.sock` by default. NixOS deployments should configure the service modules rather than hand-writing daemon config files.
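
## Example: slot lifecycle sketch

The slot states listed under "How it works" can be read as a small state machine. The Python sketch below is illustrative only and is not taken from the daemon. The state names `launching` through `terminating` come from this README, while the `empty` state name, the exact transition table, and the `advance` helper are assumptions made for the example.

```python
from enum import Enum


class SlotState(Enum):
    EMPTY = "empty"  # assumed name for a slot with no instance
    LAUNCHING = "launching"
    BOOTING = "booting"
    BINDING = "binding"
    READY = "ready"
    DRAINING = "draining"
    TERMINATING = "terminating"


# Assumed transition table: each state may move forward one step, and any
# state that holds an instance may go straight to terminating on failure
# or interruption.
TRANSITIONS = {
    SlotState.EMPTY: {SlotState.LAUNCHING},
    SlotState.LAUNCHING: {SlotState.BOOTING, SlotState.TERMINATING},
    SlotState.BOOTING: {SlotState.BINDING, SlotState.TERMINATING},
    SlotState.BINDING: {SlotState.READY, SlotState.TERMINATING},
    SlotState.READY: {SlotState.DRAINING, SlotState.TERMINATING},
    SlotState.DRAINING: {SlotState.TERMINATING},
    SlotState.TERMINATING: {SlotState.EMPTY},
}


def advance(current: SlotState, target: SlotState) -> SlotState:
    """Return the target state, or raise if the transition is not allowed."""
    if target not in TRANSITIONS[current]:
        raise ValueError(f"illegal transition: {current.value} -> {target.value}")
    return target


if __name__ == "__main__":
    state = SlotState.EMPTY
    for nxt in (
        SlotState.LAUNCHING,
        SlotState.BOOTING,
        SlotState.BINDING,
        SlotState.READY,
        SlotState.DRAINING,
        SlotState.TERMINATING,
        SlotState.EMPTY,
    ):
        state = advance(state, nxt)
        print(state.value)
```

Keeping the allowed transitions in one table makes it easy to reject impossible jumps, such as a slot going from `launching` directly to `ready` without passing the HAProxy binding step.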