Fedran Infrastructure Redux

I know I said I was going to work on writing after the previous post, but the hour-long pipelines were overwhelming me, so I tried to speed them up.

Thanks to a bit of obsession, I figured it out.

The bulk of the pipeline was spent downloading and building Nix packages. This normally isn't a problem, but I want to be able to run an agent in my home lab, where I'm charged for bandwidth overages. Pulling down half a gig of packages every time I built my website also wasn't being a good steward of the Internet as a whole.

I had bookmarked a link to kotatsuyaki's Locally Cached Nix CI with Woodpecker, which was based on Kevin Cox's work, and tried it a couple of times. I usually got hung up on switching from Docker to Podman, but eventually I got it mostly working, hit some snags, and settled on a slightly-less-than-optimal approach instead.

Lazy Configuration

It took me a day to figure out how to get Podman working with my setup. Naturally, the systemd units NixOS generates for OCI containers change names with the backend, so it was docker-woodpecker-server under Docker and podman-woodpecker-server under Podman. That meant I also had to switch the various systemd links that tie changing secrets to restarting the OCI containers.

As usual, once I realized I was going back and forth (usually on the third time), I decided to automate that.

# woodpecker-agent.nix
{
  config,
  pkgs,
  lib,
  ...
}: let
  tag = "next-eaae6b44c7";
  container =
    if config.virtualisation.podman.enable
    then "podman"
    else "docker";
in {
  sops.secrets = {
    woodpecker-ci-agent = {
      restartUnits = ["${container}-woodpecker-ci-agent.service"];
    };
  };
}

I'm not fond of Nix as a language, but it was nice being able to have it pick up which configuration was set up properly and then rename the various systemd units as appropriate.
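For what it's worth, NixOS also exposes the chosen backend directly through virtualisation.oci-containers.backend, so the check could probably be collapsed. This is a sketch on my part, not something I've actually switched to:

```nix
# woodpecker-agent.nix (sketch; assumes oci-containers manages the container)
{config, ...}: let
  # "docker" or "podman", whichever backend is configured
  container = config.virtualisation.oci-containers.backend;
in {
  sops.secrets = {
    woodpecker-ci-agent = {
      # generated units follow the "<backend>-<name>.service" pattern
      restartUnits = ["${container}-woodpecker-ci-agent.service"];
    };
  };
}
```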

Mounting Nix Store

The crux of the problem was storing Nix packages so I didn't have to download them every time I built the package. This is really important when I start doing mass changes to my writing projects, since one of those causes ~137 pipelines to be triggered. Since each one uses an identical flake.lock (because of the Rust CLI), caching those packages also means it can regenerate each of the PDFs and EPUBs automatically when I update packages, change styles, or need to rebuild things.

Originally I tried the example in the above link and it worked beautifully. I did have to make the pipeline trusted, but I run my own CI server and I don't build pull requests, so I have that locked down to avoid too much security exposure.

# .woodpecker.yaml
pipeline:
  run-greet:
    image: nixos/nix
    commands:
      - echo 'experimental-features = flakes nix-command' >> /etc/nix/nix.conf
      - nix run --store unix:///mnt/nix/var/nix/daemon-socket/socket?root=/mnt .#greet -L
    volumes:
      - /nix:/mnt/nix:ro

I do think I might have had better luck if I had included Kevin's note from the above link:

I have seen some issues when using --store. I found that this can be fixed by additionally passing --eval-store local.
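Applied to the earlier run-greet example, that would presumably look something like this (untested on my end):

```yaml
# .woodpecker.yaml (sketch; the run-greet step with --eval-store added)
pipeline:
  run-greet:
    image: nixos/nix
    commands:
      - echo 'experimental-features = flakes nix-command' >> /etc/nix/nix.conf
      - nix run --store unix:///mnt/nix/var/nix/daemon-socket/socket?root=/mnt --eval-store local .#greet -L
    volumes:
      - /nix:/mnt/nix:ro
```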

Though by the time I realized I should have tried that, I had found a working solution and needed to stop fiddling with incremental changes that wouldn't significantly improve on what I had.

When I migrated my Fedran pipeline over to the above --store implementation, it blew up, unable to link shared libraries. First it was sodium23, then lowdown, then something else. Each time, I added more packages to the build until I hit libnixeval.so. That one… I couldn't fix.

The main problem was that the run-greet pipeline didn't need to call the nix executable, but I was using Standard, which does. That meant I couldn't use std //cli/apps/default:run to run anything, which basically meant I had to use cargo build, negating the entire purpose of using Nix packages as a cache. I also couldn't use nix run . to take advantage of the Nix caching.

In the end, I switched back to Docker but with Kevin's first suggestion: just create a volume in Docker to share the Nix store. This worked out well… but I had to make a tweak to Kevin's original suggestion:

# config.toml (the GitLab Runner configuration, from Kevin's post)
[[runners]]
    executor = "docker"
    [runners.docker]
        volumes = ["/nix"]

I found it worked better if I used a named volume:

# .woodpecker.yaml
pipeline:
    deploy:
        image: nixpkgs/nix-flakes
        commands:
            - nix develop --command ./src/website/scripts/ci.sh
        when:
            event: [push, manual, cron]
            branch: main
        volumes:
            - woodpecker-nix-store:/nix # named instead of just `- /nix`

And that worked beautifully. The build time went down from 48 minutes for the last run to 6 minutes for the most current one. Plus it barely downloaded anything, thanks to using Crane for Rust in Nix and the general package caching. If I created a second store to keep the Git repositories and only did a git pull instead, I could shave it down even more.
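That second store might look something like this. To be clear, the woodpecker-git-cache volume name and the clone-or-pull logic are my own sketch, not something I have running:

```yaml
# .woodpecker.yaml (hypothetical; caches the bare checkout between runs)
pipeline:
    clone:
        image: alpine/git
        commands:
            # pull if we already have a cached checkout, otherwise clone fresh
            - if [ -d /cache/repo/.git ]; then git -C /cache/repo pull; else git clone https://src.mfgames.com/$DRONE_REPO.git /cache/repo; fi
            # copy the cached checkout into the workspace
            - cp -a /cache/repo/. .
        volumes:
            - woodpecker-git-cache:/cache
```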

Cleanup

It did occur to me that if I left what I figured out as-is, sooner or later I would run out of space. (I was also reminded why I always create a dedicated /var/lib/docker partition.) So I added this stanza to my nightly job's .woodpecker.yaml file (which I haven't tested very well):

# .woodpecker.yaml
pipeline:
    clean:
        image: nixpkgs/nix-flakes
        commands:
            - nix-collect-garbage --delete-older-than 15d
        when:
            event: [cron]
            branch: main
        volumes:
            - woodpecker-nix-store:/nix

If I did it right, then it should keep everything cleaned up and tidy. I'll find out in a month or so if that is true.
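If it turns out the garbage collection isn't keeping up, a cheap way to watch it would be a cron step that only reports the store size. Another untested sketch:

```yaml
# .woodpecker.yaml (hypothetical monitoring step)
pipeline:
    store-size:
        image: nixpkgs/nix-flakes
        commands:
            - du -sh /nix/store
        when:
            event: [cron]
            branch: main
        volumes:
            - woodpecker-nix-store:/nix
```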

Checkout

One thing to be said about having so many project pipelines is that I hammer my system when I do a mass change (updating the locks, for example). Doing that, I encountered a strange bug where Woodpecker's checkout plugin starts to fail somewhere around the tenth rapid-fire pipeline.

To get around it, I switched from using the built-in clone to doing it manually:

# .woodpecker.yaml
skip_clone: true

pipeline:
    clone:
        image: alpine/git
        commands:
            - git clone https://fedran:$GITEA_TOKEN@src.mfgames.com/$DRONE_REPO.git . --depth 1
            - git checkout $DRONE_COMMIT
        secrets:
            - gitea_token
        when:
            event: [push, manual, cron]
            branch: main

Since I put it in, I haven't seen checkout failures due to being unable to get the username and password from the prompt. I could have also used nixpkgs/nix-flakes instead of alpine/git, but didn't; this works and it's fast.

What's Next

I really need to write. I got to a closure point: everything works, and I wrote up issues for the things that aren't breaking, so I can stop obsessing and instead focus on the 10 kwd obligation I have by the end of the month. Which is to say… it's probably not going to happen, but I'm still going to try.
