
Recursion in builders definitions deadlocks builds with waiting for lock on '/nix/store/...' #10740

Open
@tomeon

Description

Describe the bug

I've been trying to set up remote builders following the NixOS wiki's "Distributed build" article and other resources. Because none of the machines in my home lab are particularly beefy, and I want to recruit as much compute as I can, I've got cycles in the builders graph; that is, machines B and C appear in the builders definition for machine A, machines A and C appear in the builders definition for machine B, and machines A and B appear in the builders definition for machine C (and so on).

This appears to deadlock builds. One symptom is repeated warning messages of the form waiting for lock on '/nix/store/...'.

If I initiate a build on machine A, I can observe that it starts a nix-daemon --stdio process on machine B, and that machine B in turn starts a nix-daemon --stdio process on machine A.
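To make the cycle concrete, here is a minimal Python sketch that models the builders definitions as a directed graph and looks for a delegation path that returns to the initiating machine. It is only an illustration of the topology described above, not Nix code: the `BUILDERS` mapping and the `find_cycle` helper are hypothetical names, and the machine names A, B, and C stand in for real hosts.

```python
from typing import Dict, List, Optional

# Hypothetical model: each key is a machine, each value lists the machines
# named in that machine's `builders` definition.
BUILDERS: Dict[str, List[str]] = {
    "A": ["B", "C"],
    "B": ["A", "C"],
    "C": ["A", "B"],
}


def find_cycle(graph: Dict[str, List[str]], start: str) -> Optional[List[str]]:
    """Return a delegation path from `start` back to `start`, or None."""
    # Depth-first search; each stack entry carries the path taken so far.
    stack = [(start, [start])]
    while stack:
        node, path = stack.pop()
        for nxt in graph.get(node, []):
            if nxt == start:
                # Delegation has come back to the initiating machine.
                return path + [nxt]
            if nxt not in path:
                stack.append((nxt, path + [nxt]))
    return None


if __name__ == "__main__":
    print(find_cycle(BUILDERS, "A"))
```

With the three-machine setup above, `find_cycle` returns a path that starts and ends at A, which matches the A to B to A chain of nix-daemon --stdio processes I observed.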

Steps To Reproduce

On machine A, add machine B to the builders definition in /etc/nix/nix.conf. Similarly, on machine B, add machine A to the builders definition in /etc/nix/nix.conf. Then, execute a nontrivial build (e.g. rebuild a NixOS machine configuration).

Expected behavior

I would expect machines defined in builders to either:

  1. Run builds only locally, without delegating work to the machines in their own builders definitions, or
  2. Eliminate the build-initiating machine from the machines in builders, if it appears there.
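The two options above can be sketched in Python as simple filters over a machine's configured builders. This is a rough illustration of the behavior I have in mind, not a description of how Nix selects builders: the function names, the `is_remote_request` flag, and the `initiator` parameter are all hypothetical, and I don't know whether a remote builder currently has any way to learn which machine initiated a build.

```python
from typing import List, Optional


def builders_local_only(configured: List[str], is_remote_request: bool) -> List[str]:
    """Option 1: a machine building on behalf of another machine delegates nothing."""
    # Hypothetical flag: True when this build was requested by another machine.
    return [] if is_remote_request else list(configured)


def builders_without_initiator(configured: List[str], initiator: Optional[str]) -> List[str]:
    """Option 2: drop the build-initiating machine from the candidate builders."""
    # Hypothetical parameter: name of the machine that initiated the build,
    # or None when the build started locally.
    return [b for b in configured if b != initiator]


if __name__ == "__main__":
    # Machine B lists A and C as builders; the build was initiated on A.
    print(builders_local_only(["A", "C"], is_remote_request=True))
    print(builders_without_initiator(["A", "C"], initiator="A"))
```

Note that option 2 as written only removes the original initiator. With three or more machines, a longer chain (A to B to C to B) could still form unless the whole delegation chain is excluded, so option 1 may be the simpler fix.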

nix-env --version output

$ nix-env --version
nix-env (Nix) 2.18.1

The version is the same on the machine that initiates builds and on the machines listed in builders.

Additional context

The proximate cause of this issue may be the same as in #2029.



    Labels

    bug
    idea approved: The given proposal has been discussed and approved by the Nix team. An implementation is welcome.
    protocol: Things involving the daemon protocol & compatibility issues
    remote build: The SSH store, ssh:, ssh-ng:, ... (split from protocol label 2024-07)
    scheduling
