Using libgit for deduplication yields drastically worse eval performance #9684

Open
@roberth

Description

Originally posted by @erikarvstedt in #9485 (comment)

Using libgit for deduplication yields a very simple implementation but drastically worsens eval performance.

I've mentioned this before in the "Source tree abstraction" PR.
A current benchmark confirms these results:
Evaluating the paperless NixOS test with the libgit accessor (PR "Source tree abstraction") takes about 60% longer than on the branch the PR is based on.

Benchmark source
# https://github.com/NixOS/nix/pull/6530 as of 2023-12-31
nix build github:edolstra/nix/2055e28aedb3918d91dcd80a5cf9bdd32242c822 -o /tmp/nix/git-accessor
# The version of master which the above PR is based on
nix build github:nixos/nix/9dbfd186b129ddee4a7a66958be9abdf6dd6a668 -o /tmp/nix/baseline

nix_eval() {
    nix=$1
    shift
    # A test with a single VM system evaluation that accesses lots of files
    # nixpkgs-unstable as of 2023-02-01
    /tmp/nix/$nix/bin/nix eval --json github:nixos/nixpkgs/ea692c2ad1afd6384e171eabef4f0887d2b882d3#nixosTests.paperless "$@"
}
export -f nix_eval

hyperfine --warmup 3 \
          'nix_eval baseline --read-only' \
          'nix_eval git-accessor --read-only'

rm -rf /tmp/nix

# Benchmark 1: nix_eval baseline --read-only
#   Time (mean ± σ):      3.595 s ±  0.042 s    [User: 3.221 s, System: 0.368 s]
#   Range (min … max):    3.554 s …  3.684 s    10 runs
#  
# Benchmark 2: nix_eval git-accessor --read-only
#   Time (mean ± σ):      5.786 s ±  0.059 s    [User: 5.273 s, System: 0.460 s]
#   Range (min … max):    5.712 s …  5.909 s    10 runs
#  
# Summary
#   'nix_eval baseline --read-only' ran
#     1.61 ± 0.02 times faster than 'nix_eval git-accessor --read-only'
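
The "60% longer" figure follows directly from the two mean times reported above. As a quick sanity check, here is a minimal sketch that recomputes it; the helper name `slowdown` is hypothetical and not part of the original benchmark, and `awk` is used only because plain shell arithmetic has no floating point.

```shell
# Hypothetical helper: print the slowdown of a candidate mean time
# relative to a baseline mean time (both in seconds).
slowdown() {
    awk -v b="$1" -v g="$2" 'BEGIN {
        printf "%.2fx slower, %.0f%% longer\n", g / b, (g / b - 1) * 100
    }'
}

# Mean times copied from the hyperfine output above.
slowdown 3.595 5.786
# prints: 1.61x slower, 61% longer
```

The ratio matches the 1.61 reported in the hyperfine summary.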

The slowdown can also be reproduced with this PR (#9485):
Copying a github input that is already present in the git tarball cache to the store takes over 3x longer than extracting an already downloaded tarball to the store with Nix master.

Extracting tarballs to the libgit cache is also very slow. With cold caches, evaluating a NixOS system flake that has multiple github inputs can be 5x slower than with Nix master, pushing the initial eval time into minutes.

Eval performance is already a major pain point of Nix, and the added libgit penalty makes Nix impractical for larger NixOS systems.
I'm sure most users would prefer the existing disk space overhead to heavily decreased performance.
The libgit cache might still be useful for specific use cases, so it could be made available via a Nix config option.

Labels: fetching, language, performance, regression
