compress-man-pages: Parallelise compression #380948

Open
wants to merge 1 commit into
base: staging

illdefined
Contributor

Parallelise compression of man pages using xargs -P.

This can significantly reduce fixup times for packages with a large number of man pages like openssl or linux-manual on multi‐core build machines.

Since compression involves a bit of shell logic, I decided to process the pages in batches to balance the gains of parallelism against the cost of spawning a shell and parsing a brief script for each batch. The batch size is supplied through the -n argument to xargs.
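
As a rough illustration of the batching described above (a sketch, not the code in this pull request): the hypothetical function compress_man_pages below takes a directory and a worker count as assumed parameters, and the find filter is likewise an assumption. Each spawned shell receives up to 128 paths as positional arguments.

```shell
# Hypothetical helper for illustration only; names and parameters are assumptions.
compress_man_pages() {
    dir=$1          # directory containing uncompressed pages (assumed layout)
    jobs=${2:-4}    # number of parallel workers (assumed default)

    # -print0 / -0 keep unusual file names intact; -n 128 sets the batch size;
    # -P runs up to $jobs batches at the same time.
    find "$dir" -type f ! -name '*.gz' -print0 |
        xargs -0 -n 128 -P "$jobs" sh -c '
            for f in "$@"; do
                # Only remove the original once gzip has succeeded.
                if gzip -c -n "$f" > "$f".gz; then
                    rm "$f"
                else
                    rm "$f".gz
                fi
            done' --   # mirrors the snippet under review; this operand occupies $0
}
```

In a real build the worker count would presumably come from something like NIX_BUILD_CORES; that is an assumption here, not something the sketch enforces.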

This overhead could be avoided with a construction that spawns one process per build core and feeds the file paths in through pipes, but I feel that the added complexity may not be worth the benefit.

With a batch size of 128 as proposed in this pull request, the hook should behave exactly the same as before for most packages, because all of their man pages fit into a single batch. The only difference is the extra xargs and shell processes.
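
To make the batch arithmetic concrete, here is a small sketch with a hypothetical helper batch_sizes. It feeds a given number of dummy arguments through xargs -n 128 and prints how many arguments each spawned shell receives. It does not touch any files and only stands in for the hook's behaviour.

```shell
# Hypothetical helper: print the argument count of each batch for N inputs.
batch_sizes() {
    # seq produces N dummy arguments; each spawned shell reports "$#".
    # The trailing "sh" fills $0 so that every argument is counted.
    seq 1 "$1" | xargs -n 128 sh -c 'echo "$#"' sh
}

batch_sizes 100   # a single batch: prints 100
batch_sizes 300   # three batches: prints 128, 128 and 44
```

A package with at most 128 pages therefore spawns one shell, which processes the files sequentially as before.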

Things done

  • Built on platform(s)
    • x86_64-linux
    • aarch64-linux
    • x86_64-darwin
    • aarch64-darwin
  • For non-Linux: Is sandboxing enabled in nix.conf? (See Nix manual)
    • sandbox = relaxed
    • sandbox = true
  • Tested, as applicable:
  • Tested compilation of all packages that depend on this change using nix-shell -p nixpkgs-review --run "nixpkgs-review rev HEAD". Note: all changes have to be committed, also see nixpkgs-review usage
  • Tested basic functionality of all binary files (usually in ./result/bin/)
  • 25.05 Release Notes (or backporting 24.11 and 25.05 Release notes)
    • (Package updates) Added a release notes entry if the change is major or breaking
    • (Module updates) Added a release notes entry if the change is significant
    • (Module addition) Added a release notes entry if adding a new NixOS module
  • Fits CONTRIBUTING.md.

Add a 👍 reaction to pull requests you find important.

do
    if gzip -c -n "$f" > "$f".gz; then
        rm "$f"
    else
        rm "$f".gz
    fi
done
done' --
Member

This looks odd

Contributor Author

The intention is to terminate option processing for the shell. The path names are passed as arguments by xargs and, without the --, arguments with a leading - would be interpreted as options.
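
A quick way to check how the operands after the -c script are assigned, using a hypothetical helper show_args. As far as POSIX sh specifies, the first operand after the command string becomes $0 and the rest become the positional parameters, so the trailing -- also fills the $0 slot and keeps the first path from being dropped out of "$@".

```shell
# Hypothetical helper: pass its arguments to a child shell after the -c script,
# the same way xargs appends path names, and print how they were assigned.
show_args() {
    sh -c 'printf "%s\n" "0=$0" "n=$#" "first=$1"' "$@"
}

show_args -- a.1 b.1   # 0=--   n=2  first=a.1
show_args a.1 b.1      # 0=a.1  n=1  first=b.1  (a.1 is no longer in "$@")
show_args -- -x.1      # 0=--   n=1  first=-x.1 (kept as a plain argument)
```

Any placeholder word in that position (for example sh or _) would have the same effect on "$@".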

Contributor Author

illdefined commented Feb 11, 2025

On my local machine, parallelisation reduces the wall-clock fixup duration for linux-manual from around 4 minutes to 30 seconds.

I haven’t looked at CPU and I/O wait times.

illdefined marked this pull request as ready for review February 11, 2025 10:36
nix-owners bot requested a review from Ericson2314 February 11, 2025 10:37
2 participants