stdenv/setup.sh: fix parallel make #174473

Closed
6 changes: 3 additions & 3 deletions pkgs/stdenv/generic/setup.sh
@@ -1075,7 +1075,7 @@ buildPhase() {
# Old bash empty array hack
# shellcheck disable=SC2086
local flagsArray=(
-    ${enableParallelBuilding:+-j${NIX_BUILD_CORES} -l${NIX_BUILD_CORES}}
Member

@SuperSandro2000, May 25, 2022

Did you think about setting this to -l 2.5 or maybe -l 5.0? I am not sure if that would make sense or not.

Member Author

@markuskowa, May 25, 2022

IMHO, a constant value would not make sense here. The right value depends on the system and its configuration (i.e. the number of cores and Nix's max-jobs/cores settings): 2.0 may be a good choice for a laptop but not for a big server with many CPU cores.
If max-jobs is set to the number of cores in the system, it could make sense to set -l <number of cores> to avoid overly high system loads.
It would be interesting to know why -l${NIX_BUILD_CORES} was set here in the first place. Maybe @Ericson2314 knows more?

Contributor

Isn't -l${NIX_BUILD_CORES} a good protection against overloading/DoS-ing build machines? Maybe it could be configurable at runtime, but I personally think the default is good.

Member

It is good to have safeguards in place. However, when the safeguards cause my builds to run with the equivalent of -j1 on a 64-core machine, it is no longer feasible to use Nix in any way in a professional context.

Contributor

> cause my builds to run with the equivalent of -j1 on a 64-core machine

Are you sure? I would imagine it starting out with 64 jobs and if/when system load > 64, then new jobs are delayed until load falls below 64. But that'd mean it should run the overall build with >>1 job.

It would be cool to visualize it :-)
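
For reference, the GNU make manual describes -l roughly as: no new jobs are started if other jobs are already running and the load average is at least the given value. A minimal sketch of that rule follows. `may_start_job` and its three arguments are hypothetical names used only for illustration; they are not part of make or setup.sh, and real make versions may refine how the load is estimated.

```bash
# Simplified model of make's -l check (illustrative only).
# Arguments: number of running jobs, current load average, load limit.
# Returns 0 if a new job may start, non-zero otherwise.
may_start_job() {
    local running=$1 load=$2 limit=$3
    # With nothing running, make still starts a job, whatever the load.
    [ "$running" -eq 0 ] && return 0
    # Otherwise a new job starts only while the load is below the limit.
    awk -v load="$load" -v limit="$limit" 'BEGIN { exit !(load < limit) }'
}

if may_start_job 3 70.2 64; then echo "start"; else echo "wait"; fi   # load above limit
if may_start_job 0 70.2 64; then echo "start"; else echo "wait"; fi   # nothing running
```

Under this model a build on a machine whose load stays above the limit is never stopped completely, but it is also never allowed more than one job at a time, which would match the "-j1 equivalent" behaviour reported above.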

Member

I am not able to run builds with cores = 64 because the OOM killer is invoked:

[Image: Screen Shot 2022-07-15 at 06 05 27]

If I run the builds with cores = 8, -j8 -l8 will be passed to make. This is not good because the system has a load average higher than 8, which causes the builds to slow to a crawl.
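
One way to check whether a build is being throttled like this might be to compare the 1-minute load average with the value passed to -l. The sketch below assumes a Linux system, where `/proc/loadavg` holds the load averages as the first fields of a single line; `load1` is a hypothetical helper name, not an existing tool.

```bash
# Print the 1-minute load average from a loadavg-style file.
# The optional argument is a file path; it defaults to /proc/loadavg,
# which is Linux-specific.
load1() {
    local one _rest
    read -r one _rest < "${1:-/proc/loadavg}" && printf '%s\n' "$one"
}

# Only query the live value where the file actually exists.
if [ -r /proc/loadavg ]; then
    load1
fi
```

If the printed value stays above the cores setting (8 in the example above) while a build runs, make would be expected to hold back new jobs for as long as that remains true.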

Member Author

> Isn't -l${NIX_BUILD_CORES} a good protection against overloading/DoS-ing build machines? Maybe it could be configurable at runtime, but I personally think the default is good.

It is certainly a protection against overloading. However, the question is whether it is an efficient one. The default may be good for the main Hydra build farm, but on servers with mixed load it does not work well. For example, on a 48-core machine that dedicates half of its cores to a constant, non-build load and the other half to nix-build jobs, it results in the gross underutilization described above. Running nix with -l24 does not make the build use 24 cores as intended; instead, nix (or rather make) ends up using only a single core.
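
The arithmetic behind the 48-core example can be sketched as follows. This is a rough model that assumes the load average is about equal to the number of busy cores, which is a simplification; `load_headroom` is a hypothetical helper introduced only for this illustration.

```bash
# Rough headroom estimate (illustrative): how much load is left under
# the -l limit once the constant background load is subtracted.
# Arguments: background load, load limit. Never prints a negative value.
load_headroom() {
    awk -v bg="$1" -v limit="$2" \
        'BEGIN { h = limit - bg; if (h < 0) h = 0; print h }'
}

load_headroom 24 24   # background load 24, -l24
load_headroom 24 48   # same background load, hypothetical -l48
```

With a background load of 24 and -l24 the headroom is 0, so the load limit is already reached before the build starts and make would be expected to run one job at a time. The second call only shows how the headroom changes with a higher limit; it is not a recommendation from the thread.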

Contributor

@centromere @markuskowa: Good point.

+    ${enableParallelBuilding:+-j${NIX_BUILD_CORES}}
SHELL=$SHELL
$makeFlags "${makeFlagsArray[@]}"
$buildFlags "${buildFlagsArray[@]}"
@@ -1114,7 +1114,7 @@ checkPhase() {
# Old bash empty array hack
# shellcheck disable=SC2086
local flagsArray=(
-    ${enableParallelChecking:+-j${NIX_BUILD_CORES} -l${NIX_BUILD_CORES}}
+    ${enableParallelChecking:+-j${NIX_BUILD_CORES}}
SHELL=$SHELL
$makeFlags "${makeFlagsArray[@]}"
${checkFlags:-VERBOSE=y} "${checkFlagsArray[@]}"
@@ -1248,7 +1248,7 @@ installCheckPhase() {
# Old bash empty array hack
# shellcheck disable=SC2086
local flagsArray=(
-    ${enableParallelChecking:+-j${NIX_BUILD_CORES} -l${NIX_BUILD_CORES}}
+    ${enableParallelChecking:+-j${NIX_BUILD_CORES}}
SHELL=$SHELL
$makeFlags "${makeFlagsArray[@]}"
$installCheckFlags "${installCheckFlagsArray[@]}"
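
The three changed lines rely on the `${var:+word}` expansion, which yields `word` only when `var` is set and non-empty. Because the expansion is left unquoted, the old form splits into two separate flags. A minimal sketch of the before and after behaviour follows; `flags_before` and `flags_after` and their positional arguments are illustrative stand-ins, since setup.sh reads the variables directly.

```bash
# Print the flags produced by the line before the patch.
# Arguments: value of enableParallelBuilding, value of NIX_BUILD_CORES.
flags_before() {
    local enableParallelBuilding=$1 NIX_BUILD_CORES=$2
    # Unquoted on purpose, as in setup.sh, so the result is word-split.
    # shellcheck disable=SC2086
    set -- ${enableParallelBuilding:+-j${NIX_BUILD_CORES} -l${NIX_BUILD_CORES}}
    printf '%s\n' "$*"
}

# Print the flags produced by the line after the patch.
flags_after() {
    local enableParallelBuilding=$1 NIX_BUILD_CORES=$2
    # shellcheck disable=SC2086
    set -- ${enableParallelBuilding:+-j${NIX_BUILD_CORES}}
    printf '%s\n' "$*"
}

flags_before 1 8    # -j8 -l8
flags_after 1 8     # -j8
flags_after "" 8    # empty line: parallel building disabled, no flag at all
```

The same pattern applies to `enableParallelChecking` in checkPhase and installCheckPhase.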