stdenv/setup.sh: fix parallel make #174473
@@ -1075,7 +1075,7 @@ buildPhase() {
# Old bash empty array hack
# shellcheck disable=SC2086
local flagsArray=(
${enableParallelBuilding:+-j${NIX_BUILD_CORES} -l${NIX_BUILD_CORES}}
Did you think about setting this to -l 2.5 or maybe -l 5.0? I am not sure if that would make sense or not.
A constant value would not make sense here IMHO. Which value makes sense depends on the system and its configuration (i.e. the number of cores and Nix's max-jobs/cores settings): a value of 2.0 may be a good choice for a laptop but not for a big server with lots of CPU cores.
If max-jobs is set to the number of cores in the system, it could make sense to set -l <number cores> to avoid overly high system loads.
It would be interesting to know why -l${NIX_BUILD_CORES} was set here in the first place. Maybe @Ericson2314 knows more?
Isn't -l${NIX_BUILD_CORES} a good protection against overloading/DoS-ing build machines? Maybe it could be configurable at runtime, but I personally think the default is good.
It is good to have safeguards in place. However, when the safeguards cause my builds to run with the equivalent of -j1 on a 64-core machine, it is no longer feasible to use Nix in any way in a professional context.
> cause my builds to run with the equivalent of -j1 on a 64-core machine
Are you sure? I would imagine it starting out with 64 jobs and, if/when the system load exceeds 64, new jobs are delayed until the load falls below 64. But that would mean the overall build runs with far more than one job.
It would be cool to visualize it :-)
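The mental model in this comment can be written down as a small decision rule. The sketch below is only an illustration of that model, not make's actual implementation: `may_start_job` is a hypothetical helper, and treating a load exactly equal to the limit as "allowed" is an assumption based on the manual's wording ("above").

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of make or nixpkgs): decides whether another
# job may start under a "-j MAX_JOBS -l LOAD_LIMIT" style policy.
# Usage: may_start_job RUNNING_JOBS MAX_JOBS LOAD_AVG LOAD_LIMIT
may_start_job() {
    local running=$1 max_jobs=$2 load=$3 limit=$4

    # -j: never exceed the number of job slots.
    if [ "$running" -ge "$max_jobs" ]; then
        return 1
    fi

    # With nothing running, one job is always allowed, so the build still
    # makes progress even on a heavily loaded machine.
    if [ "$running" -eq 0 ]; then
        return 0
    fi

    # -l: additional jobs start only while the load average is not above the
    # limit. awk is used because the load average is a floating-point number.
    awk -v load="$load" -v limit="$limit" 'BEGIN { exit !(load <= limit) }'
}
```

Under this rule, `may_start_job 10 64 30.5 64` succeeds, while `may_start_job 10 64 70 64` fails until the load drops again. Whether a build ends up close to -j1 therefore depends on how long the load stays above the limit.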
> Isn't -l${NIX_BUILD_CORES} a good protection against overloading/DoS-ing build machines? Maybe it could be configurable at runtime, but I personally think the default is good.
It is certainly a protection against overloading. However, the question is whether it is an efficient protection. The default may be good for the main Hydra build farm. On servers with mixed load, this default does not work well. E.g., on a 48-core machine which dedicates half of its cores to a constant, non-build load and the other half to nix-build jobs, it results in the named problem of gross underutilization. Running nix with -l24 will not produce the desired result of using 24 cores for the build job; instead, nix (or make) will only use a single core.
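To check whether a given machine is in the situation described here, one can compare its current load average with the limit that would be passed to -l. The following is a rough diagnostic sketch: `load_limit_status` is a hypothetical helper, it reads the 1-minute figure from /proc/loadavg (Linux only), and the assumption that this is the number make compares against is not confirmed in this thread.

```shell
#!/usr/bin/env bash
# Hypothetical diagnostic: would "make -l LIMIT" be throttled at this load?
# Usage: load_limit_status LIMIT [LOAD]
# LOAD defaults to the 1-minute load average from /proc/loadavg (Linux only).
load_limit_status() {
    local limit=$1 load=${2:-}

    if [ -z "$load" ]; then
        # The first field of /proc/loadavg is the 1-minute load average.
        read -r load _ < /proc/loadavg
    fi

    # Floating-point comparison via awk; "above the limit" means throttled.
    if awk -v load="$load" -v limit="$limit" 'BEGIN { exit !(load > limit) }'; then
        echo "throttled: load $load is above limit $limit"
    else
        echo "unthrottled: load $load is within limit $limit"
    fi
}

# The 48-core example: about 24 cores busy with other work, build run with -l24.
load_limit_status 24 24.3
```

With the constant non-build load alone keeping the load average around 24, the limit is already reached before the build starts any job of its own.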
@centromere @markuskowa: Good point.
Is there something that can be done to move this forward? I am currently affected by this bug.
@Ericson2314 what do you think?
I am really unsure how this will interact with hydra.
Feedback from someone familiar with the Hydra build farm (and its load problems) is absolutely needed here. To be clear: from what I can judge here, this certainly will have a non-negligible impact on the Hydra load patterns.
This isn't just about hydra. And I don't think it's good to go without any So indeed the point is to protect the machine from overloading, even though it's quite a crude method. My experience of using the current setting on a 32-core is OK, but I can imagine it could be considered limiting if you have lots of non-CPU load (e.g. from rotating drives). For an easy step, I think it would be nice to allow to override make's
I am closing this PR, since it is not mergeable in its current form and would probably cause real trouble on Hydra's build farm.
@markuskowa Any change to
No, we change
Okay. What do y'all think of this plan?
env["NIX_CORE_LIMIT"] = (format("%d") % settings.coreLimit).str();
NIX_CORE_LIMIT="${NIX_CORE_LIMIT:-$NIX_BUILD_CORES}"
export NIX_CORE_LIMIT
local flagsArray=(
${enableParallelBuilding:+-j${NIX_BUILD_CORES}}
...
)
if ((NIX_CORE_LIMIT > 0)); then
flagsArray+=("-l${NIX_CORE_LIMIT}")
fi
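For anyone who wants to try the proposed logic outside of stdenv, here is a self-contained sketch. `make_parallel_flags` is a hypothetical wrapper and not part of setup.sh; `NIX_CORE_LIMIT` is the name proposed above and does not exist in Nix today, and defaulting `NIX_BUILD_CORES` to 1 is an assumption made so the snippet runs standalone.

```shell
#!/usr/bin/env bash
# Hypothetical wrapper around the proposed flag logic; prints one flag per line.
make_parallel_flags() {
    # Assumption: fall back to a single core when NIX_BUILD_CORES is unset.
    local cores=${NIX_BUILD_CORES:-1}
    # As in the plan: the load limit defaults to the core count when
    # NIX_CORE_LIMIT (proposed name) is unset or empty.
    local limit=${NIX_CORE_LIMIT:-$cores}
    local flags=()

    # -j is only passed for packages that opt in to parallel building.
    if [[ -n ${enableParallelBuilding:-} ]]; then
        flags+=("-j${cores}")
    fi

    # A limit of 0 disables the load limit: -l is omitted entirely.
    if (( limit > 0 )); then
        flags+=("-l${limit}")
    fi

    if (( ${#flags[@]} > 0 )); then
        printf '%s\n' "${flags[@]}"
    fi
}
```

Calling it as `enableParallelBuilding=1 NIX_BUILD_CORES=64 make_parallel_flags` keeps today's behavior (-j64 and -l64), while adding `NIX_CORE_LIMIT=0` yields only -j64.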
I had the same solution in mind. Only the name "core limit" should be better aligned with the wording used by make and ninja, i.e. "load average".
@ck3d @vcunat @markuskowa I've submitted some PRs to address this:
Description of changes
The stdenv setup phase set both the j and the l option for make to $NIX_BUILD_CORES (e.g. nix-build's --cores option).
However, the l option sets an upper bound for the system load. If this load is exceeded, make basically runs with -j 1.
This leads to unwanted behavior and slow builds. For example: on a system with 48 cores and a load of 24, make will only run one job at a time if $NIX_BUILD_CORES is set to less than 24, leaving the system underutilized.
It is not clear to me why l should be set to $NIX_BUILD_CORES. This PR removes the l option from the setup phase.
For reference from the GNU make manual:
"When the system is heavily loaded, you will probably want to run fewer jobs than when it is lightly loaded. You can use the ‘-l’ option to tell make to limit the number of jobs to run at once, based on the load average. The ‘-l’ or ‘--max-load’ option is followed by a floating-point number. For example, -l 2.5 will not let make start more than one job if the load average is above 2.5. The ‘-l’ option with no following number removes the load limit, if one was given with a previous ‘-l’ option."
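The effect of the load limit can be observed with a throwaway Makefile. The sketch below only generates such a Makefile; `write_demo_makefile` and the job names are made up for illustration, and the timing comparison in the trailing comments requires GNU make and depends on the machine's load at the time.

```shell
#!/usr/bin/env bash
# Hypothetical demo generator: writes a Makefile with N independent jobs that
# each sleep for one second.
# Usage: write_demo_makefile DIR N
write_demo_makefile() {
    local dir=$1 n=$2 i=1 targets=""

    mkdir -p "$dir"
    while [ "$i" -le "$n" ]; do
        targets="$targets job$i"
        i=$((i + 1))
    done

    {
        printf 'all:%s\n' "$targets"
        printf '.PHONY: all%s\n' "$targets"
        # All jobN targets share one recipe; the recipe line must start with a tab.
        printf '%s:\n\tsleep 1\n' "${targets# }"
    } > "$dir/Makefile"
}

demo_dir=$(mktemp -d)
write_demo_makefile "$demo_dir" 8

# Compare wall-clock times by hand:
#   time make -C "$demo_dir" -j8          # jobs can run side by side
#   time make -C "$demo_dir" -j8 -l0.01   # limit so low that it is usually exceeded
```

On a machine whose load average is above 0.01, the second invocation should take noticeably longer than the first, which is the behavior this PR description complains about.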