-
-
Notifications
You must be signed in to change notification settings - Fork 14.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Python fix venv #358823
base: master
Are you sure you want to change the base?
Python fix venv #358823
Conversation
05c1cc3
to
94ebc80
Compare
The way we build python environments is subtly broken. A python environment should be semantically identical to a vanilla Python installation in, say, /usr/local. The current implementation, however, differs in two important ways. The first is that it's impossible to use python packages from the environment in python virtual environments. The second is that the nix-generated environment appears to be a venv, but it's not. This commit changes the way python environments are built: * When generating wrappers for python executables, we inherit argv[0] from the wrapper. This causes python to initialize its configuration in the environment with all the correct paths. * We remove the sitecustomize.py file from the base python package. This file was used tweak the python configuration after it was incorrectly initialized. That's no longer necessary. The end result is that python environments no longer appear to be venvs, and behave more like a vanilla python installation. In addition it's possible to create a venv using an environment and use packages from both the environment and the venv. (cherry picked from commit 234bb31)
When building a python environment's bin directory, we now detect wrapped python scripts from installed packages, and generate unwrapped copies with the environment's python executable as the interpreter. (cherry picked from commit 9611885)
Signed-off-by: magic_rb <[email protected]>
94ebc80
to
0541966
Compare
|
I'm going to cherry-pick this on top of my current system nixpkgs. I'll try to daily drive this for a bit. Meanwhile, I've been coming back to this PR once in a while and thinking about how we could make the situation less jank. I'll publish the resulting review comments shortly. |
@@ -42,7 +42,15 @@ let | |||
if [ -f "$prg" ]; then | |||
rm -f "$out/bin/$prg" | |||
if [ -x "$prg" ]; then | |||
makeWrapper "$path/bin/$prg" "$out/bin/$prg" --set NIX_PYTHONPREFIX "$out" --set NIX_PYTHONEXECUTABLE ${pythonExecutable} --set NIX_PYTHONPATH ${pythonPath} ${lib.optionalString (!permitUserSite) ''--set PYTHONNOUSERSITE "true"''} ${lib.concatStringsSep " " makeWrapperArgs} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this is the only place where pythonPath
was used in this file, so it should probably be removed from the above let-in.
import os | ||
_nix_direct = os.environ.pop("NIX_PYTHON_IN_ENV", None) != "true" | ||
sys.argv[0] = '"'$(readlink -f "$f")'"' if _nix_direct else sys.argv[0] | ||
functools.reduce(lambda k, p: site.addsitedir(p, k), ['"$([ -n "$program_PYTHONPATH" ] && (echo "'$program_PYTHONPATH'" | sed "s|:|','|g") || true)"'], site._init_pathinfo()) if _nix_direct else None |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Honestly, I am still kind of conflicted about all this wrapping/unwrapping nonsense.
It seems to me that the sys.argv
and site.addsitedir
shenanigans here are basically doing the same thing as the removed sitecustomize.py
. So shouldn't we also remove this "preamble" and instead just set substitutions.executable
to an instance of python
that already has all the required dependencies?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've thought about it some more, and it seems to me that the primary reason why we can't do this is that it would create a circular reference.
Ideally, instead of doing site.addsitedir
at runtime, the shebang in /nix/store/aaaa-python3.12-foo-1.0.0/bin/foo
should just be replaced with #!/nix/store/bbbb-python3-3.12.8-env/bin/python
. Where bbbb
is an instance of python that would just "pick up" the correct environment (from /nix/store/bbbb-python3-3.12.8-env/lib/python3.12/site-packages
).
However, /nix/store/bbbb-python3-3.12.8-env/lib/python3.12/site-packages
would also need to contain symlinks to /nix/store/aaaa-python3.12-foo-1.0.0/lib/python3.12/site-packages/libfoo
(because bin/foo
also needs libfoo
). But this is not possible since this would create a circular dependency aaaa (bin/foo) -> bbbb (bin/python) -> bbbb (lib/*/site-packages) -> aaaa (lib/*/site-packages/libfoo)
.
I think that this could be solved without creating any circular references if #170577 (or something like it) were to be implemented. Alternatively, the whole environment could be included inside the aaaa
derivation (something like /nix/store/aaaa-python3.12-foo-1.0.0/nix-support/env/...
).
Honestly, all of this is kind of yucky 😒 (less yucky than the current situation, of course).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You lost me :) i don't know nearly enough about python to effectively and correctly implement your suggestions. I went to fix the immediate problem if venvs being slightly broken, which I managed. I can implement the magic comment approach but as for fixing up the PR you linked, its probably beyond me
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IIRC, @cwp and some other people were also considering removing the wrapping/unwrapping logic, but it was never actually implemented. My comments above were mainly my way of documenting my thought process for why removing this extra wrapping is more complicated than it might seem at first glance.
So yeah, this is mostly me talking to myself, but also kind of hoping that somebody with a better understanding of the current python3Packages
machinery might swing by and sanity check my thoughts.
Removing the preamble-wrap thing should probably be beyond the scope of this PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah no worries feel free to think out-loud, it helps and your input is very welcome. Might be good merge this, if no one has objections so that we have a partial if cursed solution already in. Dunno if anyone who knows enough about the python infra will show up soon.
if [ -f ".$prg-wrapped" ] && ( cat ".$prg-wrapped" | head -n 1 | grep -q "python" ) ; then | ||
echo "#!${pythonExecutable}" >> "$out/bin/$prg" | ||
echo "import os" >> "$out/bin/$prg" | ||
echo 'os.environ["NIX_PYTHON_IN_ENV"] = "true"' >> "$out/bin/$prg" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Even if we assume that we really DO need to wrap/unwrap python scripts, I still don't like that we have to write to the environment here simply to disable a piece of code a few lines later in the same file.
We are using the magic NIX_PYTHON_IN_ENV
variable here, but the only other place this variable is used in nixpkgs is in the "preamble" in wrap-python.nix
. Why do we need to touch the environment (or search the correct line to delete like in the previous iteration of this code)?
Since we control both the wrapping and the unwrapping logic, couldn't we massively simplify all this. Keep the preamble as it was before (always set sys.argv[0]
and call site.addsitedir
), but add a "magic comment" to the end (say, # NIX_PYTHON_ENV_PREAMBLE
). Then, in the unwrapping code, just delete the line with the magic comment (sed '/NIX_PYTHON_ENV_PREAMBLE/d'
).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, it might be better to replace the old shebang with the new one, instead of just prepending the new one. Prepending the new shebang means that the resulting derivation contains references to both the old and the new python executable, which might be confusing/suboptimal in terms of why-depends/requisites/closures.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right the magic comment thing would work and is quite easy to do. I'm personally hesitant to do it that way because deleting anything carries the risk of breaking things. Could be that at some point something else in nixpkgs wraps a wrapped python executable and then this code tries to undo the preamble which proceeds to delete the other wrap because it ended up on the same line. I know it's very unlikely but it just doesn't seem like a clean approach, neither is my solution, it's awful in a different way.
As for the still remaining reference, yeah you're right I didn't think of that. With my solution it's not possible to get rid of it.
Thanks for the daily drive, I don't use any python things on my laptop sadly.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could be that at some point something else in nixpkgs wraps a wrapped python executable and then this code tries to undo the preamble which proceeds to delete the other wrap because it ended up on the same line.
I don't think that this is possible, because that would imply that this hypothetical wrapping code inserts the code onto the same line with already existing code (which is broken anyway) instead of inserting a new line (like this wrapping code does).
If anything, I think that my proposed solution might be slightly more robust, because it would actually verify (by checking for the presence of the magic comment) that the script that we are trying to unwrap was actually wrapped by wrap-python.nix
.
Also, I noticed, that the current code doesn't clean up the .$prg-wrapped
scripts after replacing $prg
with the proper "unwrapped" version.
As for the still remaining reference, yeah you're right I didn't think of that. With my solution it's not possible to get rid of it.
I think, that once we've verified that the NIX_PYTHON_ENV_PREAMBLE
comment is present in the file, removing the old shebang line should be 100% safe, since the shebang line must be the first line in the script for it to function as a shebang in the first place.
Please, let me know if you are planning on adding any of the changes that I've suggested in the previous review. No pressure, if you are busy or unwilling to add these changes (I understand your reasoning for implementing the current approach, even if I still think that we ought to try to remove as much of this wrapping/unwrapping cruft as possible). I'd just like to know if it's worth waiting for you to add these changes or if I should make my own PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually, now that I started thinking about the .$prg-wrapped
scripts, I am suddenly unsure if the current "unwrapping" implementation is 100% correct.
The "wrapping" performed in wrap-python.nix
consists of two parts:
$prg
- is a bash script produced bywrapProgram
that add stuff to$PATH
, setsPYTHONNOUSERSITE
and then calls.$prg-wrapped
(2).$prg-wrapped
- is a patched version of the original python script that is being wrapped, but with a magic preamble that setsargv[0]
back to$prg
(1) and sets up the site dir appropriately
The current "unwrapping" logic replaces $prg
with .$prg-wrapped
, but with the argv[0]
and site dir logic disabled and with a different shebang (pointing to the python interpreter with the correct environment setup).
However, this might be subtly wrong, since it only accounts for (2) in the original wrapping logic. At the very least, PYTHONNOUSERSITE
is no longer being set. Additionally, PATH
isn't being set correctly (although it seems that this problem is present in the current master anyway). I think that ideally, the currently constructed bin
folder should be added to the PATH
.
This is a continuation of #326094 by @mcdonc which is itself a continuation of #297628 by @cwp. I approached the disabling of the preamble differently than those two previous PRs, namely my solution relies on prefixing the preamble with a setting of the
NIX_PYTHON_IN_ENV
environment variable, which causes the preamble to disable itself and then delete the injected variable. This way it effectively deletes it, without having to do textual matching on python code. I've tested this using the following snippet:result/bin/meson
is functional, so isresult/bin/python
in whichimport zope.interface
andimport mesonbuild
works.meson
itself also relies on Python while building, so it will catch any problems sooner. I'll next build pretalx and verify that works, though I doubt I managed to break that.Furthermore I ran and verified the showcased commands from #297628 still work:
Finally finished the pretalx build
and it also launches