gh-130379: Fix incorrect zipapp logic to avoid including the target in itself#130509
gh-130379: Fix incorrect zipapp logic to avoid including the target in itself#130509pfmoore merged 8 commits intopython:mainfrom
Conversation
|
@serhiy-storchaka you raised some concerns about the original change (unfortunately I'd merged before I saw them). Would you be willing to take a look at this revised approach? |
serhiy-storchaka
left a comment
There was a problem hiding this comment.
LGTM. This addresses also my concerns.
|
One possible improvement would be to check explicitly for This would catch the case where the user does something like The problem with this check is that it's not complete. A path equality check doesn't cover all cases (symlinks, use of I'm inclined to add the simple path equality check, with a comment explaining that it's not intended to catch every case where the target is part of the source, just to act as a sanity check for this specific common error. Does that seem like a reasonable compromise? |
|
Report an error if the target file overwrites a file in the list of source files to add. See #130379 (comment) for details. |
| # the zipfile module catch writing an archive to itself at a | ||
| # lower level, which could help here in cases that our check | ||
| # doesn't catch. | ||
| if target in files_to_add: |
There was a problem hiding this comment.
target can be a file-like object. target in files_to_add can invoke comparison of a file-like object with the Path objects. Is it what we need?
There was a problem hiding this comment.
I believe it's fine. A file-like object will simply test as not equal to all the Path objects. I guess I could add some sort of exception handling in case the user passes an object with a custom equality check that fails, but that seems like overkill to me.
I was going to add a note to that effect in the comment, but it's long enough already and I wasn't sure it was worth it. The fact that you pointed it out suggests that it would be, though.
Lib/zipapp.py
Outdated
| # thorough checks don't provide enough value to justify the extra | ||
| # cost. | ||
|
|
||
| # https://github.com/python/cpython/issues/104527 tracks making |
There was a problem hiding this comment.
I do not think that zipfile should catch this.
There was a problem hiding this comment.
OK. I'll remove that part of the comment. I think the same logic applies here - if adding the check can't be justified in zipfile, it's not worth adding it here either. We can stick with the simple but incomplete check.
|
@Yhg1s IMO this is a bug fix that should probably be backported to 3.13. The original bug was introduced in #106076, which isn't present in 3.12, so there's no need to backport further. It's technically a behaviour change (the original intent was to skip the offending file rather than error) but given that the intended behaviour didn't work, I think it's better to backport than leave the bug in 3.13. Before I add the backport label, do you have any objections? |
|
I'm going to request a backport. If anyone has any objections, or feels this shouldn't be backported, speak up - it's easy enough to revert if necessary. |
|
Thanks @pfmoore for the PR 🌮🎉.. I'm working now to backport this PR to: 3.13. |
…rget in itself (pythongh-130509) (cherry picked from commit 64ccbbb) Co-authored-by: Paul Moore <p.f.moore@gmail.com>
|
GH-130791 is a backport of this pull request to the 3.13 branch. |
|
@pfmoore Thanks for fixing up my PR. Was a university student when I made that PR, and boy does it show. |
The change in #106076 was intended to ensure that zipapp would not include the target file in the list of files that get included in the target. However the logic to do that was flawed, and expensive. This PR changes the approach to generate the list of files to add before creating the target, so there is no possibility of the target being included in the first place.