gh-118107: Fix zipimporter ZIP64 handling.#118108
Conversation
The code that handles too large files and offsets was missing an import.
itamaro
left a comment
There was a problem hiding this comment.
thank you for catching this and the quick patch!
could you add a test case that covers this code path please?
@itamaro Absolutely. In Pex my test builds a >4GB file to add to the zip to trigger this (I run the test if |
That's a good point! |
|
right, I think either using |
@Eclips4 and @itamaro how about I add a test in this PR with code similar to that large file test setup code you pointed me at with an eye to parameterization of the file size and then if I get another positive ACK that that code should be moved to support and de-deduped, then I'll detour to pull it out to a new PR? |
sounds good to me |
| self.doTest(".py", files, "f65536", comment=b"c" * ((1 << 16) - 1)) | ||
|
|
||
| def testZip64LargeFile(self): | ||
| support.requires( |
There was a problem hiding this comment.
I had introduced the decorator from Lib/test/test_largefile.py and then stumbled upon the resources support (-u) that seems more widely used; so I went with it instead. The tests only run (and pass) with -ulargefile:
./python -m test test_zipimport -ulargefile -v -m testZip64LargeFile
== CPython 3.13.0a6+ (heads/fix-issue-118107:8d55e2f226, Apr 19 2024, 20:12:18) [GCC 11.4.0]
== Linux-5.15.146.1-microsoft-standard-WSL2-x86_64-with-glibc2.35 little-endian
== Python build: debug
== cwd: /home/jsirois/dev/python/cpython/build/test_python_worker_191158æ
== CPU count: 16
== encodings: locale=UTF-8 FS=utf-8
== resources (1): largefile
Using random seed: 3331549713
0:00:00 load avg: 3.80 Run 1 test sequentially
0:00:00 load avg: 3.80 [1/1] test_zipimport
testZip64LargeFile (test.test_zipimport.CompressedZipImportTestCase.testZip64LargeFile) ... ok
testZip64LargeFile (test.test_zipimport.UncompressedZipImportTestCase.testZip64LargeFile) ... ok
----------------------------------------------------------------------
Ran 2 tests in 42.905s
OK
test_zipimport passed in 43.0 sec
== Tests result: SUCCESS ==
1 test OK.
Total duration: 43.0 sec
Total tests: run=2 (filtered)
Total test files: run=1/1 (filtered)
Result: SUCCESS
| values = struct.unpack_from(f"<{min(num_extra_values, 3)}Q", | ||
| extra_data, offset=4) | ||
| import struct | ||
| values = list(struct.unpack_from(f"<{min(num_extra_values, 3)}Q", |
There was a problem hiding this comment.
The new test caught the need for the list here. Prior to the fix you'd see:
ERROR: testZip64LargeFile (test.test_zipimport.UncompressedZipImportTestCase.testZip64LargeFile)
----------------------------------------------------------------------
Traceback (most recent call last):
File "<frozen importlib._bootstrap_external>", line 1510, in _path_importer_cache
KeyError: '/home/jsirois/dev/python/cpython/build/test_python_worker_140690æ/junk95142.zip'
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/home/jsirois/dev/python/cpython/Lib/test/test_zipimport.py", line 89, in wrapper
return fun(*args, **kwargs)
~~~^^^^^^^^^^^^^^^^^
File "/home/jsirois/dev/python/cpython/Lib/test/test_zipimport.py", line 89, in wrapper
return fun(*args, **kwargs)
~~~^^^^^^^^^^^^^^^^^
File "/home/jsirois/dev/python/cpython/Lib/test/test_zipimport.py", line 849, in testZip64LargeFile
self.doTestWithPreBuiltZip(".py", "large_file")
~~~~~~~~~~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^
File "/home/jsirois/dev/python/cpython/Lib/test/test_zipimport.py", line 157, in doTestWithPreBuiltZip
mod = importlib.import_module(".".join(modules))
~~~~~~~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^
File "/home/jsirois/dev/python/cpython/Lib/importlib/__init__.py", line 88, in import_module
return _bootstrap._gcd_import(name[level:], package, level)
~~~~~~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "<frozen importlib._bootstrap>", line 1387, in _gcd_import
File "<frozen importlib._bootstrap>", line 1360, in _find_and_load
File "<frozen importlib._bootstrap>", line 1322, in _find_and_load_unlocked
File "<frozen importlib._bootstrap>", line 1262, in _find_spec
File "<frozen importlib._bootstrap_external>", line 1553, in find_spec
File "<frozen importlib._bootstrap_external>", line 1525, in _get_spec
File "<frozen importlib._bootstrap_external>", line 1512, in _path_importer_cache
File "<frozen importlib._bootstrap_external>", line 1488, in _path_hooks
File "<frozen zipimport>", line 98, in __init__
File "<frozen zipimport>", line 528, in _read_directory
AttributeError: 'tuple' object has no attribute 'pop'
There was a problem hiding this comment.
another good catch! glad you added the test case!
| values = struct.unpack_from(f"<{min(num_extra_values, 3)}Q", | ||
| extra_data, offset=4) | ||
| import struct | ||
| values = list(struct.unpack_from(f"<{min(num_extra_values, 3)}Q", |
There was a problem hiding this comment.
another good catch! glad you added the test case!
Misc/NEWS.d/next/Library/2024-04-19-09-28-43.gh-issue-118107.Mdsr1J.rst
Outdated
Show resolved
Hide resolved
…dsr1J.rst Co-authored-by: Kirill Podoprigora <kirill.bast9@mail.ru>
The code to generate the test data is included in the test itself and will run when testdata is not found.
|
Alrighty @gpshead, I think this is good to go. |
encukou
left a comment
There was a problem hiding this comment.
The checked-in files contain mtimes, so they aren't reproducible. A few bytes were different when I re-ran the command.
I don't think this should block the PR.
|
Thanks! Deleting the files and running the test now gives bit-for-bit identical results. I've added |
Add missing import to code that handles too large files and offsets. Use list, not tuple, for a mutable sequence. Add tests to prevent similar mistakes. --------- Co-authored-by: Gregory P. Smith [Google LLC] <greg@krypto.org> Co-authored-by: Kirill Podoprigora <kirill.bast9@mail.ru>
Summary: My fix matches upstream python/cpython#118108 (comment) Reviewed By: itamaro Differential Revision: D72491515 fbshipit-source-id: 8a469cb9c6612997dd42e8a349c947df32a6586a
The code that handles too large files and offsets was missing an import
and was manipulating a tuple as if it was a list.