zipapp fails cryptically on large ZIP64-formatted archives because zipimport.py doesn't support ZIP64. #95706

Closed
@thundergolfer

Bug report

When using zipapp (and other related projects) to create standalone Python applications, my colleagues and I ran into trouble with any large but valid .zip archive.

It turns out that Lib/zipimport.py doesn't support ZIP64 and 'gets lost' when reading ZIP64 archives, so it finds 0 files in them and reports that a __main__.py cannot be found.
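For readers unfamiliar with the format, here is a minimal sketch of how one might check whether an archive ends with a ZIP64 end-of-central-directory locator, which is the structure a ZIP64-unaware reader does not know to look for. The signatures and record sizes come from the ZIP file format specification. The function name `uses_zip64_eocd` is hypothetical, and this is not how zipimport itself is structured.

```python
import io

EOCD_SIG = b"PK\x05\x06"           # end of central directory record
ZIP64_LOCATOR_SIG = b"PK\x06\x07"  # ZIP64 end of central directory locator
EOCD_MIN_SIZE = 22                 # EOCD record size without a comment
ZIP64_LOCATOR_SIZE = 20            # fixed size of the ZIP64 locator
MAX_COMMENT = 0xFFFF               # the archive comment is at most 65535 bytes


def uses_zip64_eocd(fileobj) -> bool:
    """Return True if a ZIP64 locator sits directly before the EOCD record.

    `fileobj` must be a seekable binary file object. Simplification: the
    last EOCD signature in the tail is taken to be the real record, so a
    comment that happens to contain the signature bytes would confuse it.
    """
    fileobj.seek(0, io.SEEK_END)
    file_size = fileobj.tell()
    tail_len = min(file_size, EOCD_MIN_SIZE + MAX_COMMENT + ZIP64_LOCATOR_SIZE)
    fileobj.seek(file_size - tail_len)
    tail = fileobj.read(tail_len)

    eocd_pos = tail.rfind(EOCD_SIG)
    if eocd_pos < 0:
        raise ValueError("not a ZIP file: no end of central directory record")

    # When present, the locator immediately precedes the EOCD record.
    locator_pos = eocd_pos - ZIP64_LOCATOR_SIZE
    return locator_pos >= 0 and tail[locator_pos:locator_pos + 4] == ZIP64_LOCATOR_SIG
```

Calling it as `uses_zip64_eocd(open("zip64_sized.pyz", "rb"))` on the archive from the reproduction below should tell you whether the ZIP64 structures are present.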

Minimal reproduction

Create a ZIP64 archive containing a simple __main__.py, using random binary data to pad out the size.

"""
Use the `zipapp` module to write a Zip64 archive to disk.
(Alternatively the `zipfile` module can be used directly.)
"""
import os
import pathlib
import tempfile
import zipapp
import zipfile

def main() -> int:
    num_dummy_files = 10
    dummy_file_size = int((1.5 * zipfile.ZIP64_LIMIT) // num_dummy_files)
    temp_dir = tempfile.TemporaryDirectory()
    for i in range(num_dummy_files):
        with open(pathlib.Path(temp_dir.name, f"{i}.bin"), "wb") as dummy_f:
            dummy_f.write(os.urandom(dummy_file_size))
    with open(pathlib.Path(temp_dir.name, "__main__.py"), "w") as main_f:
        main_f.write("print('Hello from the zipapp __main__.py!')")

    zipapp.create_archive(temp_dir.name, "zip64_sized.pyz")
    temp_dir.cleanup()
    return 0

if __name__ == "__main__":
    raise SystemExit(main())

Attempt to execute the large zipapp.

python3.11 zip64_sized.pyz
/usr/local/bin/python3.11: can't find '__main__' module in '/workspaces/cpython/zip64_sized.pyz'
# or, using an interpreter compiled from latest `main` (698fa8bf)
./python zip64_sized.pyz
/workspaces/cpython/python: can't find '__main__' module in '/workspaces/cpython/zip64_sized.pyz'

The __main__ module is of course present in the archive, which prompts head scratching until you dig into the CPython source and the ZIP file spec.
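A quick way to confirm that the archive itself is fine is to ask the zipfile module, which does understand ZIP64, whether it can see the entry. The helper name `archive_has_main` below is hypothetical; it is only a small diagnostic sketch.

```python
import zipfile


def archive_has_main(path) -> bool:
    """Return True if the archive at `path` has a top-level __main__.py."""
    with zipfile.ZipFile(path) as zf:
        return "__main__.py" in zf.namelist()
```

If `archive_has_main("zip64_sized.pyz")` returns True while the interpreter still reports that it can't find '__main__', the problem is on the reading side rather than in the archive.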

How to fix

The zipapp module will happily produce ZIP64 archives because the underlying zipfile module has enabled ZIP64 support by default (allowZip64=True) since Python 3.4.
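As a possible workaround on the producing side, one could build the archive with zipfile directly and pass `allowZip64=False`, which is documented to raise an exception when the archive would require ZIP64 extensions. This is a sketch, not a drop-in replacement for `zipapp.create_archive`: the function name `create_archive_without_zip64` is hypothetical, it writes no shebang line, and it assumes `target` lives outside `source_dir`.

```python
import pathlib
import zipfile


def create_archive_without_zip64(source_dir, target) -> None:
    """Zip `source_dir` into `target`, refusing to emit ZIP64 structures."""
    source = pathlib.Path(source_dir)
    if not (source / "__main__.py").is_file():
        raise ValueError(f"{source} has no __main__.py")

    # With allowZip64=False, zipfile raises zipfile.LargeZipFile instead of
    # switching to ZIP64 when a size, offset, or entry count gets too large.
    with zipfile.ZipFile(target, "w", allowZip64=False) as zf:
        for child in sorted(source.rglob("*")):
            if child.is_file():
                zf.write(child, child.relative_to(source).as_posix())
```

With this approach an oversized application fails loudly at build time with `zipfile.LargeZipFile`, rather than at run time with the confusing "can't find '__main__' module" message.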

The 'full' fix for this issue would be to refactor Lib/zipimport.py to support Zip64 loading.

A first fix, I think, could be simply to provide a clearer error message when Lib/zipimport.py is given a ZIP64 archive.
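To make the idea of a clearer message concrete, a launcher or build script could run a preflight check along these lines. It is only a heuristic built on documented `zipfile.ZipInfo` attributes and does not reflect any actual zipimport code; the function name, the `limit` parameter, and the error wording are all hypothetical.

```python
import zipfile

ZIP_FILECOUNT_LIMIT = 0xFFFF  # most entries a non-ZIP64 archive can record


def check_no_zip64_needed(path, limit=zipfile.ZIP64_LIMIT) -> None:
    """Raise ValueError with an explicit message if the archive looks like
    it needs ZIP64 (too many entries, or sizes/offsets above `limit`)."""
    with zipfile.ZipFile(path) as zf:
        infos = zf.infolist()

    if len(infos) > ZIP_FILECOUNT_LIMIT:
        raise ValueError(
            f"{path}: {len(infos)} entries require ZIP64, "
            "which the importer may not support"
        )

    for info in infos:
        # header_offset + compress_size is a lower bound on where this
        # entry's data ends inside the archive.
        end_lower_bound = info.header_offset + info.compress_size
        if max(info.file_size, end_lower_bound) > limit:
            raise ValueError(
                f"{path}: entry {info.filename!r} needs ZIP64 "
                f"(size or offset above {limit} bytes), "
                "which the importer may not support"
            )
```

The `limit` parameter defaults to `zipfile.ZIP64_LIMIT`, the same constant the reproduction script uses to size its padding, and is exposed mainly so the check can be exercised on small archives.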

I'm happy to provide patches for each of these fixes in turn, if there's support for it. :)

Edit: I began attempting to raise an exception on ZIP64 archives, but it seems that when an exception is raised within zipimport.py the program doesn't exit. Instead it continues starting the interpreter, which then tries to parse the archive as source code:

Traceback (most recent call last):
  File "<frozen zipimport>", line 91, in __init__
ValueError: ZIP64 archives are unsupported
SyntaxError: Non-UTF-8 code starting with '\xff' in file /workspaces/cpython/zip64_sized.pyz on line 2, but no encoding declared; see https://peps.python.org/pep-0263/ for details

Your environment

- CPython versions tested on:

  • Python 3.12.0a0 (heads/main:698fa8bf60, Aug 5 2022, 08:59:06) [GCC 9.4.0] on linux
  • Python 3.11.0b5+ (heads/3.11:8570f6d1a0, Aug 2 2022, 07:52:11) [GCC 9.4.0] on linux
  • Python 3.10.4 (main, Apr 1 2022, 20:52:12) [GCC 9.4.0] on linux

- Operating system and architecture:

uname -a
Linux codespaces-5a1930 5.4.0-1086-azure #91~18.04.1-Ubuntu SMP Thu Jun 23 20:33:05 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux

Labels: stdlib, type-bug
