Description
Bug report
The ZipInfo object within ZipFile performs an explicit translation of the filename.
Lines 378 to 382 in 7db1d2e
I believe this is intended to make it easy to use on Windows, where you might pass an explicit pathname when creating the ZipInfo object. On Windows the filesystem separator is commonly `\` (although `/` is supported in many cases), so this forces the `filename` attribute to contain a filename in the Unix form.
This logic is used whether the ZipInfo object is created manually (usually to add a new file), or when the filename has been taken from an archive that is being extracted.
However, the logic is broken on systems where `os.sep` is anything other than `\` or `/`. On systems where `os.sep` is `.`, this means that if you try to create an archive with a file whose name has an extension (and so contains a `.`), the filename in the archive will be mangled. On such a system, extracting an archive will also mangle the filename.
To demonstrate this, it is possible to do a very simple command line example:
>>> import zipfile
>>> import os
>>> os.sep = '.'
>>> zipfile.ZipInfo('hello.txt')
<ZipInfo filename='hello/txt' file_size=0>
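The same check can be scripted without leaving `os.sep` modified for the rest of the session. This is only a minimal sketch: it assumes `unittest.mock` is acceptable for simulating another platform's separator, and the helper name `zipinfo_name_with_sep` is hypothetical, introduced purely for this illustration.

```python
import os
import zipfile
from unittest import mock


def zipinfo_name_with_sep(name, sep):
    """Return ZipInfo(name).filename while os.sep is temporarily set to sep.

    Hypothetical helper for this report; it only simulates another
    platform's separator and is not part of zipfile.
    """
    # patch.object restores the real os.sep when the block exits,
    # even if ZipInfo raises.
    with mock.patch.object(os, "sep", sep):
        return zipfile.ZipInfo(name).filename


if __name__ == "__main__":
    # Simulate RISC OS, where os.sep is "."
    print(zipinfo_name_with_sep("hello.txt", "."))
```

On an affected Python version the printed name should match the mangled form shown in the interactive session above.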
In the real world, this breaks any possibility of using this module on RISC OS, where the filesystem separator in `os.sep` is `.`. In the current Python 3 on RISC OS, the zipfile module will always mangle filenames that have standard extensions.
I believe that the intention of the object is that:
- the `filename` initialiser on the object, and the attribute, are in unicode format (this has been enforced since Python 3 by the explicit decodes in the archive member reading code).
- the `filename` attribute is formed as it would be stored in the archive, using `/` as a directory separator (the documentation states that "filename should be the full name of the archive member").
- the `filename` initialiser on the object is allowed to be supplied a path name on unix and windows systems, as a convenience (the referenced code will have been relied on by existing software).
As such, I believe the referenced code is broken. To retain the above assumptions and to allow the handling of zip archives on systems where `os.sep` is not `/` or `\`, the code should instead read:
if os.sep == "\\" and os.sep in filename:
    filename = filename.replace(os.sep, "/")
This removes the overzealous replacement of `os.sep` when the ZipInfo object is created.
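To make the difference concrete, here is a small side-by-side sketch of the two checks as standalone functions. The function names are hypothetical, the separator is passed in as a parameter rather than read from `os.sep`, and `translate_current` is an assumption about the shape of the existing check, inferred from the behaviour demonstrated above rather than copied from the referenced lines.

```python
def translate_current(filename, sep):
    """Assumed existing behaviour: any separator other than "/" is replaced."""
    if sep != "/" and sep in filename:
        filename = filename.replace(sep, "/")
    return filename


def translate_proposed(filename, sep):
    """Proposed behaviour: only the Windows separator is replaced."""
    if sep == "\\" and sep in filename:
        filename = filename.replace(sep, "/")
    return filename


if __name__ == "__main__":
    cases = [
        ("/", "dir/hello.txt"),    # Unix-like separator
        ("\\", "dir\\hello.txt"),  # Windows separator
        (".", "hello.txt"),        # RISC OS separator
    ]
    for sep, name in cases:
        print(repr(sep), translate_current(name, sep), translate_proposed(name, sep))
```

Under these assumptions the two functions agree for `/` and `\`, and differ only for separators such as `.`, where the proposed check leaves the name untouched.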
Further problems exist with the `from_file` method, which I shall raise separately.
There are some issues which might be related to this (but this change does not preclude them): #90139 and #92184.
Your environment
- CPython versions tested on: Python 3.9, 3.10
- Operating system and architecture: On OS X, simulating the problem seen on RISC OS.