[issue36061] zipfile does not handle arcnames with non-ascii characters on Windows

Serhiy Storchaka report at bugs.python.org
Thu Feb 21 00:40:42 EST 2019


Serhiy Storchaka <storchaka+cpython at gmail.com> added the comment:

You cannot just add .decode('cp437') to arcname.

1. This will fail if the ZIP archive contains file names encoded in UTF-8. Such names are already unicode objects and contain non-ASCII characters. Calling decode() on them first implicitly encodes them to str with the default ASCII codec, and that fails.

2. This will fail when targetpath is an 8-bit string containing non-ASCII characters. Currently this works (maybe incorrectly).

3. While cp437 is the only official encoding for ZIP archives when UTF-8 is not used, in practice other encodings (such as cp866) are used on localized Windows.
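
Points 1 and 3 can be illustrated with a small sketch. It is written for Python 3, so the Python 2 implicit encoding step is emulated explicitly. The helper py2_style_decode and the sample file name are hypothetical and exist only for this illustration; they are not part of zipfile.

```python
# Sketch only: runs on Python 3 and emulates the Python 2 behaviour explicitly.

def py2_style_decode(name, encoding="cp437"):
    """Hypothetical helper: mimic Python 2's unicode.decode().

    Python 2 first encoded the unicode object to str with the default
    (ASCII) codec, then decoded the result with the requested codec.
    """
    return name.encode("ascii").decode(encoding)


# Point 1: a name that was already decoded from UTF-8 cannot take that path.
try:
    py2_style_decode("привет.txt")
    implicit_encode_failed = False
except UnicodeEncodeError:
    implicit_encode_failed = True

# Point 3: name bytes written by a tool that used cp866 without the UTF-8 flag.
raw = "привет.txt".encode("cp866")
as_cp437 = raw.decode("cp437")  # what the ZIP format officially prescribes
as_cp866 = raw.decode("cp866")  # what the archive's author intended

print(implicit_encode_failed)
print(as_cp866 == "привет.txt")
print(as_cp437 == as_cp866)
```

Both decodes succeed because each code page maps every byte value, so nothing in the bytes themselves tells a reader which code page was meant.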

Fixing the problem without introducing other problems or breaking existing working code is hard. One possible solution is to use Python 3.
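
As a rough illustration of the Python 3 route, the sketch below writes a member with a non-ASCII name to an in-memory archive and reads it back. The member name is made up for the example. The check of bit 0x800 in flag_bits assumes the usual ZIP convention that this bit marks a UTF-8 encoded name.

```python
import io
import zipfile

member_name = "данные/привет.txt"  # hypothetical non-ASCII arcname

# Write the archive into memory; in Python 3 the arcname is a str.
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as zf:
    zf.writestr(member_name, b"hello")

# Read it back and inspect how the name was stored.
with zipfile.ZipFile(buf) as zf:
    info = zf.infolist()[0]
    name = info.filename
    utf8_flag = bool(info.flag_bits & 0x800)  # UTF-8 name flag
    data = zf.read(info)

print(name == member_name)
print(utf8_flag)
```

This only covers archives written with the UTF-8 flag set. Archives whose names were stored in a local code page without the flag still have the ambiguity described in point 3.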

I suggest closing this issue as "won't fix".

----------
nosy: +serhiy.storchaka

_______________________________________
Python tracker <report at bugs.python.org>
<https://bugs.python.org/issue36061>
_______________________________________

