[issue16310] zipfile: allow surrogates in filenames

Stefan Holek report at bugs.python.org
Tue Oct 30 11:06:21 CET 2012


Stefan Holek added the comment:

> It's possible to distribute Python packages with non-ASCII filenames.

Well, it wasn't until very recently (distribute 0.6.29):
https://bitbucket.org/tarek/distribute/issue/303/no-support-for-unicode-manifest-files
Unless we are not talking about the same thing, which is possible. ;-)

>> So yes, I have Latin-1 bytes on the filesystem,
>> even though my locale is UTF-8.

> Your system is not configured correctly. If you would like to distribute such an invalid filename,
> how do you plan to access it on other platforms where the filename is decoded differently?
> It would be safer to build your project on a well-configured system.

This was done on purpose, to test how Python fares. Such files can easily come into existence, e.g. when cloning a Git repo created on a different system. I am not after "correct" ZIP files in this case, I am after Python not raising UnicodeErrors when it is supposed to a) support non-ASCII module names and b) support surrogates.

    python setup.py sdist --formats=gztar -> works

    python setup.py sdist --formats=zip -> UnicodeError
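
For illustration, here is a minimal sketch of how such a filename ends up containing a surrogate, and why a strict encode then fails. The file name and the helper function are made up for the example; it only shows the surrogateescape mechanism (PEP 383) and is not the actual code path in zipfile or distutils.

```python
# Hypothetical file name: Latin-1 bytes, which are not valid UTF-8.
raw = b"caf\xe9.txt"

# With a UTF-8 locale, the undecodable byte 0xE9 is mapped to the
# lone surrogate U+DCE9 by the surrogateescape error handler.
name = raw.decode("utf-8", "surrogateescape")


def encodes_strictly(s):
    """Return True if s can be encoded as UTF-8 with the default strict handler."""
    try:
        s.encode("utf-8")
        return True
    except UnicodeEncodeError:
        return False


# A strict encode rejects the lone surrogate ...
strict_ok = encodes_strictly(name)

# ... while encoding with surrogateescape restores the original bytes.
round_trip = name.encode("utf-8", "surrogateescape")
```

Any code that encodes the name strictly will raise a UnicodeEncodeError on it, whereas code that passes surrogateescape gets the original bytes back unchanged.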

If I am the only one to think this is wrong, then so be it. Our current workaround is to disallow surrogates in the manifest. /me shrugs.

----------

_______________________________________
Python tracker <report at bugs.python.org>
<http://bugs.python.org/issue16310>
_______________________________________

