[issue28080] Allow reading member names with bogus encodings in zipfile

Stephen J. Turnbull report at bugs.python.org
Sun Sep 11 16:28:40 EDT 2016


Stephen J. Turnbull added the comment:

Re: wait for 3.7 if reviewers are busy, understood.  N.B. Contributor agreement is now on file (I received the PDF from python.org already).

Re: existing patches:
My patch is very similar in the basic approach to Sergey Dorofeev's patch in issue10614.  Main differences:
(1) Sergey's patch treats the "encoding" parameter as a first-class citizen with a default of cp437, whereas mine treats it as a special case defaulting to None, with utf-8 and cp437 getting special treatment as the standard encodings.  Subtle point, but I like it this way.
(2) My patch includes support for the argument in the __main__ script.
(3) Sergey's patch misses one execution path in the current code, so it needs an update before it can be applied.

The Japanese patches by umedoblock are very Japanese-centric, and worse, they try to guess the encoding by the crude method of seeing what decodes successfully.  They are not acceptable IMO.
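
A small illustration of why "whatever decodes successfully" is a crude test.  This is my own sketch, not umedoblock's code; guess_by_trial and the candidate list are made up for the example.  The point is that cp437 assigns a character to every byte value, so it never fails and will shadow any encoding tried after it.

```python
# Illustrative only: a trial-and-error guesser of the kind criticised
# above.  The function name and candidate order are assumptions.
def guess_by_trial(raw, candidates):
    """Return (encoding, text) for the first candidate that decodes."""
    for enc in candidates:
        try:
            return enc, raw.decode(enc)
        except UnicodeDecodeError:
            continue
    return None, None


original = "\u65e5\u672c\u8a9e.txt"
raw = original.encode("shift_jis")

# utf-8 rejects these bytes, but cp437 accepts anything, so the
# guesser stops there and never reaches shift_jis.
guessed_enc, guessed_name = guess_by_trial(raw, ("utf-8", "cp437", "shift_jis"))
```

Successful decoding shows only that the bytes are legal in some encoding, not that the encoding is the one the archiver used.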

Aaargh.  Just noticed the Japanese in test_zipfile.py.  Will change it to use \u escapes soon.
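
For reference, the kind of change meant here, sketched with a made-up test string rather than the actual contents of test_zipfile.py: the source file stays pure ASCII while the value is unchanged.

```python
# Hypothetical test data; the real strings in test_zipfile.py may differ.
# "\u65e5\u672c\u8a9e" is the three-character word for "Japanese (language)".
escaped_name = "\u65e5\u672c\u8a9e.txt"

# The escapes are resolved when the source is compiled, so the string
# holds three non-ASCII characters plus the ASCII suffix.
name_length = len(escaped_name)
round_trip = escaped_name.encode("shift_jis").decode("shift_jis")
```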

----------

_______________________________________
Python tracker <report at bugs.python.org>
<http://bugs.python.org/issue28080>
_______________________________________