[Python-Dev] PEP 383: Non-decodable Bytes in System Character Interfaces

"Martin v. Löwis" martin at v.loewis.de
Tue Apr 28 08:50:02 CEST 2009


James Y Knight wrote:
> Hopefully it can be assumed that your locale encoding really is a
> non-overlapping superset of ASCII, as is required by POSIX...

Can you please point to the part of the POSIX spec that says that
such overlapping is forbidden?

> I'm a bit scared at the prospect that U+DCAF could turn into "/", that
> just screams security vulnerability to me.  So I'd like to propose that
> only 0x80-0xFF <-> U+DC80-U+DCFF should ever be allowed to be
> encoded/decoded via the error handler.

It would actually be U+DC2F that would turn into "/" (byte 0x2F),
not U+DCAF. I'm happy to exclude that range from the mapping if POSIX
really requires that an encoding not overlap with ASCII.
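
To make the concern concrete, here is a minimal sketch. It assumes a Python
version that ships the `surrogateescape` error handler (3.1 or later), which
may differ from the draft under discussion. The sample bytes and the helper
name `encodes_to_slash` are made up for illustration. It shows the restricted
mapping proposed above, where only 0x80-0xFF <-> U+DC80-U+DCFF round-trips
and U+DC2F is never turned into "/".

```python
# Assumption: the "surrogateescape" error handler is available (Python 3.1+).
raw = b"caf\xaf"  # hypothetical sample: 0xAF is not valid ASCII

# Undecodable byte 0xAF is smuggled through as the lone surrogate U+DCAF.
text = raw.decode("ascii", errors="surrogateescape")
assert text == "caf\udcaf"

# Encoding with the same handler restores the original bytes.
assert text.encode("ascii", errors="surrogateescape") == raw


def encodes_to_slash(s):
    """Return True if s encodes to the single byte b"/" (hypothetical helper)."""
    try:
        return s.encode("ascii", errors="surrogateescape") == b"/"
    except UnicodeEncodeError:
        # The handler rejects surrogates outside U+DC80-U+DCFF.
        return False


print(encodes_to_slash("/"))       # a real slash
print(encodes_to_slash("\udc2f"))  # a surrogate below U+DC80
```

With the mapping restricted this way, a surrogate can only ever stand in for
a non-ASCII byte, so it cannot introduce a path separator on encoding.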

Regards,
Martin

