[Python-Dev] PEP 383: Non-decodable Bytes in System Character Interfaces

Antoine Pitrou solipsis at pitrou.net
Tue Apr 28 15:03:46 CEST 2009


Thomas Breuel <tmbdev <at> gmail.com> writes:
> 
> How can you bring up practical problems against something that hasn't been
> implemented?

The PEP is simple enough that you can simulate its effect by manually computing
the resulting unicode string for a hypothetical broken filename. Several people
have already done so in this thread.
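
For illustration, here is a minimal sketch of such a simulation. It assumes the
"surrogateescape" error handler that Python 3.1 and later provide for this PEP;
the filename bytes are a made-up example, not taken from a real system.

```python
# Hypothetical broken filename: valid UTF-8 for "caf\u00e9-" followed by a
# stray 0xFF byte, which can never appear in well-formed UTF-8.
raw = b"caf\xc3\xa9-\xff.txt"

# Each undecodable byte 0xNN is mapped to the lone surrogate U+DC00 + 0xNN,
# so 0xFF becomes U+DCFF. The valid part decodes normally.
name = raw.decode("utf-8", errors="surrogateescape")

# Encoding with the same error handler restores the original bytes exactly.
roundtrip = name.encode("utf-8", errors="surrogateescape")

print(ascii(name))
print(roundtrip == raw)
```

Only the broken byte is escaped; the rest of the name stays readable text.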

> The fact that no other language or library does this is perhaps an indication
> that it isn't the right thing to do.

According to some messages, it seems Java and Mono actually use this kind of
workaround, though I haven't checked (I don't use those platforms).

> But the biggest problem with the proposal is that it isn't needed: if you want
> to be able to turn arbitrary byte sequences into unicode strings and back, just
> set your encoding to iso8859-15.  That already works

That doesn't work at all. With your proposal, any non-ASCII filename will be
unreadable, not only the broken ones.
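
A small sketch of what I mean, assuming a filesystem whose names are stored as
UTF-8 (the filename is a made-up example):

```python
# A perfectly valid UTF-8 filename containing one non-ASCII character.
raw = "caf\u00e9.txt".encode("utf-8")  # b'caf\xc3\xa9.txt'

# Decoding it as ISO-8859-15 never fails, but every byte becomes its own
# character, so the two-byte sequence for U+00E9 turns into two characters.
name = raw.decode("iso8859-15")

# The bytes do round-trip, which is the only property the workaround offers.
roundtrip = name.encode("iso8859-15")

print(ascii(name))
print(roundtrip == raw)
```

The round-trip succeeds, but the string shown to the user is garbage even
though the filename was never broken.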

Antoine.



