[Python-Dev] PEP 383: Non-decodable Bytes in System Character Interfaces

Tue Apr 28 02:48:09 CEST 2009

On 27Apr2009 23:27, Simon Cross <hodgestar+pythondev at gmail.com> wrote:
| On Mon, Apr 27, 2009 at 9:48 PM, "Martin v. Löwis" <martin at v.loewis.de> wrote:
| > As Cameron says: it's out of the scope of the PEP. It really depends how
| > the operating system deals with them. Most likely, the files are not
| > accessible - not only not from Python, but also not accessible from
| > any other Unix program. Details depend on the specific operating system
| > software being used, and the specific parameters passed to it.
| 
| $ touch $'\xFF\xAA\xFF'
| $ vi $'\xFF\xAA\xFF'
| $ egrep foo $'\xFF\xAA\xFF'
| 
| All worked fine from my Bash shell with locale encoding set to UTF-8.
| I can also open the created file from the GNOME editor file dialog (it
| even tells me the filename is not valid in my locale's encoding). The
| Nedit editor also worked. So far I haven't found anything that failed.

Yes, they would. Are you doing that on a real UNIX filesystem
(ext2/3/4, XFS etc)?

I'm not sure whether you're arguing for or against the proposal here,
btw.

Your touch command makes a file whose name is not valid UTF-8 (the byte
0xFF never occurs in well-formed UTF-8). Martin's proposal would
cheerfully map that name losslessly to a string. Is there a problem here?
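
For concreteness, here is a minimal sketch of the lossless mapping, not
taken from the PEP text itself. It assumes a Python that ships the
"surrogateescape" error handler (3.1 or later) and uses the UTF-8 codec
explicitly rather than whatever the locale supplies; the variable names
are purely illustrative.

```python
# The same three bytes as the touch example above.
raw_name = b"\xff\xaa\xff"

# A strict UTF-8 decode rejects the name outright.
try:
    raw_name.decode("utf-8")
    strict_ok = True
except UnicodeDecodeError:
    strict_ok = False

# With surrogateescape, each undecodable byte 0xNN becomes the lone
# surrogate U+DCNN, so nothing is lost.
as_text = raw_name.decode("utf-8", errors="surrogateescape")

# Encoding with the same handler recovers the original bytes exactly.
round_tripped = as_text.encode("utf-8", errors="surrogateescape")

print(strict_ok, ascii(as_text), round_tripped == raw_name)
```

The string form is not printable as-is on most terminals, hence ascii()
above, but it can be passed back to the OS and will name the same file.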
-- 
Cameron Simpson <cs at zip.com.au> DoD#743
http://www.cskk.ezoshosting.com/cs/

Stepwise Refinement n.  A sequence of kludges K, neither distinct or finite,
applied to a program P aimed at transforming it into the target program Q.