[Python-Dev] Bytes path support

Marko Rauhamaa marko at pacujo.net
Thu Aug 21 15:58:03 CEST 2014


"Martin v. Löwis" <martin at v.loewis.de>:

> I think the people defending the "Unix file names are just bytes" side
> often miss an important detail: displaying file names to the user, and
> allowing the user to enter file names.

The user interface is a real issue and needs to be addressed. It is
separate from the OS interface, though.

> A script that just needs to traverse a directory tree and look at
> files by certain criteria can easily do so with not worrying about a
> text interpretation of the file names.

A single system often has file names that have been encoded with
different schemes. Only today, I have had to deal with the JIS character
table (<URL:
http://i.msdn.microsoft.com/cc305152.932%28en-us,MSDN.10%29.gif>) -- you
will notice that it doesn't have a backslash character. A coworker uses
ISO-8859-1.

I use UTF-8. A strict UTF-8 decoder, of course, will reject some byte
sequences outright.
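
To make that concrete, here is a minimal sketch. The byte string is a
hypothetical file name invented for illustration, not one taken from any
real system: the same bytes are perfectly good ISO-8859-1 but are
rejected by a strict UTF-8 decoder.

```python
# Hypothetical file name as raw bytes: "café" encoded in ISO-8859-1.
raw_name = b"caf\xe9"

# ISO-8859-1 assigns a code point to every byte value, so this decode
# cannot fail.
latin1_text = raw_name.decode("iso-8859-1")

# In UTF-8, 0xE9 is the lead byte of a three-byte sequence. Here nothing
# follows it, so a strict decoder raises UnicodeDecodeError.
try:
    raw_name.decode("utf-8")
    utf8_ok = True
except UnicodeDecodeError:
    utf8_ok = False

print(latin1_text, utf8_ok)
```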

My point is that the poor programmer cannot ignore the possibility of
"funny" character sets. If Python tried to protect the programmer from
that possibility, the result might be even more intractable: how to act
on a file with a non-UTF-8 filename if you are unable to express it as
a text string?
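
For reference, one mechanism Python 3 already offers here is the
"surrogateescape" error handler (PEP 383), which smuggles undecodable
bytes through a str as lone surrogates. The sketch below is only an
illustration with a hypothetical file name. It shows that the bytes
survive a round trip, and also that the resulting string still cannot
be encoded strictly, so the display problem remains.

```python
# Hypothetical non-UTF-8 file name ("café" in ISO-8859-1).
raw_name = b"caf\xe9"

# Decode as UTF-8, mapping each undecodable byte 0xNN to the lone
# surrogate U+DCNN instead of raising an error.
text_name = raw_name.decode("utf-8", "surrogateescape")

# Encoding with the same handler restores the original bytes exactly,
# so the name can still be handed back to the OS.
round_trip = text_name.encode("utf-8", "surrogateescape")

# A strict encode (as needed to print or store the name as real text)
# fails, because lone surrogates are not valid in UTF-8.
try:
    text_name.encode("utf-8")
    strictly_encodable = True
except UnicodeEncodeError:
    strictly_encodable = False
```

Whether that counts as "expressing it as a text string" is exactly the
question: the str is usable as an opaque handle, but it is not text
that can be shown to a user as-is.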


Marko

