[Python-Dev] Python-3.0, unicode, and os.environ
Steven D'Aprano
steve at pearwood.info
Sat Dec 6 03:06:40 CET 2008
On Sat, 6 Dec 2008 11:48:27 am Nick Coghlan wrote:
> Toshio Kuratomi wrote:
> > Nick Coghlan wrote:
...
> >> Why? Most programs won't be able to do anything with it. And if
> >> the program *can* do something with it... that's what the bytes
> >> version of the APIs are for.
> >
> > Nonsense. A program can do tons of things with a non-decodable
> > filename. Where it's limited is non-decodable filedata.
>
> You can't display a non-decodable filename to the user, hence the
> user will have no idea what they're working on. Non-filesystem
> related apps have no business trying to deal with insane filenames.
I don't agree. Putting my user's hat on, I know what I would expect: the
app should display *some* name, it doesn't matter exactly what, so long
as:
* it's as close as possible to the "real" name;
* it is unique in that directory (doesn't shadow another file); and
* it's enough to identify the file so I can read/save/delete/rename the
file.
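To make those three requirements concrete, here is a rough sketch of how an app might build display names for a directory listing. It is only an illustration, not anything the stdlib does for you: the function name display_names is made up, it assumes UTF-8 as the expected encoding, and it relies on the "backslashreplace" error handler working for decoding, which needs a modern Python 3 (3.5 or later).

```python
def display_names(raw_names, encoding="utf-8"):
    """Map a unique, printable display name to each raw bytes filename.

    Hypothetical helper: the name and the " (n)" suffix scheme are
    illustrative choices, not an established convention.
    """
    shown = {}  # display name -> raw bytes name
    for raw in raw_names:
        # Undecodable bytes become visible escapes such as \xff, so the
        # result stays as close as possible to the "real" name.
        name = raw.decode(encoding, errors="backslashreplace")
        # Control characters and other unprintables get escaped as well.
        name = "".join(
            ch if ch.isprintable() else ch.encode("unicode_escape").decode("ascii")
            for ch in name
        )
        # Never let one display name shadow another file.
        candidate, n = name, 1
        while candidate in shown:
            n += 1
            candidate = f"{name} ({n})"
        shown[candidate] = raw
    return shown
```

The returned dict keeps the raw bytes next to each display name, so whatever the user picks can still be read, saved, deleted or renamed by its exact on-disk name.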
I think there are analogous situations: long-time Windows users will be
used to seeing files listed as "longfilename.txt" in some applications
and "longfi~1.txt" in others. Under POSIX, file names can contain
unprintable control characters, and the shell will print them in at
least three ways, depending on context. E.g. for a file whose name
contains a formfeed, I get one of ? \f or ^L in bash.
Applications can deal with such weird file names. KDE's file manager
(Konqueror) and file selection dialog both show the character as a
small square, presumably the font's missing-character glyph, and KDE
apps can open and save the file. Still speaking as a user, I think it
is quite reasonable to expect applications to deal with undisplayable
filenames: displaying the name and opening the file are orthogonal
concepts, although I accept that command-line interfaces will have
difficulty with file names that can't be typed by the user!
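As a rough illustration of that orthogonality, the sketch below lists a directory as bytes, builds something printable to show the user, and opens the file by its raw bytes name. The helper names list_for_display and read_entry are invented for this example, and substituting U+FFFD for anything undisplayable is just one possible choice (much like the small square KDE shows).

```python
import os


def list_for_display(directory):
    """Return (display_name, raw_name) pairs for the entries in a directory."""
    entries = []
    # Passing a bytes path makes os.listdir return bytes names untouched.
    for raw in os.listdir(os.fsencode(directory)):
        shown = raw.decode("utf-8", errors="replace")
        # Swap control characters for a placeholder the UI can render.
        shown = "".join(ch if ch.isprintable() else "\ufffd" for ch in shown)
        entries.append((shown, raw))
    return entries


def read_entry(directory, raw_name):
    """Open the file by its exact bytes name, however it was displayed."""
    path = os.path.join(os.fsencode(directory), raw_name)
    with open(path, "rb") as f:
        return f.read()
```

The display name is only ever shown; every file operation goes through the raw name, so a lossy or ugly display string never gets in the way of opening the file.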
I appreciate that broken Unicode is more difficult to deal with than
unprintable control characters, but the basic principle is the same.
--
Steven