[Python-Dev] [Python-3000] Proposed Python 3.0 schedule
James Y Knight
foom at fuhm.net
Wed Oct 8 00:22:13 CEST 2008
On Oct 7, 2008, at 4:45 PM, Adam Olsen wrote:
> So what does Qt do when given a file name already using those PUA?
> Looks like they get passed through untouched when decoded, but will
> get translated into invalid names upon encoding.
Well, I'd say that looks like a bug. It should probably decode those
PUA characters as if they were undecodable sequences so that they too
round-trip properly.
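A minimal sketch of that idea in Python, to make the round-trip argument concrete. The U+F700 base, the escape range, and the function names are all hypothetical and chosen for illustration; this is not a description of Qt's actual mapping. Undecodable bytes are mapped into a private-use range, and a genuine character that already lives in that range is treated as if its own UTF-8 bytes were undecodable, so encoding always restores the original bytes.

```python
# Hypothetical escape scheme (an assumption, not Qt's real mapping):
# byte 0xNN that cannot be decoded is represented as U+F7NN.
ESCAPE_BASE = 0xF700
ESC_LO, ESC_HI = 0xF780, 0xF7FF  # only bytes >= 0x80 can be undecodable


def _escape_bytes(raw: bytes) -> str:
    """Represent each raw byte as one private-use character."""
    return "".join(chr(ESCAPE_BASE + b) for b in raw)


def _protect(text: str) -> str:
    """Escape genuine characters that collide with the escape range."""
    return "".join(
        _escape_bytes(ch.encode("utf-8")) if ESC_LO <= ord(ch) <= ESC_HI else ch
        for ch in text
    )


def decode_filename(raw: bytes) -> str:
    """Decode UTF-8, escaping invalid bytes instead of failing."""
    out = []
    pos = 0
    while pos < len(raw):
        try:
            out.append(_protect(raw[pos:].decode("utf-8")))
            break
        except UnicodeDecodeError as exc:
            # exc.start and exc.end are offsets into raw[pos:]
            out.append(_protect(raw[pos:pos + exc.start].decode("utf-8")))
            out.append(_escape_bytes(raw[pos + exc.start:pos + exc.end]))
            pos += exc.end
    return "".join(out)


def encode_filename(name: str) -> bytes:
    """Inverse of decode_filename: escaped characters become raw bytes."""
    out = bytearray()
    for ch in name:
        cp = ord(ch)
        if ESC_LO <= cp <= ESC_HI:
            out.append(cp - ESCAPE_BASE)
        else:
            out += ch.encode("utf-8")
    return bytes(out)


latin1_name = b"caf\xe9.pdf"                     # not valid UTF-8
pua_name = "\uf7e9.pdf".encode("utf-8")          # valid UTF-8, collides with the range
for raw in (latin1_name, pua_name, "caf\u00e9.pdf".encode("utf-8")):
    assert encode_filename(decode_filename(raw)) == raw
```

With this, a name that already contains one of the escape characters is no longer passed through untouched on decode and mangled on encode; it costs three escape characters in the decoded string but comes back byte-for-byte.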
> So you still have
> file names you can't open
In practical terms, I suspect nobody has ever run into a file which
has this problem. You certainly can't say that is the case for
Python 3's current behavior; my suspicion is that anyone who uses any
non-ASCII filenames at all will run into issues with Python 3's
behavior at least once.
> , and you're incompatible with what other
> libraries do.
I'm sure there's a situation where that matters, but at least I can
run kpdf /any/arbitrary/file.pdf and have it work. And use the KDE
file chooser, and have it able to browse my files, and choose any
file, no matter what random characters it has in it. If there is an
issue with interfacing to another library, the string can be converted
to whatever the other library expects at the interface point...
People keep claiming that odd filenames are only going to be an issue
for "backup tools", but I don't think that's true. I think it'll be an
issue for most any program that reads user-specified files. Whether it
be by running Python in an ASCII (e.g. "C") locale when there are
files created with UTF-8 names, or by having copied/downloaded a file
with an incorrectly encoded name, it's going to come up, and be an
irritant when it does.
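Both cases are easy to reproduce with a strict decode. The sketch below is only an illustration; the helper name and the sample filename are made up, and it checks decodability directly rather than going through any particular filesystem API.

```python
def can_decode(raw_name: bytes, encoding: str) -> bool:
    """Return True if raw_name decodes cleanly under the given encoding."""
    try:
        raw_name.decode(encoding, errors="strict")
    except UnicodeDecodeError:
        return False
    return True


# The same visible name, stored on disk under two different encodings.
utf8_name = "r\u00e9sum\u00e9.pdf".encode("utf-8")
latin1_name = "r\u00e9sum\u00e9.pdf".encode("latin-1")

# Case 1: UTF-8 name, program running in an ASCII ("C") locale.
print(can_decode(utf8_name, "ascii"))    # False

# Case 2: Latin-1 name copied onto a system that expects UTF-8.
print(can_decode(latin1_name, "utf-8"))  # False

# Only the matching pair works.
print(can_decode(utf8_name, "utf-8"))    # True
```

Neither failing name is exotic: one is an ordinary UTF-8 name seen from the wrong locale, the other an ordinary Latin-1 name seen from a UTF-8 system.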
That Qt felt the need to make this change rather strengthens that
point IMO...
> The only thing going for Qt is that they seem specifically interested
> in latin-1, rather than arbitrary bad names. The latin-1 strings that
> would correspond to the UTF-8 PUA used would include at least one
> control character, as well as other unusual bits, so it's pretty
> unlikely to encounter a real latin-1 file name like that.
I'd say they're most concerned about files that their users are likely
to run into, yes.
James