[Python-3000] [Python-Dev] Filename as byte string in python 2.6 or 3.0?
James Y Knight
foom at fuhm.net
Tue Sep 30 20:25:54 CEST 2008
On Sep 30, 2008, at 12:57 PM, Guido van Rossum wrote:
>> And again: if utf-8b isn't acceptable, because it does break things in some
>> unknown-to-me way, I really can't imagine anything working but just going
>> back to byte-string access as the only API. It's really not okay for the
>> "obvious" APIs to be totally broken by unexpected input. Think os.getcwd(),
>> sys.argv, os.environ. You can't just ignore bad files and call it done.
>
> Actually that is what you *have* to do with the
> filesystem-as-a-black-box model. Filesystems reserve the right to fail
> occasionally and there's nothing you can do to prevent it -- it would
> be unacceptable if the entire disk would stop working because it had
> one bad block (unless the bad block is in some kind of master table)
> so you just have to deal with it, and you can't wish the problems away
> by insisting on a perfect abstraction.
What I meant is that ignoring certain files is not nearly good enough to
solve the problem.
python3 -c "import sys; print(sys.argv)" "$(echo -e 'filename\x90\x90')"
-> python3 fails to start.

cd "$(echo -e 'dir\x90')" # Assume said dir exists
python3 -> python3 fails to start.

PATH="$PATH:$(echo -e '/home/user/dir\x90')"
python3 -c "import os; print(os.environ['PATH'])" -> nope, no PATH.
Those aren't good behaviors, and can't be solved simply by pretending
certain files don't exist.
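
To illustrate why skipping is not enough, here is a minimal sketch in Python 3 syntax. The byte strings and the helper names decode_or_skip and decode_single are hypothetical, made up for this example, and UTF-8 is assumed as the filesystem encoding. Skipping works, after a fashion, for a directory listing, but a single value such as the cwd or one argv entry has nothing left to fall back to.

```python
# Hypothetical raw values, as the OS would hand them over (bytes, not text).
raw_names = [b"notes.txt", b"filename\x90\x90"]


def decode_or_skip(names, encoding="utf-8"):
    """Decode each name strictly, silently dropping the ones that fail."""
    decoded = []
    for raw in names:
        try:
            decoded.append(raw.decode(encoding))
        except UnicodeDecodeError:
            pass  # "ignore the bad file"
    return decoded


def decode_single(raw, encoding="utf-8"):
    """Decode one value strictly; return None when it cannot be decoded."""
    try:
        return raw.decode(encoding)
    except UnicodeDecodeError:
        return None


# In a listing, the undecodable name simply vanishes.
visible = decode_or_skip(raw_names)

# For a single value there is nothing else to return.
cwd = decode_single(b"/home/user/dir\x90")
```

Here visible contains only the decodable name, and cwd comes back as None: the program is left with no working directory at all, rather than with one fewer file.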
But please see the U+0000-escape alternative proposed by Marcin. Unlike
utf-8b, it doesn't depend upon non-standard Unicode, so maybe there
won't be as much opposition to it.
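
For reference, a rough sketch of the lossless round trip that a utf-8b style scheme is meant to provide. It uses the "surrogateescape" error handler from later Python 3 releases as a stand-in, which is an assumption made for illustration and not necessarily identical to the proposal discussed here. The sample byte string is hypothetical. Each undecodable byte becomes a lone surrogate code point, which is exactly the "non-standard Unicode" part.

```python
# Hypothetical undecodable filename, as raw bytes from the OS.
raw = b"filename\x90\x90"

# Each invalid byte 0xNN is mapped to the lone surrogate U+DCNN
# instead of raising UnicodeDecodeError.
text = raw.decode("utf-8", errors="surrogateescape")

# Encoding with the same handler turns the surrogates back into
# the original bytes, so nothing is lost.
round_tripped = text.encode("utf-8", errors="surrogateescape")

# A strict encode refuses the lone surrogates, which is the price
# of smuggling arbitrary bytes through a text string.
try:
    text.encode("utf-8")
    strict_ok = True
except UnicodeEncodeError:
    strict_ok = False
```

The string can be passed around and handed back to the OS unchanged, but it is not valid Unicode text, so strict encoders reject it.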
James