[Python-3000] [Python-Dev] Filename as byte string in python 2.6 or 3.0?

James Y Knight foom at fuhm.net
Tue Sep 30 20:25:54 CEST 2008


On Sep 30, 2008, at 12:57 PM, Guido van Rossum wrote:

>> And again: if utf-8b isn't acceptable, because it does break things in some
>> unknown-to-me way, I really can't imagine anything working but just going
>> back to byte-string access as the only API. It's really not okay for the
>> "obvious" APIs to be totally broken by unexpected input. Think os.getcwd(),
>> sys.argv, os.environ. You can't just ignore bad files and call it done.
>
> Actually that is what you *have* to do with the
> filesystem-as-a-black-box model. Filesystems reserve the right to fail
> occasionally and there's nothing you can do to prevent it -- it would
> be unacceptable if the entire disk would stop working because it had
> one bad block (unless the bad block is in some kind of master table)
> so you just have to deal with it, and you can't wish the problems away
> by insisting on a perfect abstraction.

What I meant is that ignoring certain files is not nearly good enough to
solve the problem.

python3 -c "import sys; print(sys.argv)" "$(echo -e 'filename\x90\x90')"
-> python3 fails to start.

cd "$(echo -e 'dir\x90')" # Assume said dir exists
python3 -> python3 fails to start.

PATH="$PATH:$(echo -e '/home/user/dir\x90')"
python3 -c "import os; print(os.environ['PATH'])" -> nope, no PATH.

Those aren't good behaviors, and can't be solved simply by pretending  
certain files don't exist.
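
To illustrate why skipping is lossy, here is a minimal Python 3 sketch
(the byte strings are made-up examples, not taken from a real system). A
strict UTF-8 decode rejects the name outright, and silently dropping the
bad bytes makes two distinct names indistinguishable:

```python
# Two hypothetical filenames that differ only in their undecodable bytes.
raw_a = b"filename\x90\x90"
raw_b = b"filename\x91"

# Strict decoding fails: 0x90 is a continuation byte with no lead byte.
try:
    raw_a.decode("utf-8")
    strict_ok = True
except UnicodeDecodeError:
    strict_ok = False

# Dropping the bad bytes "works", but both names collapse to the same string,
# so the original bytes can no longer be recovered from the decoded name.
lossy_a = raw_a.decode("utf-8", errors="ignore")
lossy_b = raw_b.decode("utf-8", errors="ignore")
print(strict_ok, lossy_a == lossy_b)
```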

But please see the U+0000-escape alternative proposed by Marcin. Unlike
utf-8b, it doesn't depend upon non-standard Unicode, so maybe there
won't be as much opposition to it.
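
For reference, a minimal sketch of the utf-8b idea, assuming a later
Python 3 (3.1 or newer) where the "surrogateescape" error handler is
available. That handler is used here as a stand-in for utf-8b and did not
exist when this message was written. Each undecodable byte 0xNN is mapped
to the lone surrogate U+DCNN, so the original bytes can be recovered
exactly:

```python
# Hypothetical directory name containing one byte that is invalid UTF-8.
raw = b"dir\x90"

# Decode: the stray byte 0x90 becomes the lone surrogate U+DC90.
name = raw.decode("utf-8", errors="surrogateescape")

# Encode with the same handler: the surrogate turns back into byte 0x90.
roundtrip = name.encode("utf-8", errors="surrogateescape")

print(ascii(name), roundtrip == raw)
```

The decoded string is not valid Unicode for interchange (it contains a
lone surrogate), which is the "non-standard unicode" objection mentioned
above.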

James

