[Python-Dev] Python-3.0, unicode, and os.environ

glyph at divmod.com
Sun Dec 7 08:05:48 CET 2008


On 06:07 am, a.badger at gmail.com wrote:
>Guido van Rossum wrote:
>>On Sat, Dec 6, 2008 at 10:53 AM,  <glyph at divmod.com> wrote:
>
>>>I find it interesting to note that the only users in this discussion
>>>who actually have these problems in real life all have this attitude.

>>For file managers and similar tools I am absolutely 100% in agreement
>>-- that's why the binary APIs are there.

>>Most apps aren't file managers or ftp clients though. The sky is not 
>>falling.

>Most apps aren't file managers or ftp clients but when they interact
>with files (for instance, a file selection dialog) they need to be able
>to show the user all the relevant files.  So on an app-by-app basis the
>need for this is high.

While I tend to agree emphatically with this, the *real* solution here 
is a path-abstraction library.  In separate discussions, the difficulty 
of getting such a thing into the standard library has been discussed, 
due to the wide variety of opinions as to what it should look like (and 
the shocking level of difficulty involved in making such a thing really 
work correctly).

I'd be very happy to talk to you off-list about my ideas for such a 
thing, but I'd rather not resurrect yet another tedious discussion here 
just now :).

>On a code basis, I'd hope that most file
>selection dialogs are pulled out into libraries... but that still
>doesn't help me identify when someone would expect that asking python
>for a list of all files in a directory or a specific set of files in a
>directory should, without warning, return only a subset of them.  In
>what situations is this appropriate behaviour?

If you call listdir(unicode) on a POSIX OS, your program is saying "I
only know how to deal with unicode results from this function, so please
only give me those."  If your program were smart enough to deal with
bytes, it would have asked for bytes, no?  Returning only
filenames which can be properly decoded makes sense.  Otherwise everyone 
needs to learn about this highly confusing issue, even for the simplest 
scripts.
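
To make the skipping behaviour concrete, here is a minimal sketch of the
filtering step on its own.  It assumes UTF-8 as the filesystem encoding,
and decodable_names is a hypothetical helper invented for illustration,
not a Python API; it does not claim to reproduce what any particular
release of listdir does internally.

```python
def decodable_names(raw_names, encoding="utf-8"):
    """Keep only the names that decode cleanly; drop the rest.

    Hypothetical helper: a sketch of "return only what can be decoded".
    """
    decoded = []
    for raw in raw_names:
        try:
            decoded.append(raw.decode(encoding))  # strict decoding
        except UnicodeDecodeError:
            continue  # undecodable name: silently skipped
    return decoded


# Three byte strings as a POSIX directory might hold them; the last one
# contains 0xFF, which is never valid in UTF-8.
raw = [b"ok.txt", b"caf\xc3\xa9.txt", b"bad\xff.txt"]
names = decodable_names(raw)  # ['ok.txt', 'café.txt']
```

The simple script sees only text it can handle, at the cost of not seeing
the third file at all.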

Skipping undecodable values is good enough that it will work 90% of the 
time.  When you need to get to 100%, it won't be impossible - the bytes 
APIs will be there.  In the longer term, hopefully some path abstraction 
will eventually be there too.  We should not wait for a perfectly 
correct path abstraction to arrive before providing the primitives to do 
it yourself, though.
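
For the 100% case, a rough sketch of going through the bytes API might
look like the following.  Passing a bytes path to os.listdir returns
bytes names, so nothing has to be decoded in order to be listed.  The
helper name list_all_names and the choice of UTF-8 with "replace" for the
display string are assumptions made for this example only.

```python
import os
import tempfile


def list_all_names(directory):
    """Return every entry as a (raw_bytes, display_text) pair.

    Hypothetical helper: the raw bytes are what you pass back to the OS,
    the display text is only for showing to a user.
    """
    raw_dir = os.fsencode(directory)  # bytes path in, bytes names out
    entries = []
    for raw in sorted(os.listdir(raw_dir)):
        # "replace" substitutes U+FFFD for any undecodable bytes, so the
        # entry is still shown instead of being dropped.
        entries.append((raw, raw.decode("utf-8", "replace")))
    return entries


with tempfile.TemporaryDirectory() as tmp:
    open(os.path.join(tmp, "notes.txt"), "w").close()
    listing = list_all_names(tmp)
```

Keeping the raw bytes next to the display string is the important part:
the display string may be lossy, so it should never be used to reopen
the file.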

