[Python-Dev] Python-3.0, unicode, and os.environ

Guido van Rossum guido at python.org
Sun Dec 7 22:33:57 CET 2008


On Sun, Dec 7, 2008 at 1:20 PM, Terry Reedy <tjreedy at udel.edu> wrote:
> Toshio Kuratomi wrote:
>
>>  - If this is true, a definition of os.listdir(<type 'str'>) that would
>> better meet programmer expectation would be: "Give me all files in a
>> directory with the output as str type".  The definition of
>> os.listdir(<type 'bytes'>) would be "Give me all files in a directory
>> with the output as bytes type".  Raising an exception when the filenames
>> are undecodable is perfectly reasonable in this situation.
>
> Your examples (snipped) pretty well convince me that there is a use case for
> raising exceptions.  We should move beyond arguing over which one way is
> right.  I think there should be a second argument 'ignorebad=False' to
> ignore undecodable files rather than raise the exception (or 'strict=True'
> to stop and raise exception on non-decodable names -- then code is 'if
> strict: raise ...').  I believe other functions have a similar parameter.

If you want the exceptions, just use the bytes API and try to decode
the byte strings using the system encoding.
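
A minimal sketch of that approach, assuming a current Python 3 where
os.fsencode is available (it was added after this message was written).
The names decode_names and listdir_strict are hypothetical helpers, not
part of the os module:

```python
import os
import sys


def decode_names(raw_names, encoding):
    """Decode each bytes name strictly; an undecodable name raises UnicodeDecodeError."""
    return [name.decode(encoding, "strict") for name in raw_names]


def listdir_strict(path):
    """Hypothetical helper: list `path` through the bytes API, then decode."""
    # A bytes argument makes os.listdir return bytes names.
    raw_names = os.listdir(os.fsencode(path))
    return decode_names(raw_names, sys.getfilesystemencoding())
```

A caller who wants the exception opts in by calling listdir_strict(".")
and handling UnicodeDecodeError; everyone else keeps the plain str API.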

My problem with raising exceptions *by default* when an undecodable
name exists is that it may render an app completely useless in a
situation where the developer is no longer around. This happened all
the time with the 2.x Unicode API, where the developer hadn't
anticipated a particular input potentially containing non-ASCII bytes,
and the user fed the application non-ASCII text. Making os.listdir
raise an exception when a directory contains a single undecodable file
means that the entire directory can't be read, and the entire app
probably crashes at that point. Most likely the developer never
anticipated this situation (since in most places it is either
impossible or very unlikely) -- after all, if they had anticipated it
they would have used the bytes API in the first place. (It's worse
because the exception being raised would be UnicodeError -- most
people expect os.listdir to raise OSError, not other errors.)
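
To illustrate the last point, here is a sketch of a caller that only
plans for OSError. Both functions are hypothetical; strict_listdir
stands in for a listing call that decodes strictly, and it takes the raw
names directly so the example does not depend on the filesystem:

```python
def strict_listdir(raw_names, encoding="utf-8"):
    """Stand-in for a listdir that raises on undecodable names."""
    return [name.decode(encoding) for name in raw_names]


def count_entries(raw_names):
    """Hypothetical caller that anticipates only OSError."""
    try:
        return len(strict_listdir(raw_names))
    except OSError:
        # The only failure the author planned for.
        return 0
```

UnicodeDecodeError derives from ValueError, not OSError, so an
undecodable name passes straight through this handler and propagates to
the caller.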

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

