[New-bugs-announce] [issue15478] UnicodeDecodeError on OSError

STINNER Victor report at bugs.python.org
Sat Jul 28 14:49:05 CEST 2012


New submission from STINNER Victor <victor.stinner at gmail.com>:

On Windows, if an OS function fails and the filename is a bytes string that cannot be decoded, Python raises a UnicodeDecodeError instead of an OSError. The problem is that Python decodes the filename to fill the OSError.filename field. See issue #15441 for the initial report.

There are different options to solve this issue:
 - always keep the filename parameter unchanged, so OSError.filename can be a str or a bytes string, depending on the input parameter
 - try to decode the filename from the filesystem encoding, or keep the filename unchanged: OSError.filename is only a bytes string if the filename cannot be decoded
 - don't fill OSError.filename (= None) if the filename cannot be decoded
 - use "surrogateescape", "replace" or "backslashreplace" error handler to decode the filename
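To make the options above concrete, here is a minimal sketch of how each one treats an undecodable name. It is only an illustration: the byte string is a hypothetical filename, UTF-8 stands in for the real Windows code page, decode_or_keep() is a made-up helper, and using "backslashreplace" for decoding assumes Python 3.5 or newer.

```python
# Hypothetical filename: the byte 0xE9 is not valid UTF-8 in this position.
raw = b"caf\xe9.txt"


def decode_or_keep(name, encoding="utf-8"):
    """Second option: return str when decodable, else the bytes unchanged."""
    try:
        return name.decode(encoding)
    except UnicodeDecodeError:
        return name


# Strict decoding is what fails today and hides the original OSError.
strict_failed = False
try:
    raw.decode("utf-8")
except UnicodeDecodeError:
    strict_failed = True

# Fourth option: decode with an error handler instead of failing.
escaped = raw.decode("utf-8", "surrogateescape")       # reversible
replaced = raw.decode("utf-8", "replace")              # lossy (U+FFFD)
backslashed = raw.decode("utf-8", "backslashreplace")  # readable escape

# Only surrogateescape lets the caller recover the original bytes.
round_trip = escaped.encode("utf-8", "surrogateescape")
```

Of the three handlers, only "surrogateescape" preserves enough information to get the original bytes back; "replace" and "backslashreplace" are suitable for display only.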

This issue is specific to Windows: on other platforms, the filename is decoded using the "surrogateescape" error handler, so decoding the filename cannot fail.

I don't know if OSError.filename is only used to display more information to the user, or if it is also used to perform another operation on the file (e.g. os.chmod).

I like solutions keeping the filename unchanged, because they do not lose information, and the user can decide how to handle the undecodable filename.

I don't like the option of trying to decode the filename and keeping it unchanged if decoding fails, because applications will work in most cases, but "crash" when someone comes along with an unusual code page, a special USB key, or a filename with a non-ASCII character.

So the best option is maybe to always keep the bytes filename unchanged.
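If the filename is kept unchanged, code that reads OSError.filename has to accept both str and bytes. A rough sketch of what that could look like for display purposes follows; filename_for_display() is a hypothetical helper, the exception is constructed by hand rather than raised by a real OS call, and decoding with "backslashreplace" assumes Python 3.5 or newer.

```python
import errno


def filename_for_display(exc):
    """Return a printable form of exc.filename, whether str, bytes or None."""
    name = exc.filename
    if name is None:
        return "<unknown>"
    if isinstance(name, bytes):
        # Keep ASCII characters readable and escape every other byte.
        return name.decode("ascii", "backslashreplace")
    return str(name)


# Build an OSError by hand with an undecodable bytes filename.
err = OSError(errno.ENOENT, "No such file or directory", b"caf\xe9.txt")
label = filename_for_display(err)
```

The original bytes stay available in err.filename, so the caller can still pass them to another function such as os.chmod if needed.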

Such a change can no longer be made in Python 3.3: it is too late to test it correctly.

----------
components: Unicode, Windows
messages: 166652
nosy: ezio.melotti, flox, haypo, ishimoto, loewis, tim.golden
priority: normal
severity: normal
status: open
title: UnicodeDecodeError on OSError
versions: Python 3.4

_______________________________________
Python tracker <report at bugs.python.org>
<http://bugs.python.org/issue15478>
_______________________________________

