[New-bugs-announce] [issue12281] bytes.decode('mbcs', 'ignore') does replace undecodable bytes on Windows Vista or later

STINNER Victor report at bugs.python.org
Tue Jun 7 23:48:05 CEST 2011


New submission from STINNER Victor <victor.stinner at haypocalc.com>:

Starting with Python 3.2, the MBCS codec decodes bytes by calling MultiByteToWideChar() with flags=MB_ERR_INVALID_CHARS by default (strict error handler) and with flags=0 for the ignore error handler; it raises a ValueError for any other error handler.
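
As a rough illustration of the handler-to-flags mapping described above, here is a minimal Python sketch. It is not the actual C implementation: the function name mbcs_decode_flags is hypothetical, and only the MB_ERR_INVALID_CHARS value is taken from the Windows SDK headers.

```python
# Value of MB_ERR_INVALID_CHARS as defined in the Windows SDK headers.
MB_ERR_INVALID_CHARS = 0x00000008


def mbcs_decode_flags(errors="strict"):
    """Return the MultiByteToWideChar() flags for an error handler.

    Hypothetical helper mirroring the behaviour described in this report:
    only "strict" and "ignore" are accepted, anything else is rejected.
    """
    if errors == "strict":
        return MB_ERR_INVALID_CHARS
    if errors == "ignore":
        return 0
    raise ValueError("mbcs decoding does not support errors=%r" % errors)
```

With this sketch, mbcs_decode_flags("replace") raises ValueError, which is the case the report proposes to relax.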

The problem is that the meaning of flags=0 changes with the Windows version:

 - up to and including Windows XP, undecodable bytes are ignored
 - on Windows Vista and later, undecodable bytes are *replaced*
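
To make the difference between the two behaviours concrete, the sketch below uses the ASCII codec as a stand-in, since it handles "ignore" and "replace" the same way on every platform. The probe_mbcs_ignore helper is hypothetical. It only calls the mbcs codec on Windows, and what it returns there depends on the Python version, the Windows version and the ANSI code page, so no output is claimed for it.

```python
import sys

# Stand-in input: 0xFF is not a valid ASCII byte.
DATA = b"ab\xffcd"

# "ignore" drops the undecodable byte ...
ignored = DATA.decode("ascii", "ignore")
# ... while "replace" substitutes U+FFFD for it.
replaced = DATA.decode("ascii", "replace")


def probe_mbcs_ignore(data):
    """Decode data with the mbcs codec and the "ignore" handler.

    Hypothetical probe: returns None where the mbcs codec does not exist
    (any platform other than Windows).
    """
    if sys.platform != "win32":
        return None
    return data.decode("mbcs", "ignore")
```

Comparing the length of probe_mbcs_ignore(data) with the number of characters expected from the valid bytes alone is one way to tell whether the bytes were dropped or replaced on a given system.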

We should accept the "replace" error handler with flags=0, at least on Windows Vista and later.

I don't know whether we should accept only "ignore" on Windows <= XP and only "replace" on Windows >= Vista, or whether the difference should simply be documented.
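
The first option could look roughly like the sketch below. This is only one possible reading of the proposal, not a decided design: the function name handler_matching_flags_zero is hypothetical, and it relies on Windows XP reporting major version 5 and Windows Vista reporting major version 6.

```python
def handler_matching_flags_zero(windows_major):
    """Return the error handler whose meaning matches flags=0.

    Hypothetical helper: windows_major is the major Windows version
    number (5 for Windows XP, 6 for Windows Vista).
    """
    if windows_major >= 6:
        # Vista and later: flags=0 replaces undecodable bytes.
        return "replace"
    # XP and earlier: flags=0 ignores undecodable bytes.
    return "ignore"
```

On Windows, the major version can be read from sys.getwindowsversion().major, a function that is not available on other platforms.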

----------
components: Unicode
messages: 137885
nosy: amaury.forgeotdarc, haypo
priority: normal
severity: normal
status: open
title: bytes.decode('mbcs', 'ignore') does replace undecodable bytes on Windows Vista or later
versions: Python 3.2, Python 3.3

_______________________________________
Python tracker <report at bugs.python.org>
<http://bugs.python.org/issue12281>
_______________________________________

