[New-bugs-announce] [issue12281] bytes.decode('mbcs', 'ignore') does replace undecodable bytes on Windows Vista or later
STINNER Victor
report at bugs.python.org
Tue Jun 7 23:48:05 CEST 2011
New submission from STINNER Victor <victor.stinner at haypocalc.com>:
Starting with Python 3.2, the MBCS codec decodes bytes with MultiByteToWideChar(), using flags=MB_ERR_INVALID_CHARS by default (strict error handler) and flags=0 for the ignore error handler; it raises a ValueError for any other error handler.
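As a rough illustration only, the mapping from error handler to flags described above can be sketched in Python. The real logic lives in the C implementation of the codec; the helper name mbcs_decode_flags and the error message are hypothetical. The value of MB_ERR_INVALID_CHARS is the one from the Windows SDK headers.

```python
# Value of MB_ERR_INVALID_CHARS as defined in the Windows SDK headers.
MB_ERR_INVALID_CHARS = 0x00000008


def mbcs_decode_flags(errors="strict"):
    """Return the MultiByteToWideChar() flags for an error handler.

    Hypothetical helper: it only mirrors the behaviour described in this
    report and is not the actual CPython code.
    """
    if errors == "strict":
        # Ask Windows to fail on undecodable bytes.
        return MB_ERR_INVALID_CHARS
    if errors == "ignore":
        # No flags: what happens to undecodable bytes is up to Windows.
        return 0
    # Any other error handler is rejected.
    raise ValueError("mbcs decoder does not support errors=%r" % (errors,))
```

With this sketch, mbcs_decode_flags("replace") raises ValueError, which is the case this report proposes to revisit.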
The problem is that the meaning of flags=0 changes with the Windows version:
- up to and including Windows XP, undecodable bytes are ignored
- on Windows Vista and later, undecodable bytes are *replaced*
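One possible way to observe the difference is a small probe like the one below. It is a sketch under assumptions: the function name probe_mbcs_ignore and the sample bytes are hypothetical, and whether a given byte is undecodable depends on the ANSI code page of the machine, so the sample may need to be adapted. The mbcs codec only exists on Windows, so the probe returns None elsewhere.

```python
import sys


def probe_mbcs_ignore(data=b"a\xffb"):
    """Decode `data` with the mbcs codec and the 'ignore' error handler.

    Returns None on platforms without the mbcs codec. The default sample
    is only an assumption: 0xFF is not undecodable in every ANSI code page.
    """
    if sys.platform != "win32":
        return None
    return data.decode("mbcs", "ignore")


if __name__ == "__main__":
    # repr() makes a dropped byte or a substituted character visible.
    print(repr(probe_mbcs_ignore()))
```

If the bytes are really ignored, the result is shorter than the input; if they are replaced, a substitute character appears in their place.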
We should accept the "replace" error handler with flags=0, at least on Windows Vista and later.
I don't know whether we should accept only "ignore" on Windows <= XP and only "replace" on Windows >= Vista, or whether the difference should simply be documented.
----------
components: Unicode
messages: 137885
nosy: amaury.forgeotdarc, haypo
priority: normal
severity: normal
status: open
title: bytes.decode('mbcs', 'ignore') does replace undecodable bytes on Windows Vista or later
versions: Python 3.2, Python 3.3
_______________________________________
Python tracker <report at bugs.python.org>
<http://bugs.python.org/issue12281>
_______________________________________