[ python-Bugs-850997 ] mbcs encoding ignores errors

SourceForge.net noreply at sourceforge.net
Fri Nov 28 20:24:21 EST 2003


Bugs item #850997, was opened at 2003-11-29 12:24
Message generated for change (Tracker Item Submitted) made by Item Submitter
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=105470&aid=850997&group_id=5470

Category: Windows
Group: None
Status: Open
Resolution: None
Priority: 5
Submitted By: Mark Hammond (mhammond)
Assigned to: Thomas Heller (theller)
Summary: mbcs encoding ignores errors

Initial Comment:
The following snippet:

>>> u'@test-\u5171'.encode("mbcs", "strict")
'@test-?'

should raise a UnicodeError.  The errors parameter is
completely ignored, and the encoder always behaves as
though errors='replace' had been passed.
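
For comparison, here is a minimal sketch of what the three
standard error handlers do on a codec that honours them.  It
uses the "ascii" codec as a stand-in (an assumption made only
so the example runs on any platform, since "mbcs" exists only
on Windows) and Python 3 string syntax.  The helper name
try_encode is hypothetical.

```python
# Illustration only: "ascii" stands in for "mbcs", which is Windows-only.
text = "@test-\u5171"


def try_encode(s, encoding, errors):
    """Return the encoded bytes, or the exception class name on failure."""
    try:
        return s.encode(encoding, errors)
    except UnicodeError as exc:
        return type(exc).__name__


# One entry per error handler mentioned in the report.
results = {
    errors: try_encode(text, "ascii", errors)
    for errors in ("strict", "ignore", "replace")
}

for errors, outcome in results.items():
    print(errors, outcome)
```

With a codec that respects the errors argument, 'strict' raises,
'ignore' drops the character, and 'replace' substitutes '?'.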

Attaching a test case, and the start of a patch.  The
patch has a number of issues:
* I'm not sure which error handlers are considered
'mandatory'.  I have handled 'strict', 'ignore' and
'replace'; however, 'ignore' and 'replace' currently
behave exactly the same (i.e., both replace).
* The Windows functions don't tell us exactly which
character failed in the conversion.  Thus, the
exception I raise implies the first character is the
one that failed.  For the same reason, I have made no
attempt to support error callbacks.
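
One possible way around the missing position information,
sketched here in Python rather than in the C of the codec, is
to round-trip each character and report the first one that
does not survive.  This is not the approach taken in the
attached patch.  The function names first_unencodable and
encode_strict are hypothetical, and "ascii" again stands in
for "mbcs" so the example runs anywhere.  Encoding one
character at a time is slow and is only meant to show the
idea.

```python
def first_unencodable(s, encoding):
    """Return the index of the first character lost in a round trip, or None."""
    for index, ch in enumerate(s):
        # A lossy converter substitutes a replacement such as '?'.
        encoded = ch.encode(encoding, "replace")
        if encoded.decode(encoding, "replace") != ch:
            return index
    return None


def encode_strict(s, encoding):
    """Encode s, raising UnicodeEncodeError at the first lossy character."""
    index = first_unencodable(s, encoding)
    if index is not None:
        raise UnicodeEncodeError(
            encoding, s, index, index + 1,
            "character cannot be represented in this encoding",
        )
    # Every character round-trips, so 'replace' cannot alter anything here.
    return s.encode(encoding, "replace")


if __name__ == "__main__":
    try:
        encode_strict("@test-\u5171", "ascii")
    except UnicodeEncodeError as exc:
        print("failed at index", exc.start)
```

With the position known, the exception can name the real
offending character instead of the first one.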

Comments/guidance appreciated.

----------------------------------------------------------------------
