UnicodeEncodeError in compile

jmfauth wxjmfauth at gmail.com
Wed Jan 11 04:29:26 EST 2012


On 11 jan, 01:56, Terry Reedy <tjre... at udel.edu> wrote:
> On 1/10/2012 8:43 AM, jmfauth wrote:
>
>
>
> > D:\>c:\python32\python.exe
> > Python 3.2.2 (default, Sep  4 2011, 09:51:08) [MSC v.1500 32 bit (Intel)] on win32
> > Type "help", "copyright", "credits" or "license" for more information.
> > >>> '\u5de5'.encode('utf-8')
> > b'\xe5\xb7\xa5'
> > >>> '\u5de5'.encode('mbcs')
> > Traceback (most recent call last):
> >   File "<stdin>", line 1, in <module>
> > UnicodeEncodeError: 'mbcs' codec can't encode characters in position 0--1: invalid character
> > D:\>c:\python27\python.exe
> > Python 2.7.2 (default, Jun 12 2011, 15:08:59) [MSC v.1500 32 bit (Intel)] on win32
> > Type "help", "copyright", "credits" or "license" for more information.
> > >>> u'\u5de5'.encode('utf-8')
> > '\xe5\xb7\xa5'
> > >>> u'\u5de5'.encode('mbcs')
> > '?'
>
> mbcs encodes according to the current codepage. Only the Chinese
> codepage(s) can encode the Chinese char. So the Unicode error is correct
> and 2.7 has a bug in that it is doing "errors='replace'" when it
> supposedly is doing "errors='strict'". The Py3 fix was done in
> http://bugs.python.org/issue850997
> 2.7 was intentionally left alone because of back-compatibility
> considerations. (None of this addresses the OP's question.)

OK, I was not aware of this.
PS: My previous post got lost.
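
For readers without a Windows box at hand, the strict-versus-replace difference described above can be sketched with ordinary codecs. This is only an illustration under stated assumptions: 'mbcs' exists only on Windows, so cp1252 stands in here for a Western "current codepage" and gbk for a Chinese one, and the helper name try_encode is made up for this example. It needs Python 3.

```python
# Assumption: cp1252 and gbk are stand-ins for the "current codepage";
# the real 'mbcs' codec is Windows-only and is not used here.
CHAR = '\u5de5'


def try_encode(text, codec, errors='strict'):
    """Return the encoded bytes, or None if the codec cannot encode the text."""
    try:
        return text.encode(codec, errors)
    except UnicodeEncodeError:
        return None


# UTF-8 can encode every code point, so this always succeeds.
utf8_bytes = try_encode(CHAR, 'utf-8')

# A Western codepage has no slot for the character: strict mode fails,
# which is the kind of result Python 3.2 reports for 'mbcs' in the session above.
strict_western = try_encode(CHAR, 'cp1252')

# With errors='replace' the character is silently turned into '?',
# which is what the 2.7 session shows.
replaced_western = try_encode(CHAR, 'cp1252', 'replace')

# A Chinese codepage can encode the character without loss.
chinese_bytes = try_encode(CHAR, 'gbk')

print(utf8_bytes, strict_western, replaced_western, chinese_bytes)
```

Passing errors='replace' explicitly is also the portable way to ask for the lossy result on purpose, rather than relying on the 2.7 behaviour.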



