Python and encodings drives me crazy

Oliver Andrich oliver.andrich at gmail.com
Mon Jun 20 18:24:27 EDT 2005


2005/6/21, Konstantin Veretennicov <kveretennicov at gmail.com>:
> It does, as long as headline and caption *can* actually be encoded as
> macroman. After you decode headline from utf-8 it will be unicode and
> not all unicode characters can be mapped to macroman:
> 
> >>> u'\u0160'.encode('utf8')
> '\xc5\xa0'
> >>> u'\u0160'.encode('latin2')
> '\xa9'
> >>> u'\u0160'.encode('macroman')
> Traceback (most recent call last):
>   File "<stdin>", line 1, in ?
>   File "D:\python\2.4\lib\encodings\mac_roman.py", line 18, in encode
>     return codecs.charmap_encode(input,errors,encoding_map)
> UnicodeEncodeError: 'charmap' codec can't encode character u'\u0160' in position 0: character maps to <undefined>

Yes, this and the coercion problems Diez mentioned were exactly the
problems I faced. I have now written a little cleanup method that
removes the bad characters from the input, so I think I finally have
macroman-encoded files. I will know for sure once I try to open them
on the Mac. For now I am more or less satisfied: only 3 files are
obviously converted incorrectly, and the other 1000 are fine.
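
For anyone reading this in the archive, here is a minimal sketch of what
such a cleanup step could look like. It is not the actual method, just an
illustration under some assumptions: it is written for Python 3 (where
byte strings and text are separate types, which also avoids the implicit
coercion mentioned above), and the names to_macroman and unmappable_chars
are hypothetical. It relies only on the standard codec error handlers
"ignore" and "replace".

```python
def to_macroman(raw_utf8, errors="ignore"):
    """Decode UTF-8 bytes and re-encode them as Mac Roman.

    errors="ignore" silently drops characters that Mac Roman cannot
    represent; errors="replace" substitutes "?" for each of them.
    (Hypothetical helper, not the cleanup method from this thread.)
    """
    text = raw_utf8.decode("utf-8")          # bytes -> text
    return text.encode("mac_roman", errors)  # text -> Mac Roman bytes


def unmappable_chars(text):
    """Return the set of characters in text that Mac Roman cannot encode."""
    bad = set()
    for ch in text:
        try:
            ch.encode("mac_roman")
        except UnicodeEncodeError:
            bad.add(ch)
    return bad


# u'\u0160' is the character from the traceback above; e-acute does
# exist in Mac Roman, so it survives the conversion.
headline = "\u0160koda caf\u00e9".encode("utf-8")
converted = to_macroman(headline)
marked = to_macroman(headline, errors="replace")
dropped = unmappable_chars(headline.decode("utf-8"))
```

Calling unmappable_chars first on each file would show which characters
are about to be lost, which might help explain why a few files do not
convert cleanly. Using errors="replace" instead of "ignore" leaves a
visible "?" wherever something was dropped.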

Thanks for your hints, tips and so on. Good Night.

Oliver

-- 
Oliver Andrich <oliver.andrich at gmail.com> --- http://fitheach.de/


