[2.5.1] ShiftJIS to Unicode?

skip at pobox.com
Wed Nov 26 19:32:12 EST 2008


    Gilles> ======
    Gilles> m = try.search(the_page)
    Gilles> if m:
    Gilles>     #UnicodeEncodeError: 'charmap' codec can't encode characters in
    Gilles>     #position 49-55: character maps to <undefined>
    Gilles>     title = m.group(1).decode('shift_jis').strip()
    Gilles> ======

    Gilles> Has someone successfully accessed Shift-JIS-encoded Japanese
    Gilles> contents with Python?

Have you verified that the characters in position 49-55 are actually
Shift-JIS characters?  In my experience, problems decoding a source string
in a given character set are usually caused by errors in the source, not by
errors in Python.
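
One way to run that check is to attempt the decode and report the byte
offsets Python complains about. The following is only a sketch, written for
Python 3 (the thread concerns 2.5.1); the helper name find_decode_error and
the sample byte strings are made up for illustration. Note that it covers
the decoding side only, whereas the quoted traceback is a UnicodeEncodeError,
which is raised when text is converted to bytes.

```python
def find_decode_error(data, encoding="shift_jis"):
    """Return None if data decodes cleanly, else (start, end, reason).

    Hypothetical helper for illustration; not part of any library.
    """
    try:
        data.decode(encoding)
    except UnicodeDecodeError as exc:
        # exc.start and exc.end are byte offsets into data
        return exc.start, exc.end, exc.reason
    return None


# "\u65e5\u672c\u8a9e" is the word for "Japanese language" (three kanji)
good = "\u65e5\u672c\u8a9e".encode("shift_jis")
# 0xFF is not a valid Shift-JIS lead byte, so this should fail at offset 3
bad = b"abc\xff"

print(find_decode_error(good))
print(find_decode_error(bad))
```

If the helper returns None for the bytes matched by the regular expression,
the source data is probably fine and the problem lies elsewhere, for example
in a later step that encodes the resulting text for output.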

OTOH, the characters in position 49-55 look like plain old ASCII to me.
Does Shift-JIS have ASCII as a proper subset?
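
On the subset question: Shift-JIS is built on JIS X 0201, whose 7-bit half
matches ASCII except that 0x5C and 0x7E are defined as the yen sign and the
overline. How a particular codec treats those two bytes varies, so it may be
easiest to ask the codec directly. A rough sketch for Python 3, with the
function name ascii_mismatches invented for this example:

```python
def ascii_mismatches(encoding="shift_jis"):
    """List the 7-bit byte values that the given codec decodes
    differently from ASCII (or fails to decode at all).

    Hypothetical helper for illustration only.
    """
    mismatches = []
    for value in range(0x80):
        raw = bytes([value])
        try:
            same = raw.decode(encoding) == raw.decode("ascii")
        except UnicodeDecodeError:
            same = False
        if not same:
            mismatches.append(value)
    return mismatches


# Any values printed here are the bytes where the codec departs from ASCII
print([hex(v) for v in ascii_mismatches()])
```

An empty list would mean the codec treats every 7-bit byte as ASCII; anything
other than 0x5c or 0x7e showing up would be surprising.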

-- 
Skip Montanaro - skip at pobox.com - http://smontanaro.dyndns.org/


