what encoding is this? How can I tell? How can I translate?

Carey Evans careye at spamcop.net
Tue Sep 25 08:31:58 EDT 2001


Skip Montanaro <skip at pobox.com> writes:

[...]

> I can infer that what looks like a
> capital "O" underneath a tilde in XEmacs (ordinal 213, hex 0xd5) is supposed
> to be an apostrophe, so I could do some hack filtering to convert this, but
> a quick scan for "d5" in the Python encodings directory suggests it is
> mac_latin2 (not sure what that is officially).

I would have picked it as being mac-roman, unless it's from somewhere
in Eastern Europe that Latin-2 covers.

The character would be U+2019, "RIGHT SINGLE QUOTATION MARK".  There's
no equivalent to this in latin-1, so the closest would probably be
U+0027, "APOSTROPHE", i.e. "'".
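
As a rough sketch of that translation in present-day Python 3 (not
the Python of this thread), assuming the text really is mac-roman:
decode the bytes, then swap U+2019 for a plain apostrophe.  The
sample bytes and variable names below are made up for illustration.

```python
# Hypothetical sample: the byte 0xD5 stands in for the apostrophe,
# as it would in text typed on a Mac and sent without a charset label.
raw = b"don\xd5t"

# Assumption: the sender's encoding is mac-roman, where 0xD5 is U+2019.
text = raw.decode("mac_roman")

# U+2019 has no Latin-1 equivalent, so fall back to U+0027 (APOSTROPHE).
ascii_text = text.replace("\u2019", "'")

print(repr(text))
print(ascii_text)
```

After the replacement the string contains only ASCII here, so it can
be encoded as Latin-1 without complaint.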

I suspect that it _is_ a bug in Mac OE if it doesn't specify the
character set as being different from the default, US-ASCII.  I can't
quote chapter and verse at the moment, though.

[...]

>     UnicodeError: Latin-1 encoding error: ordinal not in range(256)
> 
> which seemed odd, because the ordinal 213 character is the only character
> above ordinal 127.

Byte 213 (0xd5) in mac-roman decodes to U+2019 in Unicode (8217
decimal), which isn't in range(256), so it can't be encoded as Latin-1.
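
A small check of this, again sketched in present-day Python 3 rather
than the interpreter that produced the error above, and assuming
mac-roman is the right source encoding.  The names are illustrative
only.

```python
# Decode the single byte 213 (0xD5) as mac-roman.
ch = b"\xd5".decode("mac_roman")

# The resulting code point is far above 255.
code_point = ord(ch)

# Latin-1 can only represent code points 0..255, so encoding fails.
try:
    ch.encode("latin-1")
    encodable = True
except UnicodeEncodeError:
    encodable = False

print(hex(code_point), encodable)
```

So a single byte above 127 in the input is enough to trigger the
error once it has been decoded to a code point outside Latin-1.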

-- 
	 Carey Evans  http://home.clear.net.nz/pages/c.evans/

	You think you know... what's to come... what you are.


