[ python-Bugs-1382096 ] MacRoman Encoding Bug (OHM vs. OMEGA)

SourceForge.net noreply at sourceforge.net
Fri Dec 16 03:22:35 CET 2005


Bugs item #1382096, was opened at 2005-12-16 02:22
Message generated for change (Tracker Item Submitted) made by Item Submitter
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=105470&aid=1382096&group_id=5470

Please note that this message will contain a full copy of the comment thread,
including the initial issue submission, for this request,
not just the latest update.

Category: Unicode
Group: Python 2.4
Status: Open
Resolution: None
Priority: 5
Submitted By: Sean B. Palmer (seanbpalmer)
Assigned to: M.-A. Lemburg (lemburg)
Summary: MacRoman Encoding Bug (OHM vs. OMEGA)

Initial Comment:
The file encodings/mac_roman.py in Python 2.4.1
contains the following incorrect character definition
on line 96: 

        0x00bd: 0x2126, # OHM SIGN

This should read: 

        0x00bd: 0x03A9, # GREEK CAPITAL LETTER OMEGA

Presumably this bug occurred due to a misreading, given
that OHM and OMEGA have the same glyph. Evidence that
the OMEGA interpretation is correct: 

0xBD   0x03A9   # GREEK CAPITAL LETTER OMEGA
-http://www.unicode.org/Public/MAPPINGS/VENDORS/APPLE/ROMAN.TXT
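
The following is a small illustrative sketch, not part of the codec, showing that OHM SIGN and GREEK CAPITAL LETTER OMEGA are distinct code points that Unicode treats as canonically equivalent. It assumes only the standard unicodedata module; the variable names are arbitrary.

```python
import unicodedata

ohm = u'\u2126'    # what the 2.4.1 table maps 0xBD to
omega = u'\u03a9'  # what Apple's ROMAN.TXT maps 0xBD to

# The two characters have different names and do not compare equal.
ohm_name = unicodedata.name(ohm)      # 'OHM SIGN'
omega_name = unicodedata.name(omega)  # 'GREEK CAPITAL LETTER OMEGA'
are_equal = (ohm == omega)            # False

# OHM SIGN has a canonical decomposition to OMEGA, so NFC normalization
# turns the former into the latter.
normalizes_to_omega = (unicodedata.normalize('NFC', ohm) == omega)  # True

print(ohm_name, omega_name, are_equal, normalizes_to_omega)
```

Because a codec lookup compares code points exactly, this equivalence does not help an encoder whose table lists only U+2126.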

Further evidence can be found by Googling for MacRoman
tables. This bug means that, for example, the following
code raises a UnicodeEncodeError when it should return '\xbd':

>>> u'\u03a9'.encode('macroman')
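
To check which of the two characters a given installation assigns to byte 0xBD, something like the sketch below may help. The helper name describe_byte is hypothetical, and the result depends on the Python version in use, so no particular output is claimed here.

```python
import unicodedata


def describe_byte(byte_value, codec='mac_roman'):
    """Return (code point, Unicode name) that `codec` decodes one byte to.

    Hypothetical helper for illustration; not part of the encodings package.
    """
    char = bytes(bytearray([byte_value])).decode(codec)
    return ord(char), unicodedata.name(char)


code_point, name = describe_byte(0xBD)
print('0xBD -> U+%04X %s' % (code_point, name))
```

An affected installation would report U+2126 OHM SIGN; a table that follows Apple's ROMAN.TXT would report U+03A9 GREEK CAPITAL LETTER OMEGA.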

For a workaround, I've been using the following code: 

>>> import codecs
>>> from encodings import mac_roman
>>> mac_roman.decoding_map[0xBD] = 0x03A9
>>> mac_roman.encoding_map = codecs.make_encoding_map(mac_roman.decoding_map)

And then, to use the example above: 

>>> u'\u03a9'.encode('macroman')
'\xbd'
>>> 
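
The same idea can be sketched without touching the real module, which may be laid out differently in other Python versions. The code below uses a small stand-in table (ASCII plus the one entry in question) rather than the actual mac_roman data. The helper encode_with is hypothetical, and codecs.charmap_encode is assumed to accept a plain dict as its mapping.

```python
import codecs

# Stand-in for mac_roman.decoding_map: ASCII plus the single entry from the
# report. The real table has many more entries; this one is illustrative only.
decoding_map = codecs.make_identity_dict(range(128))
decoding_map[0xBD] = 0x2126  # OHM SIGN, the mapping reported as incorrect


def encode_with(decoding_map, text):
    """Encode `text` using the inverse of `decoding_map` (hypothetical helper)."""
    encoding_map = codecs.make_encoding_map(decoding_map)
    encoded, _consumed = codecs.charmap_encode(text, 'strict', encoding_map)
    return encoded


# With the OHM SIGN entry, OMEGA has no byte assigned and encoding fails.
try:
    encode_with(decoding_map, u'\u03a9')
    omega_failed_before_fix = False
except UnicodeEncodeError:
    omega_failed_before_fix = True

# Apply the one-entry correction; the encoding map is rebuilt on the next call.
decoding_map[0xBD] = 0x03A9  # GREEK CAPITAL LETTER OMEGA
omega_after_fix = encode_with(decoding_map, u'\u03a9')
```

The encoding map has to be rebuilt after the decoding map changes, because make_encoding_map returns a separate inverted copy rather than a live view.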

Thanks,

-- 
Sean B. Palmer


----------------------------------------------------------------------
