Unicode problem

Erik Max Francis max at alcyone.com
Sat Jul 7 18:21:03 EDT 2007


pabloski at giochinternet.com wrote:

> Hi to all, I have a little problem with unicode handling under Python.
> 
> I have this code
> 
> import codecs
> 
> s = u'A unicode string with this damn apostrophe \u2019'
> 
> outf = codecs.open('filename.txt', 'w', 'iso-8859-15')
> outf.write(s)
> 
> What I get is a UnicodeEncodeError telling me that character U+2019
> maps to undefined.
> 
> But U+2019 is the apostrophe, and in the Unicode table it has U+0027 as
> an equivalent, so the codec should convert U+2019 to 0x27 (as defined
> in iso-8859-15)....

U+2019 is RIGHT SINGLE QUOTATION MARK.  The Unicode charts list 
APOSTROPHE (U+0027) as a cross-reference, meaning a similar code point, 
but they're not the same character.
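
If you want to check this for yourself, here is a quick sketch using the 
standard unicodedata module.  The variable names are just for 
illustration; the point is only that the two characters have different 
names and that the codec rejects the first one.

```python
import unicodedata

# Look up the official Unicode names of the two characters.
name_quote = unicodedata.name(u'\u2019')
name_apos = unicodedata.name(u"'")

# Try to encode U+2019 on its own; ISO-8859-15 has no slot for it.
try:
    u'\u2019'.encode('iso-8859-15')
    encodable = True
except UnicodeEncodeError:
    encodable = False

print(name_quote)
print(name_apos)
print(encodable)
```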

Your problem is that ISO-8859-15 doesn't have the RIGHT SINGLE QUOTATION 
MARK, so you'll have to do the translation yourself if you want to turn 
it into a true APOSTROPHE.
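
One way to do that translation is sketched below, assuming you are happy 
to flatten typographic single quotes to the plain APOSTROPHE.  QUOTE_MAP 
is a made-up name, and which characters go in it is your call; I've 
included U+2018 as well only because it usually shows up alongside 
U+2019.

```python
# Hypothetical mapping from code point to replacement text.
# Extend it with whatever other characters your data contains.
QUOTE_MAP = {
    0x2018: u"'",  # LEFT SINGLE QUOTATION MARK  -> APOSTROPHE
    0x2019: u"'",  # RIGHT SINGLE QUOTATION MARK -> APOSTROPHE
}

s = u'A unicode string with this damn apostrophe \u2019'

# translate() looks up each character's code point in the mapping
# and leaves unmapped characters alone.
cleaned = s.translate(QUOTE_MAP)

# Every remaining character exists in ISO-8859-15, so this succeeds.
encoded = cleaned.encode('iso-8859-15')
```

After this, passing cleaned to outf.write() in your original code should 
go through, since nothing left in the string is outside the target 
encoding.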

-- 
Erik Max Francis && max at alcyone.com && http://www.alcyone.com/max/
  San Jose, CA, USA && 37 20 N 121 53 W && AIM, Y!M erikmaxfrancis
   She glanced at her watch ... It was 9:23.
    -- James Clavell


