Trouble with unicode

Brian Quinlan BrianQ at ActiveState.com
Mon May 14 17:31:11 EDT 2001


> >Hmmmm, are you sure that the characters are Unicode? They look like
> >latin-1 to me...
> They might be. They are e-mails, and Latin-1 is a common code page. I
> guess I think they are Unicode because of their length.

Their length?
 
> >Anyway, I'm assuming that you want to generate ASCII text based on a
> >unicode object and that you simply want to strip characters that are not
> >representable in ASCII. Let me know if these assumptions are not true. If
> >they are, try this:
> >
> >>>> from codecs import lookup
> >>>> toASCII = lookup( 'ascii' )[0]
> >>>> toASCII( u'123\555' )
> >>>> toASCII( u'123\555', 'replace' )
> >('123?', 4)
> >
> >>> from codecs import lookup
> >>> toASCII = lookup ("ascii")[0]
> >>> toASCII(u"123\555")
> Traceback (most recent call last):
>   File "<stdin>", line 1, in ?
> UnicodeError: ASCII encoding error: ordinal not in range(128)
>
> is as far as I get. ... whether there is a bug in the BeOS version of a
> module? I don't think I made a typing mistake.

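Oops :-) Change

toASCII(u"123\555")

to

toASCII(u"123\555", "replace")

Without the second argument the encoder uses the default 'strict' error
handling, which raises UnicodeError on any character that is not in ASCII.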
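
For anyone trying this on a current interpreter, here is a minimal sketch of
the same idea. It assumes Python 3, where the encoder returns bytes rather
than a str, and it spells the character as \u016d (the same code point as the
octal escape \555) because newer versions warn about large octal escapes. The
names to_ascii, text, replaced and ignored are mine, not from the thread.

```python
from codecs import lookup

# lookup() returns a codec record; element 0 is the encoder function.
to_ascii = lookup("ascii")[0]

# Same character as the octal escape \555 (0o555 == 0x16D).
text = "123\u016d"

# With no errors argument the encoder is 'strict' and raises on the
# first character that has no ASCII representation.
try:
    to_ascii(text)
    strict_failed = False
except UnicodeError:
    strict_failed = True

# 'replace' substitutes '?' for each unencodable character;
# 'ignore' drops such characters entirely.
replaced, consumed = to_ascii(text, "replace")
ignored, _ = to_ascii(text, "ignore")

print(strict_failed, replaced, consumed, ignored)
```

If the goal is really to strip the characters rather than mark them, 'ignore'
is probably closer to what was asked for than 'replace'.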




