usage of <string>.encode('utf-8','xmlcharrefreplace')?

7stud bbxx789_05ss at yahoo.com
Tue Feb 19 01:54:44 EST 2008


To clarify a couple of points:

On Feb 18, 11:38 pm, 7stud <bbxx789_0... at yahoo.com> wrote:
> A unicode string looks like this:
>
> s = u'\u0041'
>
> but your string looks like this:
>
> s = 'he Company\xef\xbf\xbds ticker'
>
> Note that there is no 'u' in front of your string.  
>

That means your string is a regular (byte) string, not a unicode
string.


> If a python function requires a unicode string and a unicode string
> isn't provided..

For example: encode().
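
A small sketch of that case, using the string from the quoted post.
It sticks to syntax that both Python 2.6+ and Python 3.3+ should
accept (an assumption on my part, not something from the original
question), and the explicit decode('ascii') call stands in for the
conversion that Python 2 attempts implicitly when encode() is called
on a regular string:

```python
# The regular (byte) string from the quoted post, written with a b prefix
# so the same literal also works on Python 3.
raw = b'he Company\xef\xbf\xbds ticker'

# encode() needs a unicode string, so Python 2 first tries to decode the
# bytes with the ascii codec; done explicitly here, that step fails.
try:
    raw.decode('ascii')
    ascii_decode_ok = True
except UnicodeDecodeError:
    ascii_decode_ok = False

# Naming the codec the bytes were actually written in gives a unicode string.
text = raw.decode('utf-8')

# Only now is encode() operating on a unicode string.
xml_safe = text.encode('ascii', 'xmlcharrefreplace')

print(ascii_decode_ok)
print(repr(xml_safe))
```

Note that the bytes \xef\xbf\xbd are the utf-8 form of U+FFFD, the
replacement character, so decoding as utf-8 yields that character
rather than an apostrophe.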


One last point: you can't display a unicode string directly.  The very
act of trying to print a unicode string causes it to be converted to a
regular string.  If you try to display a unicode string without
explicitly encode()'ing it first (i.e. converting it to a regular
string using a specified secret code, a so-called 'codec'), Python
will implicitly attempt to convert the unicode string to a regular
string using the default codec, which is usually set to ascii (when
printing to a terminal, Python may use the terminal's encoding
instead).  If the string contains a character that codec cannot
represent, the implicit conversion fails with a UnicodeEncodeError.
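
To make the difference concrete, here is a rough sketch of explicit
encoding versus falling back to ascii.  The sample string is made up
for illustration, and the explicit encode('ascii') call only imitates
what an implicit conversion with an ascii default would attempt:

```python
# Hypothetical sample: a unicode string containing a non-ascii character
# (U+2019, a right single quotation mark).
s = u'The Company\u2019s ticker'

# Explicitly encoding with a codec that can represent every character works.
utf8_bytes = s.encode('utf-8')

# The ascii codec cannot represent U+2019, so this raises an error; it is
# the same failure an implicit conversion with an ascii default runs into.
try:
    s.encode('ascii')
    ascii_ok = True
except UnicodeEncodeError:
    ascii_ok = False

# With 'xmlcharrefreplace', unencodable characters become &#NNNN; references.
xml_safe = s.encode('ascii', 'xmlcharrefreplace')

print(repr(utf8_bytes))
print(ascii_ok)
print(repr(xml_safe))
```

The 'xmlcharrefreplace' handler only applies when encoding, so it has
to be called on a unicode string, as it is here.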



