Unicode -> String problem

Michael Ströder michael at stroeder.com
Tue Jul 10 16:16:08 EDT 2001


Jay Parlar wrote:
> 
> > > Now, whenever I'm given HTML from IE's cache, it is unicode. There is 
> > > no doubt about that.
> >
> > Are you sure? Which encoding of Unicode? UTF-16, UTF-8, ...
> 
> Well, I know that it's Unicode for two reasons:
> 1) The developer
> of the module that generates the HTML from the cache told me so,

Frankly this is not a real reason...

> and 2) I do a check for
> UnicodeType in my own code. I don't think I could tell you
> which encoding it is though (not without knowing how to check
> for that within my code).

I wonder how you can properly initialize a Unicode object
without knowing the encoding. You should take a closer look at the
cached data and not trust what anybody said. In particular, you should
watch out for the HTTP header and <meta> tags.
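
A rough sketch of what I mean, assuming the cache hands you the raw
bytes and (optionally) the Content-Type header value. The function
names and the latin-1 fallback are my own choices for illustration,
not anything from the cache module, and the code is written for
current Python 3, where raw data is bytes and text is str. It will
not find a <meta> tag in UTF-16 data, since it scans the bytes as
ASCII.

```python
import re

# Matches charset=... in a header value or inside a <meta> tag.
_CHARSET_RE = r'charset\s*=\s*["\']?([\w.:-]+)'


def charset_from_content_type(value):
    """Return the charset parameter of a Content-Type value, or None."""
    match = re.search(_CHARSET_RE, value, re.IGNORECASE)
    return match.group(1).lower() if match else None


def sniff_meta_charset(raw):
    """Look for a charset declaration in a <meta> tag near the start of raw."""
    # Only ASCII-compatible encodings can be sniffed this way.
    head = raw[:2048].decode("ascii", errors="replace")
    match = re.search(r"<meta[^>]+" + _CHARSET_RE, head, re.IGNORECASE)
    return match.group(1).lower() if match else None


def decode_html(raw, content_type=None, fallback="latin-1"):
    """Decode raw HTML bytes; the HTTP header wins over the <meta> tag."""
    encoding = None
    if content_type:
        encoding = charset_from_content_type(content_type)
    if encoding is None:
        encoding = sniff_meta_charset(raw)
    if encoding is None:
        encoding = fallback  # a guess, not knowledge
    return raw.decode(encoding), encoding
```

If neither the header nor the <meta> tag names an encoding, the
fallback is only a guess, which is exactly the situation to avoid.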

Then you should get familiar with Unicode handling in Python and the
distinction between strings and Unicode objects by reading the fine
Python docs.
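
To illustrate the distinction, here is a small sketch in current
Python 3 terms, where str is the Unicode type and bytes holds
encoded data (at the time of this thread the corresponding types
were unicode and str). The sample name and the helper function are
just for illustration.

```python
def encoded_lengths(text):
    """Return the number of bytes text occupies in a few common encodings."""
    return {
        "utf-8": len(text.encode("utf-8")),
        "latin-1": len(text.encode("latin-1")),
        "utf-16-le": len(text.encode("utf-16-le")),
    }


name = "Str\u00f6der"          # 7 characters, one of them non-ASCII
sizes = encoded_lengths(name)  # same text, different byte counts

# Decoding with the wrong encoding does not fail here, it silently
# produces different text.
utf8_bytes = name.encode("utf-8")
wrong = utf8_bytes.decode("latin-1")
right = utf8_bytes.decode("utf-8")
```

The Unicode object is the same in every case; only the byte
representation changes, which is why the encoding has to be known
before the bytes can be turned back into text.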

Ciao, Michael.


