Becoming Unicode Aware

Jim Hefferon jhefferon at smcvt.edu
Wed Oct 27 10:44:09 EDT 2004


fuzzyman at gmail.com (Michael Foord) wrote ...
> I'm trying to become 'unicode-aware'... *sigh*. What's that quote - 'a
> native speaker of ascii will never learn to speak unicode like a
> native'. The trouble is I think I've been a native speaker of latin-1
> without realising it.
It *is* odd, IMHO, that my database connector spits out string-like
things holding 8-bit data, so that when I
"".join(array_of_database_strings) them, I get a failure.  I've
learned to convert them to unicode strings by hand, but it is odd.
Something like a pair (encoding, string) seems more natural to me, but
probably I just don't get the issues.
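
To make the (encoding, string) idea concrete, here is a minimal sketch
in present-day Python, where the 8-bit data would be a bytes object. The
rows and the to_text helper are made up for illustration; no real
database connector is assumed to return data in this shape.

```python
# Hypothetical (encoding, raw bytes) pairs standing in for what a
# database connector might hand back; the values are invented.
rows = [
    ("latin-1", b"caf\xe9"),
    ("utf-8", b" na\xc3\xafve"),
]


def to_text(encoding, raw):
    """Decode raw bytes using the encoding that travels with them."""
    return raw.decode(encoding)


# Decode each piece first, then join: every item is now a text string,
# so the join no longer mixes 8-bit data with unicode.
joined = "".join(to_text(enc, raw) for enc, raw in rows)
print(joined)
```

Carrying the encoding next to the bytes means the decode step never has
to guess, which is the appeal of the pair.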
> 
> My main problem with understanding unicode is what to do with
> arbitrary text without an encoding specified. To the best of my
> knowledge the technical term for this situation is 'buggered'. E.g. I
> have a CGI guestbook script. Is the only way of knowing what encoding
> the user is typing in, to ask them?
> 
I found this link
  https://bugzilla.mozilla.org/show_bug.cgi?id=18643#c12
useful.
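
For text that arrives with no encoding specified, one fallback people
often use is to try a strict UTF-8 decode and, if that fails, treat the
bytes as latin-1. The sketch below assumes present-day Python; the
function name guess_decode and its parameters are hypothetical, and the
result is only a guess, since latin-1 accepts any byte sequence and may
simply be the wrong answer.

```python
def guess_decode(raw, candidates=("utf-8",), fallback="latin-1"):
    """Return (text, encoding_used) for bytes of unknown encoding.

    Each candidate is tried strictly, in order. The fallback should be
    an encoding that cannot fail, such as latin-1, so the function
    always returns something, possibly mis-decoded.
    """
    for enc in candidates:
        try:
            return raw.decode(enc), enc
        except UnicodeDecodeError:
            continue
    return raw.decode(fallback), fallback


# Valid UTF-8 is accepted as UTF-8.
print(guess_decode(b"na\xc3\xafve"))
# A lone 0xE9 byte is not valid UTF-8, so the latin-1 fallback is used.
print(guess_decode(b"caf\xe9"))
```

This does not answer the question of what the user really typed; it
only keeps the script from failing on the input.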

Jim


