Dealing with "funny" characters

Diez B. Roggisch deets at nospam.web.de
Mon Oct 22 06:25:54 EDT 2007


>> I doubt that indexing has anything to do with it whatsoever.
> 
>      Of course it does.  ORDER BY, LIKE, TRIM, and other SQL expressions
> that do more than an equal comparison need to know the actual data
> representation. If you were to convert to UTF-8 or UCS-2 in the Python
> program and send the resulting byte string to MySQL, with MySQL thinking
> it was storing ASCII or a BLOB, many SQL functions won't work right.
> A database is not a file system; a database looks at the data.

Garbage in, garbage out. But putting correctly encoded data into it won't
cause any trouble, so "You don't want to convert data to UTF-8 before
putting it in a database; the database indexing won't work." is utter
nonsense.
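
To illustrate the difference between correctly declared and mis-declared
data, here is a minimal sketch in plain Python (no database involved; the
sample string and the variable names are made up for illustration). The same
UTF-8 bytes give the right answer when the consumer is told they are UTF-8,
and garbage when it assumes a different character set:

```python
# Hypothetical sample value; any non-ASCII text would do.
text = "M\u00fcller"                 # 6 characters, one of them non-ASCII
utf8_bytes = text.encode("utf-8")    # 7 bytes: the u-umlaut takes two

# Consumer correctly told that the bytes are UTF-8.
declared_ok = utf8_bytes.decode("utf-8")

# Consumer wrongly assuming Latin-1 for the very same bytes.
declared_wrong = utf8_bytes.decode("latin-1")

print(len(utf8_bytes))        # 7
print(len(declared_ok))       # 6
print(len(declared_wrong))    # 7
print(declared_ok == text)    # True
print(declared_wrong == text) # False
```

The bytes are identical in both cases; only the declared character set
differs. That is the sense in which UTF-8 data is fine as long as the
receiving side knows it is UTF-8.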

>> You confuse unicode with utf-8 here.
> ... pontification deleted

Pontification in contrast to what - your highly informative posts like this?
http://mail.python.org/pipermail/python-list/2007-October/461375.html

I'm sure there are other daily routines your audience here can't wait to be
informed of at regular intervals.

Just because you write nonsense like 
"""
    First, tell MySQL, before you create your MySQL tables, that the tables
    are to be stored in Unicode:

        ALTER database yourdatabasename DEFAULT CHARACTER SET utf8;
 """

confusing unicode with an encoding of it doesn't make me pontificate. 
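
For anyone unsure about the distinction, a short Python sketch (the sample
character is arbitrary and the variable names are only illustrative): Unicode
assigns a code point to the character, while UTF-8, UTF-16 and Latin-1 are
different ways of turning that code point into bytes.

```python
ch = "\u00e9"                       # LATIN SMALL LETTER E WITH ACUTE

code_point = ord(ch)                # the Unicode code point, an integer
as_utf8 = ch.encode("utf-8")        # one possible byte representation
as_utf16 = ch.encode("utf-16-be")   # another one
as_latin1 = ch.encode("latin-1")    # and a third, non-Unicode charset

print(hex(code_point))  # 0xe9
print(as_utf8)          # b'\xc3\xa9'
print(as_utf16)         # b'\x00\xe9'
print(as_latin1)        # b'\xe9'

# Three different byte strings, one and the same character.
same = (as_utf8.decode("utf-8")
        == as_utf16.decode("utf-16-be")
        == as_latin1.decode("latin-1")
        == ch)
print(same)             # True
```

So "stored in Unicode" does not say which bytes end up on disk; "CHARACTER
SET utf8" does, because it names an encoding.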

Diez


