Inserting Unicode text with MySQLdb in Python 2.4-2.5?

Diez B. Roggisch deets at nospam.web.de
Wed Nov 18 09:38:06 EST 2009


Keith Hughitt wrote:

> Hi all,
> 
> I ran into a problem recently when trying to add support for earlier
> versions of Python (2.4 and 2.5) to some database related code which
> uses MySQLdb, and was wondering if anyone has any suggestions.
> 
> With later versions of Python (2.6), inserting Unicode is very simple,
> e.g.:
> 
>     # -*- coding: utf-8 -*-
>     ...
>     cursor.execute('''INSERT INTO `table` VALUES (0, 'Ångström'),...''')

You are aware that the coding-declaration only affects how unicode-literals
(the ones like u"i'm unicode") are decoded? So the above insert-statement is
*not* unicode; it's a byte-string in whatever encoding your editor happens to
save the file in.

And that's point two: make sure your editor reads and writes the file in the
same encoding you specified in the comment at the beginning.
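
To illustrate, here is a minimal sketch of how the same text becomes different
byte-strings depending on the encoding. It uses escapes instead of literal
non-ASCII characters so the result does not depend on the editor; the variable
names are only for illustration.

```python
# The unicode object for the word from the example, written with escapes
# so the source file itself stays pure ASCII.
text = u"\xc5ngstr\xf6m"

# The byte-strings two differently configured editors would store.
utf8_bytes = text.encode("utf-8")
latin1_bytes = text.encode("latin-1")

print(len(text))          # number of characters
print(len(utf8_bytes))    # number of bytes in UTF-8
print(len(latin1_bytes))  # number of bytes in Latin-1

# Reading UTF-8 bytes as if they were Latin-1 yields garbled text.
garbled = utf8_bytes.decode("latin-1")
print(repr(garbled))
```

The last step is one plausible source of the garbled output: UTF-8 bytes that
get interpreted in a different encoding somewhere along the way.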

> 
> When the same code is run on earlier versions, however, the result is
> either garbled text (e.g. "Ã" or "?" instead of "Å" in Python 2.5), or
> an exception being thrown (Python 2.4):
> 
>     UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 60: ordinal not in range(128)

Makes sense if the execute tries to decode the statement to unicode first
(using the default ascii codec) - as you didn't give it a unicode-object.
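
A small sketch of that failure, independent of MySQLdb: decoding a UTF-8
encoded byte-string with the ascii codec fails at the first non-ASCII byte.
The statement text is only an illustration, and the "except ... as" syntax
needs Python 2.6 or later.

```python
# A byte-string as it would sit in a UTF-8 encoded source file.
query = u"INSERT INTO `table` VALUES (0, '\xc5ngstr\xf6m')".encode("utf-8")

error_start = None
try:
    # This is what an implicit conversion to unicode amounts to
    # when the default codec is ascii.
    query.decode("ascii")
except UnicodeDecodeError as exc:
    error_start = exc.start
    print(exc)

# Decoding with the encoding the bytes were actually written in works.
decoded = query.decode("utf-8")
print(repr(decoded))
```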

> 
> So far I've tried a number of different things, including:
> 
>     1. Using Unicode strings (e.g. u"\u212B")
> 
>     2. Manually specifying the encoding using sys.setdefaultencoding('utf-8')
> 
>     3. Manually enabling Unicode support in MySQLdb (use_unicode=False, charset = "utf8")

You *disabled* unicode here (use_unicode=False)! And unicode is NOT utf-8:
utf-8 is just one way of encoding unicode text into bytes.

http://www.joelonsoftware.com/articles/Unicode.html
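
Something along these lines is what I would try: a rough sketch, assuming your
MySQLdb version accepts the use_unicode and charset connect arguments and that
the table has two columns. The helper names (open_connection, insert_name) and
the column layout are made up for illustration; it is untested against your
setup.

```python
def open_connection(host, user, passwd, db):
    """Open a connection that exchanges unicode objects with MySQLdb."""
    # Imported lazily so the rest of the sketch can be read without MySQLdb.
    import MySQLdb
    return MySQLdb.connect(host=host, user=user, passwd=passwd, db=db,
                           use_unicode=True, charset="utf8")


def insert_name(conn, row_id, name):
    """Insert one row, passing the unicode object as a query parameter."""
    cursor = conn.cursor()
    # Let the driver do the quoting and encoding instead of pasting
    # the text into the SQL string.
    cursor.execute("INSERT INTO `table` VALUES (%s, %s)", (row_id, name))
    conn.commit()
    cursor.close()
```

You would then call insert_name(conn, 0, u"\xc5ngstr\xf6m") with a real
unicode object rather than a byte-string.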


> 
> ...but no combination of any of the above resulted in proper database
> content.
> 
> To be certain that the issue was related to Python/MySQLdb and not
> MySQL itself, I manually inserted the text and it worked just fine.
> Furthermore, when working in a Python console, both print "Å" and
> print u"\u212B" display the correct output.
> 
> Any ideas? The versions of the MySQLdb adapter tested were 1.2.1
> (Python 2.4), and 1.2.2-10 (Python 2.5).

Try the above, and better yet provide self-contained examples that show the
behavior.

Diez


