Python 2.4 vs 2.5 - Unicode error

Wolfgang Rohdewald wolfgang at rohdewald.de
Thu Jan 22 01:10:50 EST 2009


On Wednesday, 21 January 2009, Gaurav Veda wrote:
> UnicodeDecodeError: 'ascii' codec can't decode byte 0xc2 in position
> 4357: ordinal not in range(128)
> 
> Before sending the (insert) query to the mysql server, I do the
> following which I think should've taken care of this problem:
>  sqlStr = sqlStr.replace('\\', '\\\\')

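Your replace() only escapes backslashes; it does not touch non-ASCII
bytes such as 0xc2, so it cannot prevent this error.

You might consider using what MySQL offers for Unicode: store all
strings as Unicode (UTF-8, for example). It might be more work now,
but I think it would be a good investment for the future.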
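
To illustrate where the error comes from, here is a minimal sketch. The
sample bytes and the helper name decode_page are made up, and it is
written so that it runs on current Python versions, where byte strings
and unicode strings are separate types. In Python 2.x the same ASCII
decode happens implicitly when a byte string is mixed with a unicode one.

```python
# Made-up sample: the copyright sign encoded as UTF-8 is the bytes 0xc2 0xa9.
raw = b'Copyright \xc2\xa9 2009'

def decode_page(data, encoding='utf-8'):
    """Decode a byte string explicitly instead of relying on ASCII."""
    return data.decode(encoding)

# Decoding as ASCII fails on byte 0xc2, as in the traceback above.
try:
    raw.decode('ascii')
    ascii_ok = True
except UnicodeDecodeError:
    ascii_ok = False

text = decode_page(raw)        # unicode string
stored = text.encode('utf-8')  # bytes again, suitable for a UTF-8 column
```

The point is to decode once, as early as possible, and to encode again
only at the boundary where the data leaves the program.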

Have a look at the MySQL documentation for:

mysql_real_escape_string(), which takes care of escaping quotes and
other special characters.

mysql_set_character_set(), which sets the character set used by the
database connection.
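
From Python you normally do not call these C functions yourself. As a
rough sketch, assuming the MySQLdb driver: passing the values as a
second argument to cursor.execute() lets the driver do the escaping, and
as far as I know the charset and use_unicode arguments of
MySQLdb.connect() take care of the connection character set. The helper
save_page and the table and column names are made up for illustration.

```python
def save_page(cursor, url, text):
    """Insert one page and let the driver escape both values.

    `cursor` is any DB-API cursor that uses the %s placeholder style
    (MySQLdb does). Table and column names are hypothetical.
    """
    sql = "INSERT INTO pages (url, body) VALUES (%s, %s)"
    # No manual replace() or quoting: the values travel separately from the SQL.
    cursor.execute(sql, (url, text))

# Assumed usage with MySQLdb (not run here; needs a server and credentials):
#   import MySQLdb
#   conn = MySQLdb.connect(host=..., user=..., passwd=..., db=...,
#                          charset='utf8', use_unicode=True)
#   save_page(conn.cursor(), url, text)
#   conn.commit()
```

With this approach the sqlStr.replace() step is no longer needed.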

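You can make sure the web page is converted to unicode by doing
something like this (page is the raw byte string you downloaded):

    import re

    # look for charset=... in the page's Content-Type declaration
    charsetregex = re.compile(r'charset=(.*?)[\"&]')
    charsetmatch = charsetregex.search(page)
    if charsetmatch:
        charset = charsetmatch.group(1)
        # decode the byte string into a unicode object
        unicodeText = unicode(page, charset)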
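
A slightly more defensive variant of the same idea, as a sketch only:
the function name to_unicode and the fallback to latin-1 are my
assumptions, not something the page or MySQL requires. It is written for
current Python versions, where the downloaded page is a bytes object.

```python
import re

# Matches charset=utf-8, charset="utf-8" and charset='utf-8'.
CHARSET_RE = re.compile(br'charset=["\']?([A-Za-z0-9_.:-]+)')

def to_unicode(page, default='latin-1'):
    """Decode raw page bytes using the charset the page declares.

    Falls back to `default` when no charset is declared, the declared
    name is unknown, or the bytes do not match the declared charset.
    """
    match = CHARSET_RE.search(page)
    charset = match.group(1).decode('ascii') if match else default
    try:
        return page.decode(charset)
    except (LookupError, UnicodeDecodeError):
        # With the latin-1 default every byte maps to a character,
        # so this second decode cannot fail.
        return page.decode(default)
```

Pages that declare no charset, or declare a wrong one, then still give
you a unicode string instead of an exception, at the cost of possibly
wrong characters.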

-- 
Wolfgang


