web app breakage with utf-8

Stefan Behnel stefan.behnel-n05pAM at web.de
Thu Jul 6 13:16:53 EDT 2006


elmo wrote:
> Hello, after two days of failed efforts and googling, I thought I had
> better seek advice or observations from the experts. I would be grateful
> for any input.
> 
> We have various small internal web applications that use utf-8 pages for
> storing, searching and retrieving user input. They have worked fine for
> years with non-ASCII values, including Russian, Greek and lots of accented
> characters. They still do on an old version of python (2.2.1), and there's
> nothing in the code to decode/encode the input, it's *just worked*.
> 
> Recently however, while testing on a dev machine, I notice that any
> characters outside ASCII are causing SQL statement usage to break with
> UnicodeDecodeError exceptions with newer versions of python (2.3 and 2.4).
> There are a number of threads online, suggesting converting to unicode
> types, and similar themes, but I'm having no success. I am probably
> completely misunderstanding something fundamental. :-(
> 
> My first question is did something change for normal byte stream usage
> making it more strict? I'm surprised there aren't more problems
> like this online.
> 
> Is there a correct way to handle text input from a <FORM> when the page is
> utf-8 and that input is going to be used in SQL statements? I've tried
> things like (with no success): 
> sql = u"select * from blah where col='%s'" % input

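What about decoding the input explicitly, i.e. ... % unicode(input, "UTF-8") ?
When a unicode template is combined with a byte string, Python decodes the
bytes with the default ASCII codec, which fails on anything outside ASCII.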
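
A minimal sketch of that idea, under assumptions the original post does not
confirm: the form value arrives as a UTF-8 byte string, and the database is
an in-memory sqlite3 database used purely as a stand-in. The helper names
(decode_form_value, find_rows, demo) are hypothetical. For a byte string,
raw.decode("utf-8") does the same job as unicode(raw, "UTF-8"). The "?"
placeholder is sqlite3's parameter style; other DB-API drivers may use a
different one.

```python
import sqlite3


def decode_form_value(raw_bytes):
    """Turn the raw bytes of a form field from a UTF-8 page into text."""
    return raw_bytes.decode("utf-8")


def find_rows(conn, value):
    # Pass the value as a bound parameter instead of formatting it into
    # the SQL string, so the driver handles quoting and encoding.
    cursor = conn.execute("SELECT col FROM blah WHERE col = ?", (value,))
    return cursor.fetchall()


def demo():
    # Stand-in database (an assumption; the real one is not named).
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE blah (col TEXT)")
    conn.execute(
        "INSERT INTO blah (col) VALUES (?)",
        (u"\u0394\u03b5\u03bb\u03c4\u03b1",),  # Greek letters, as text
    )

    # The same five Greek letters as they would arrive over the wire.
    raw = b"\xce\x94\xce\xb5\xce\xbb\xcf\x84\xce\xb1"

    # Decoding with the ASCII codec fails on the first non-ASCII byte.
    try:
        raw.decode("ascii")
        ascii_failed = False
    except UnicodeDecodeError:
        ascii_failed = True

    rows = find_rows(conn, decode_form_value(raw))
    conn.close()
    return ascii_failed, rows
```

Decoding once at the input boundary and binding the value as a parameter
keeps byte strings out of the SQL text entirely, which also avoids the
quoting problems of '%s' formatting.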


> Doing sql = sql.decode('latin1') prior to execution prevents some
> UnicodeDecodeError exceptions, but the data retrieved from the tables
> is no longer usable, causing breakage when being used to create the output
> for the browser.
> 
> I really am at a loss for what is going wrong, when everything works fine
> on crusty old 2.2.1. What are others doing for capture, store, and output
> for web utf-8?

You didn't tell us what database you are using, which encoding your database
uses, which Python-DB interface library you deploy, and lots of other things
that might be helpful to solve your problem.
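
On the latin1 attempt quoted above, a small sketch of why it silences the
exception but damages the data, assuming the bytes really are UTF-8: latin-1
maps every possible byte to some character, so decoding never fails, but each
multi-byte UTF-8 sequence turns into several wrong characters. The function
name misdecode_as_latin1 is hypothetical.

```python
def misdecode_as_latin1(utf8_bytes):
    # Never raises, because latin-1 defines a character for every byte
    # value, but the result is only right if the bytes were latin-1.
    return utf8_bytes.decode("latin-1")


original = u"\u00e9"                 # e with acute accent
wire = original.encode("utf-8")      # two bytes: 0xC3 0xA9
wrong = misdecode_as_latin1(wire)    # two characters: U+00C3, U+00A9
resent = wrong.encode("utf-8")       # four bytes, no longer the original
```

The text that comes back out is longer than what went in, which matches
data that breaks once it is written to a UTF-8 page again.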

Stefan


