[Web-SIG] parsing of urlencoded data and Unicode

Ian Bicking ianb at colorstudy.com
Mon Jul 28 22:40:42 CEST 2008


Manlio Perillo wrote:
> Hi.
> 
> In my WSGI framework:
> http://hg.mperillo.ath.cx/wsgix
> 
> I have, in the `http` module, the functions `parse_query_string` and
> `parse_simple_post_data`.
> 
> The first parses the query string and returns a dictionary of strings; the
> latter parses the application/x-www-form-urlencoded client body and
> returns a dictionary of strings and the charset used by the client for
> the Unicode encoding.
> 
> 
> Now, I'm wondering whether these two functions should return Unicode
> strings instead of plain strings.
> 
> I think that Unicode strings should be returned, but I would like to
> know what other web frameworks do.
> 
> Django seems to convert to Unicode, but the Python standard library does 
> not (and I would like to know if changes are planned for Python 3.x).

WebOb decodes the request data to str, then lazily decodes to unicode 
based on the request encoding.  The request encoding is a bit fuzzy to 
calculate, which is part of why the decoding is lazy, so that the 
request encoding can be set or changed at any time.
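
To illustrate the idea, here is a minimal sketch of lazy decoding. It is 
not WebOb's actual code: `parse_qs_bytes` and `LazyParams` are 
hypothetical names, and it assumes Python 3, where the raw 
percent-decoded values are bytes (`urllib.parse.unquote_to_bytes`) and the 
decoded values are str.

```python
from urllib.parse import unquote_to_bytes


def parse_qs_bytes(query):
    """Split a urlencoded string into (name, value) pairs of raw bytes."""
    pairs = []
    for field in query.split("&"):
        if not field:
            continue
        name, _, value = field.partition("=")
        # '+' means a space in application/x-www-form-urlencoded data.
        pairs.append((
            unquote_to_bytes(name.replace("+", " ")),
            unquote_to_bytes(value.replace("+", " ")),
        ))
    return pairs


class LazyParams:
    """Hypothetical parameter holder that decodes only on access."""

    def __init__(self, query, charset="utf-8"):
        self._raw = parse_qs_bytes(query)  # kept undecoded
        self.charset = charset             # may be reassigned at any time

    def get(self, name, default=None):
        # Decoding happens here, using whatever charset is set right now.
        for raw_name, raw_value in self._raw:
            if raw_name.decode(self.charset, "replace") == name:
                return raw_value.decode(self.charset, "replace")
        return default


params = LazyParams("name=caf%E9&q=a+b")
params.charset = "latin-1"   # corrected after the request was parsed
print(params.get("name"))
print(params.get("q"))
```

Because the raw bytes are kept, changing `charset` after parsing changes 
what later lookups return; nothing has to be re-parsed.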

-- 
Ian Bicking : ianb at colorstudy.com : http://blog.ianbicking.org

