[Web-SIG] WSGI for Python 3

Sun Jul 18 06:55:56 CEST 2010

On Fri, 2010-07-16 at 23:38 -0500, Ian Bicking wrote:
> On Fri, Jul 16, 2010 at 9:43 PM, Chris McDonough <chrism at plope.com>
> wrote:
>         
>         > Nah, not nearly that hard:
>         >
>         > path_info = urllib.parse.unquote_to_bytes(
>         >     environ['wsgi.raw_path_info']).decode('UTF-8')
>         >
>         > I don't see the problem?  If you want to distinguish %2f from /,
>         > then you'll do it slightly differently, like:
>         >
>         > path_parts = [
>         >     urllib.parse.unquote_to_bytes(p).decode('UTF-8')
>         >     for p in environ['wsgi.raw_path_info'].split('/')]
>         >
>         > This second recipe is impossible to do currently with WSGI.
>         >
>         > So... before jumping to conclusions, what's the hard part with
>         > using text?
>         
>         
>         It's extremely hard to swallow Python 3's current disregard for
>         the primacy of bytes at I/O boundaries.  I'm trying, but I can't
>         help but feel that the existence of an API like
>         "unquote_to_bytes" is more symptom treatment than solution.  Of
>         course something that unquotes a URL segment unquotes it into
>         bytes; it's the only sane default because URL segments found in
>         URLs on the internet are bytes.
> 
> Yes, URL quoted strings should decode to bytes, though arguably it is
> also reasonable to use the sensible UTF-8 default that
> urllib.parse.quote/unquote uses.  So it's really just a question of
> names: should it be quote_to_string or quote_to_bytes that gets the
> name?  Which honestly... whatever.
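
For reference, a minimal sketch of the two behaviours being compared
above, assuming Python 3's urllib.parse as it ships in the standard
library. The sample segments are made up for illustration; the second
one is deliberately not valid UTF-8.

```python
from urllib.parse import unquote, unquote_to_bytes

# Illustrative segment: "café" percent-encoded as UTF-8.
utf8_segment = 'caf%C3%A9'

# unquote_to_bytes() stops at the byte level.
utf8_bytes = unquote_to_bytes(utf8_segment)

# unquote() goes one step further and decodes, using UTF-8 by default.
utf8_text = unquote(utf8_segment)

# Illustrative segment: "café" percent-encoded as Latin-1, which is not
# a valid UTF-8 byte sequence.
latin1_segment = 'caf%E9'

# The bytes survive intact...
latin1_bytes = unquote_to_bytes(latin1_segment)

# ...but the text default (errors='replace') substitutes U+FFFD.
latin1_text = unquote(latin1_segment)
```

With the byte-level call the caller still chooses the encoding; with the
text-level default, the original byte in the second segment cannot be
recovered from the result.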

After some careful consideration, I realize I'm only able to offer stop
energy regarding the WSGI-as-text proposal, so I'll bow out of any
mailing-list conversation about it for now.

- C
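
The two recipes quoted at the top of this message can be tried side by
side. This is only a sketch: 'wsgi.raw_path_info' is the key proposed in
the thread, not something a WSGI server is known to provide, and the
environ dict and its sample path are invented for the example.

```python
from urllib.parse import unquote_to_bytes

# Hypothetical environ; a real server would have to supply this key.
environ = {'wsgi.raw_path_info': '/files/a%2Fb/caf%C3%A9'}
raw = environ['wsgi.raw_path_info']

# Recipe 1: unquote the whole path, then decode it as UTF-8.
# An encoded slash (%2F) becomes indistinguishable from a literal one.
path_info = unquote_to_bytes(raw).decode('UTF-8')

# Recipe 2: split on literal slashes first, then unquote each segment.
# An encoded slash stays inside its segment.
path_parts = [unquote_to_bytes(p).decode('UTF-8') for p in raw.split('/')]

print(path_info)   # /files/a/b/café
print(path_parts)  # ['', 'files', 'a/b', 'café']
```

The leading empty string in path_parts comes from the path starting
with "/". The second recipe only works when the still-quoted path is
available, which is the point made in the quoted text.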