[Web-SIG] HTTP headers encoding

And Clover and-py at doxdesk.com
Thu Dec 3 20:11:54 CET 2009


Manlio Perillo wrote:

> I have written a simple WSGI application that asks authentication
> credentials

Ho ho! This is another area that is Completely Broken Everywhere. It's 
actually a similar situation to the cookies:

- Opera and Chrome send non-ASCII cookie characters in UTF-8.
- IE encodes using the system codepage (which can never be UTF-8),
   mangling any characters that don't fit in the codepage through the
   traditional Windows 'similar replacement character' scheme.
- Mozilla uses the low byte of each UTF-16 code unit (so ISO-8859-1
   gets through but everything else is mangled).
- Safari uses ISO-8859-1, and refuses to send any cookie containing
   characters outside the 8859-1 repertoire.
- Konqueror uses ISO-8859-1, and replaces any non-8859-1 character
   with a question mark.
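
As a rough illustration of the list above, here is a Python 3 sketch that
mimics each behaviour for one cookie value. The function names are made up
for this example, cp1252 stands in for "the system codepage", and Python's
'replace' error handler (which emits '?') is only a crude stand-in for the
Windows similar-character substitution.

```python
def opera_chrome(value):
    # UTF-8 on the wire.
    return value.encode("utf-8")

def ie(value, codepage="cp1252"):
    # System codepage (assumed cp1252 here). Windows would pick a
    # similar-looking character where it can; 'replace' just gives '?'.
    return value.encode(codepage, errors="replace")

def mozilla(value):
    # Low byte of each UTF-16 code unit: take every second byte of the
    # little-endian encoding.
    return value.encode("utf-16-le")[0::2]

def safari(value):
    # ISO-8859-1, or no cookie at all if any character does not fit.
    try:
        return value.encode("iso-8859-1")
    except UnicodeEncodeError:
        return None

def konqueror(value):
    # ISO-8859-1, with '?' for anything outside the repertoire.
    return value.encode("iso-8859-1", errors="replace")

value = "caf\u00e9 \u20ac"  # e-acute fits ISO-8859-1, the euro sign does not
for browser in (opera_chrome, ie, mozilla, safari, konqueror):
    print(browser.__name__, browser(value))
```

Five browsers, five different answers for the same string, and nothing in
the request tells the server which one it is looking at.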

The HTTP standard has nothing to say about the encoding in use *inside* 
the base64-encoded Authorization byte-string token. It's anyone's guess, 
and every browser has guessed differently. (Safari here is at least 
slightly better than its behaviour with the cookies.)
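
For what it is worth, about the most a WSGI application can do is pull the
raw bytes out of the token and then guess. A minimal Python 3 sketch,
assuming Basic authentication; basic_credentials and guess_text are
hypothetical helper names, and the UTF-8-then-ISO-8859-1 fallback is just one
possible heuristic, not something any standard prescribes.

```python
import base64
import binascii

def basic_credentials(environ):
    """Return (user, password) as raw bytes, or None if absent or malformed."""
    header = environ.get("HTTP_AUTHORIZATION", "")
    scheme, _, token = header.partition(" ")
    if scheme.lower() != "basic":
        return None
    try:
        raw = base64.b64decode(token.strip(), validate=True)
    except (binascii.Error, ValueError):
        return None
    user, sep, password = raw.partition(b":")
    if not sep:
        return None
    return user, password

def guess_text(raw):
    # Pure guesswork: bytes that happen to be valid UTF-8 are treated as
    # UTF-8; anything else is read as ISO-8859-1, which never fails to
    # decode but may well be the wrong answer.
    try:
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        return raw.decode("iso-8859-1")
```

Keeping the credentials as bytes until the last moment at least avoids
baking a wrong guess into the rest of the application.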

> (and I suspect that [IE] always use this encoding, instead of
> iso-8859-1).

It will certainly never send ISO-8859-1, but what it does send is locale 
dependent. Type an e-acute in your username on a Western machine and 
it'll send one byte sequence; type the same thing on an Eastern European 
Windows install and you'll get something quite different.
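
The server sees the same problem from the other side: a byte only means
something once the codepage is known, and the request does not say. A small
Python 3 sketch; the three codepages are picked arbitrarily for
illustration, and decode_with is a made-up helper name.

```python
def decode_with(raw, codepages):
    # Decode the same bytes under each candidate Windows codepage.
    return {cp: raw.decode(cp) for cp in codepages}

# One byte, as it might arrive inside a username.
readings = decode_with(b"\xe9", ("cp1252", "cp1251", "cp1253"))
for codepage, text in readings.items():
    print(codepage, ascii(text))
```

Western European (cp1252) reads that byte as e-acute, Cyrillic (cp1251) as a
Cyrillic letter, and Greek (cp1253) as a Greek one.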

> Firefox (Iceweasel 3.0.14, Linux Debian Squeeze) sends me a '\xac'

> I don't know where \xac come from

It's the low byte of UCS-2 codepoint U+20AC (EURO SIGN). Firefox simply 
discards the top 8 bits of each codepoint.
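
The arithmetic is easy to check in Python 3. This sketch masks each code
point directly, which matches the behaviour described only for characters
in the Basic Multilingual Plane; low_byte is a made-up name.

```python
def low_byte(text):
    # Keep only the low 8 bits of each code point (BMP characters only;
    # astral characters would be two UTF-16 code units in reality).
    return bytes(ord(ch) & 0xFF for ch in text)

print(low_byte("\u20ac"))  # EURO SIGN
print(low_byte("\u00ac"))  # NOT SIGN
```

Both calls give the same single byte 0xAC, so the server cannot recover
which character was actually typed.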

> Unfortunately I can not test with IE 7 and 8.

The behaviour has not changed.

> This is really a mess.

Isn't it.

> How is authorization username handled in common WSGI frameworks?

No-one supports non-ASCII characters in HTTP authentication. Most web
authors simply move to cookies instead.

-- 
And Clover
mailto:and at doxdesk.com
http://www.doxdesk.com/


