utf8 encoding problem
"Martin v. Löwis"
martin at v.loewis.de
Sun Jan 25 03:58:11 EST 2004
Andrew Clover wrote:
> Quite so, in theory. Of course in reality, no browser today includes a
> Content-Type header in the subparts of a multipart/form-data submission,
> so there's nowhere to specify a charset here either! argh.
Right. In this case, the algorithm Wichert quotes should apply.
I once tried to study why browsers won't send Content-Type headers.
Actually, they *do* send Content-Type headers, but omit the charset=
parameter. I submitted various bug reports, and the Mozilla people
replied that they had tried sending it, and found that various CGI
scripts would break when confronted with the standards-conforming
request, but work when they get the deprecated form.
So it looks like this situation will extend indefinitely.
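
To illustrate what a receiving script has to cope with, here is a minimal sketch in present-day Python 3 that reads the charset= parameter from a subpart's Content-Type value and falls back when it is absent. The function name part_charset and the fallback value are assumptions made for illustration; this is not the algorithm Wichert quotes, only the "parameter missing" check around it.

```python
from email.message import Message


def part_charset(content_type_value, fallback="utf-8"):
    """Return the charset= parameter of a Content-Type value.

    `fallback` is a hypothetical default supplied by the caller (for
    example, the encoding the form page itself was served in); it is
    used when the browser omitted the parameter.
    """
    msg = Message()
    msg["Content-Type"] = content_type_value
    # get_content_charset() returns None when no charset= is present.
    return msg.get_content_charset() or fallback


# What browsers typically send for a subpart: no charset= parameter.
print(part_charset("text/plain"))
# What a standards-conforming subpart could look like.
print(part_charset("text/plain; charset=ISO-8859-1"))
```

Since the parameter is usually missing, the fallback branch is the one that matters in practice.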
> multipart/form-data as implemented in current UAs is just as encoding-unaware
> as application/x-www-form-urlencoded, sadly. In practical terms it does not
> really matter much which is used.
Right - in practical terms, standards don't matter much. As this thread
shows, the form used *does* matter in practical terms though: users
of application/x-www-form-urlencoded are now confronted with the
unescaping-then-decoding issue, which apparently is a challenge.
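
The two steps can be sketched as follows in present-day Python 3. The helper name decode_field is hypothetical, and the charset is assumed to be known by other means (typically the encoding of the page that carried the form), since the request itself does not state it.

```python
from urllib.parse import unquote_to_bytes


def decode_field(raw, charset="utf-8"):
    """Turn one urlencoded field value into text.

    `charset` is an assumption the caller has to supply.
    """
    # Step 1: unescape. '+' stands for a space in form data, and %XX
    # escapes become single octets. The result is bytes, not text.
    octets = unquote_to_bytes(raw.replace("+", " "))
    # Step 2: decode. Only now are the octets interpreted as characters.
    return octets.decode(charset)


# The same name submitted from a UTF-8 page and from a Latin-1 page.
print(decode_field("L%C3%B6wis"))
print(decode_field("L%F6wis", charset="iso-8859-1"))
```

Doing the steps in the other order, or unescaping straight into characters, is where the trouble starts: %C3%B6 is two octets but a single character in UTF-8.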
Regards,
Martin