utf8 encoding problem

"Martin v. Löwis" martin at v.loewis.de
Sat Jan 24 03:26:40 EST 2004


Wichert Akkerman wrote:


>>P.S. According to the HTML standard, with the
>>application/x-www-form-urlencoded content type, form data are
>>restricted to ASCII codes:
[...]
> Luckily that is not true, otherwise it would be completely impossible to
> have websites using non-ascii input. To be specific, the encoding used
> for HTML forms is determined by:  [algorithm omitted]

As Denis explains, it is true. See section 17.13.4 of the HTML 4.01
specification:

application/x-www-form-urlencoded
... Non-alphanumeric characters are replaced by `%HH', a percent sign 
and two hexadecimal digits representing the ASCII code of the character.

So this content type is restricted to characters which have an ASCII
code, i.e. ASCII characters.
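
To illustrate the rule quoted above, here is a minimal Python sketch. The
function name percent_encode_ascii is hypothetical, and the sketch follows
only the quoted sentence: it deliberately ignores the parts of the section
elided by "..." and rejects any character that has no ASCII code.

```python
def percent_encode_ascii(text):
    """Encode text following only the sentence quoted from 17.13.4.

    Alphanumeric ASCII characters pass through unchanged; every other
    ASCII character becomes %HH, where HH is its ASCII code in hex.
    Characters without an ASCII code are rejected (an assumption of
    this sketch, matching the strict reading of the spec text).
    """
    out = []
    for ch in text:
        code = ord(ch)
        if code > 127:
            # No ASCII code exists, so the quoted rule cannot be applied.
            raise ValueError("no ASCII code for %r" % ch)
        if ch.isalnum():
            out.append(ch)
        else:
            out.append("%%%02X" % code)
    return "".join(out)
```

With this sketch, percent_encode_ascii("a&b=1") gives "a%26b%3D1", while
a string containing "ö" raises ValueError.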

To have non-ASCII input, use multipart/form-data:

multipart/form-data
...
The content type "multipart/form-data" should be used for submitting 
forms that contain files, non-ASCII data, and binary data.

This confirms that multipart/form-data is the content type to use for
non-ASCII data.
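
As a rough sketch of why multipart/form-data can carry non-ASCII data:
each part has its own headers, so it can declare a charset for its value.
The helper build_multipart, the boundary string, and the field name used
below are all made up for illustration; real code should also choose a
boundary that cannot occur in the data.

```python
def build_multipart(fields, boundary="----example-boundary", charset="utf-8"):
    """Build a multipart/form-data body from a dict of text fields.

    Each value is encoded with `charset`, and the part's Content-Type
    header names that charset so the receiver can decode it.
    Field names are assumed to be plain ASCII in this sketch.
    """
    delimiter = b"--" + boundary.encode("ascii")
    lines = []
    for name, value in fields.items():
        lines.append(delimiter)
        lines.append(
            ('Content-Disposition: form-data; name="%s"' % name).encode("ascii")
        )
        lines.append(
            ("Content-Type: text/plain; charset=%s" % charset).encode("ascii")
        )
        lines.append(b"")  # blank line separates part headers from the value
        lines.append(value.encode(charset))
    lines.append(delimiter + b"--")  # closing delimiter
    lines.append(b"")
    return b"\r\n".join(lines)
```

For example, build_multipart({"author": "L\u00f6wis"}) yields a body in
which the name is sent as UTF-8 bytes, labelled as such in the part header.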

Regards,
Martin



