[Python-Dev] Can the cgi module be made Unicode-aware?
Martin v. Loewis
martin@v.loewis.de
11 Apr 2002 18:26:55 +0200
Skip Montanaro <skip@pobox.com> writes:
> I did some reading before nodding off last night. The <form> tag takes an
> optional "accept-charset" attribute, which can be a list.
No, it doesn't - that's a proprietary extension. Or, maybe I'm missing
something: where did you find a statement that this is "official" in
any sense?
> As far as I can tell, the underlying data encoding of the form's data is
> generally going to be implicit.
Unfortunately. RFC 1867 specifies that browsers should label each
part of a multipart/form-data message with a Content-Type header, but
none of the current browsers does.
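
For illustration only, here is a minimal sketch of how a server could
pick up the charset when a browser does label a part. It uses the
standard email package rather than the cgi module; the function name
part_charset and the fallback argument are made up for this example
and are not an existing or proposed API.

```python
from email.parser import BytesParser


def part_charset(raw_part_headers, fallback=None):
    """Return the charset named in a part's Content-Type header.

    raw_part_headers is the header block of one multipart/form-data
    part, as bytes. If the part has no Content-Type header, or the
    header carries no charset parameter, fallback is returned.
    """
    # headersonly=True: only the header block is parsed, the body is ignored.
    msg = BytesParser().parsebytes(raw_part_headers, headersonly=True)
    return msg.get_content_charset(fallback)
```

Since browsers usually omit the header, the fallback value is what
would be used in practice, so the application still has to know the
encoding from somewhere else.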
> Adding an "accept-charset" attribute to the <form> does appear to
> have some effect on Content-Type in some instances, but not in all.
It might depend on the browser, since it's proprietary.
> The cgi programmer can't rely on charset information coming from the browser
> and will need a way to tell the cgi module what the charset of the incoming
> data is. I think FieldStorage and MiniFieldStorage need optional charset
> parameters and I think the charset needs to be used from the Content-Type
> header, if present.
Of course, if you also have uploaded files, this cannot work for the
whole request: the file data is never in that encoding - only the
"text" fields are.
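
A rough sketch of that distinction, assuming the request has already
been split into parts. The function decode_fields and the
(name, filename, raw_bytes) tuple layout are invented for this
example; they are not part of the cgi module.

```python
def decode_fields(parts, charset="utf-8"):
    """Decode text fields with charset; leave uploaded files as bytes.

    parts is an iterable of (name, filename, raw_bytes) tuples, where
    filename is None for an ordinary text field (hypothetical layout).
    """
    decoded = {}
    for name, filename, raw in parts:
        if filename is None:
            # Text field: the form's charset applies.
            decoded[name] = raw.decode(charset)
        else:
            # File upload: arbitrary binary data, so do not decode it.
            decoded[name] = raw
    return decoded
```

A caller would pass whatever charset it knows the form was submitted
in, for example decode_fields(parts, "iso-8859-1").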
Regards,
Martin