[Mailman-i18n] "Funny" characters in real names?

Barry A. Warsaw barry@zope.com
Fri, 13 Sep 2002 19:52:16 -0400


>>>>> "BG" == Ben Gertzfield <che@debian.org> writes:

    BG> When submitting an HTML form, the character set used for the
    BG> submitted data is the same as the one specified in the HTML or
    BG> header of the original form's page.

I must be dense because I'm not quite seeing how this will work.

I visit the mass subscribe page and in the text box, I enter a funny
name, e.g. barry@python.org (Barry Wârsaw)
My list is conducted in English.

Now when I look at all the data submitted by the form, I don't see
anything immediately useful in either the cgi environment or in the
form data.  Here are some excerpts:

CONTENT_TYPE: multipart/form-data; boundary=---------------------------527473093431726113359136092

Hmm, nothing there.

HTTP_ACCEPT_CHARSET: ISO-8859-1, utf-8;q=0.66, *;q=0.66

That doesn't really help us, does it?  That's telling us what charsets
the browser will accept, right?  Not the same thing.
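
To make the distinction concrete, here is a minimal sketch that parses
the Accept-Charset value shown above.  The helper name
parse_accept_charset is hypothetical (it is not part of Mailman or the
standard library).  All it recovers is a ranked list of charsets the
browser is willing to receive in a response, which is why it says
nothing about how the submitted form data was encoded.

```python
def parse_accept_charset(header):
    """Return (charset, q) pairs, most preferred first.

    Hypothetical helper for illustration only.
    """
    prefs = []
    for item in header.split(","):
        item = item.strip()
        if not item:
            continue
        name, _, params = item.partition(";")
        q = 1.0  # a missing q parameter means full preference
        for param in params.split(";"):
            key, _, value = param.strip().partition("=")
            if key.strip().lower() == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0
        prefs.append((name.strip().lower(), q))
    # sorted() is stable, so ties keep the order given in the header
    return sorted(prefs, key=lambda pair: -pair[1])


prefs = parse_accept_charset("ISO-8859-1, utf-8;q=0.66, *;q=0.66")
for charset, q in prefs:
    print(charset, q)
```

These are response preferences only; the list has no entry that
describes the request body.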

Now for the form data, I'll see a section like:

-----------------------------527473093431726113359136092
Content-Disposition: form-data; name="subscribees"

warsaw@wooz.org (Barry W&#226;rsaw)
-----------------------------527473093431726113359136092

This doesn't tell me enough either, does it?
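
A rough way to check this is to feed the same part to a MIME parser and
ask it for a charset.  The sketch below uses the email package from
current Python 3, which is an assumption on my part and not what the
Mailman CGI code does.  The body is rebuilt from the excerpt above, with
a closing boundary delimiter added so that it parses as a complete
message.

```python
from email import message_from_bytes

# Boundary and field content are taken from the excerpt above.
boundary = "---------------------------527473093431726113359136092"
lines = [
    "Content-Type: multipart/form-data; boundary=" + boundary,
    "",
    "--" + boundary,
    'Content-Disposition: form-data; name="subscribees"',
    "",
    "warsaw@wooz.org (Barry W&#226;rsaw)",
    "--" + boundary + "--",  # closing delimiter, added for this sketch
    "",
]
raw = "\r\n".join(lines).encode("ascii")

msg = message_from_bytes(raw)
part = msg.get_payload()[0]

# The part names its form field but carries no Content-Type header,
# so there is no charset parameter for the parser to report.
field_name = part.get_param("name", header="content-disposition")
part_charset = part.get_content_charset()

print(field_name)
print(part_charset)
print(part.get_payload())
```

The parser finds the field name but reports no charset for the part,
since the part has only a Content-Disposition header.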

So I'm at a loss as to where to suck the information out of the form
data or the environment to figure out what charset the form was posted
in.

-Barry