[Mailman-Developers] Transformation of non-ascii characters - was: bounce problem w/ 2.1.11rc1 and GMail

Mark Sapiro mark at msapiro.net
Mon Jun 16 01:00:44 CEST 2008


Fil wrote:
>
>I found another minor issue today: my name contains a "=E8", and when I
>subscribe it is transformed into &#232; in the database.


There may be an issue with your MemberAdaptor. With
OldStyleMemberships.py, accented characters are accepted and stored as
themselves in the usernames dictionary.


>Why not. But
>then it is presented to me everywhere on the web UI as &amp;#232;
>which displays the code and not the character. I can live with that
>but it's a bit ugly :-)


They are only displayed as &#ddd; in the web interface if the language
of the page is English. If the list's preferred language is e.g.,
French, they will display correctly in the admin interface, and if the
user's preferred language is e.g., French, they will display correctly
on the user's options page.

Granted, the conversion of &#232; to &amp;#232; is ugly, but it would
be hard to change. The conversion of \xe8 to &#232; is done by
Utils.uncanonstr() which will convert any character that can't be
encoded in the relevant character set (us-ascii for English) to the
&#ddd; form. Then, in order to protect against XSS attacks, values to
be displayed are passed through cgi.escape() which converts the & to
&amp;.
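
The two steps can be reproduced outside Mailman with a rough sketch like
the one below. It does not call Mailman itself: to_charrefs() is a
hypothetical stand-in that only imitates the replacement behaviour
described for Utils.uncanonstr(), html.escape() stands in for
cgi.escape() (which is gone from recent Python releases), and the use of
iso-8859-1 for a French page is an assumption.

```python
from html import escape  # stand-in for the cgi.escape() named above


def to_charrefs(text, charset="us-ascii"):
    """Hypothetical imitation of Utils.uncanonstr(): any character that
    cannot be encoded in `charset` becomes a numeric &#ddd; reference."""
    return text.encode(charset, "xmlcharrefreplace").decode(charset)


name = "Ren\xe8"  # made-up name containing e-grave (code point 232)

# English page: us-ascii cannot hold \xe8, so it becomes a reference,
# and escaping then turns the leading & into &amp;.
step1 = to_charrefs(name, "us-ascii")      # 'Ren&#232;'
step2 = escape(step1)                      # 'Ren&amp;#232;'

# Page whose charset can hold the character (iso-8859-1 assumed here):
# nothing is replaced, so there is no & for the escape step to touch.
french = escape(to_charrefs(name, "iso-8859-1"))  # 'Ren\xe8'

print(step1, step2, french)
```

The browser renders &amp;#232; as the literal text "&#232;", which is
the code Fil sees instead of the character.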

The real fix might be for cgi.escape() to be smarter :-)
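
As a sketch only of what "smarter" could mean here (this is not Mailman
code, and escape_keep_charrefs() is a hypothetical name), an escape
function could leave an & alone when it already starts a decimal &#ddd;
reference and escape everything else as usual. Whether that is safe
enough for every place Mailman displays user-supplied values is not
something this sketch settles.

```python
import re

# An ampersand that does NOT begin a decimal character reference (&#ddd;).
_BARE_AMP = re.compile(r"&(?!#\d+;)")


def escape_keep_charrefs(text):
    """Escape &, < and > for HTML, but keep existing &#ddd; references."""
    text = _BARE_AMP.sub("&amp;", text)
    return text.replace("<", "&lt;").replace(">", "&gt;")


print(escape_keep_charrefs("Ren&#232; <b> & co"))
# prints: Ren&#232; &lt;b&gt; &amp; co
```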

-- 
Mark Sapiro <mark at msapiro.net>        The highway is for gamblers,
San Francisco Bay Area, California    better use your sense - B. Dylan


