Mailman archiver (inadvisable) translates Latin 1 characters

Erno Kuusela erno-news at erno.iki.fi
Thu Aug 9 17:13:06 EDT 2001


In article <Pine.LNX.4.21.0108092013100.1780-100000 at cens.ioc.ee>,
Pearu Peterson <pearu at cens.ioc.ee> writes:

| This message was originally sent to mailman users group but due to the
| delay there I'll try this newsgroup if someone could help me.

| ---------- Forwarded message ----------
| Date: Thu, 9 Aug 2001 17:46:28 +0200 (EET)
| From: Pearu Peterson <pearu at cens.ioc.ee>
| To: mailman-users at python.org
| Subject: Latin 1 characters


| Hi!

| I have noticed that latin 1 (or rather non US ascii) characters are mapped
| to 3-strings starting with = character when messages are shown in the
| Mailman archive. For example,

| õ -> =F5
| Õ -> =D5
| ä -> =E4
| Ä -> =C4
| ö -> =F6
| Ö -> =D6
| ü -> =FC
| Ü -> =DC
| \t -> =09

actually it is not mailman that does this, it is the sender. some mail
user agents escape all non-ascii characters this way, even though
practically all non-8-bit-clean mail software has disappeared from the
face of the earth by now.

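the archive part of mailman does not know how to decode this
encoding (it is called "quoted-printable", also often called
"quoted-unreadable" :) ). each =XX is just the byte value of the
original character written as two hex digits, so õ, which is 0xF5
in latin 1, becomes =F5.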

i guess mailman will have to be taught how to cope with this.  you
could also use some other web archiving software. i have good
experience with mhonarc. (it is nicer in some other respects too.)

the quopri module in the standard library may help you if you
decide to tackle this yourself.
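
as a rough sketch of what decoding could look like with quopri (the
helper name decode_qp and the latin-1 default are my assumptions for
illustration; a real archiver should take the charset from the
message's Content-Type header and only decode parts marked
"Content-Transfer-Encoding: quoted-printable"):

```python
import quopri


def decode_qp(text, charset="latin-1"):
    """Decode a quoted-printable string into readable text.

    `charset` is assumed to be latin-1 here; in practice it should
    come from the message's Content-Type header.
    """
    # quoted-printable text is plain ASCII, so encode it to bytes first
    raw = quopri.decodestring(text.encode("ascii"))
    # the decoded bytes are in the sender's charset
    return raw.decode(charset)


# the sequences from the original report
sample = "=F5 =D5 =E4 =C4 =F6 =D6 =FC =DC"
print(decode_qp(sample))
```

this only handles the body text. encoded headers (the =?iso-8859-1?Q?...?=
form) use a related but separate scheme and would need their own handling.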

  -- erno



