[Mailman-Developers] Please Allow Me To Introduce Myself...

Les Niles les@2pi.org
Thu, 7 Mar 2002 22:08:54 -0800


On Thu, 7 Mar 2002 00:20:26 -0500 barry@zope.com (Barry A. Warsaw) wrote:
>Here's the basic problem: there are lots of different use cases that
>fall under the rubric "filtering HTML".  Some people want it stripped,
>some want it transformed, do we preserve links, etc, etc.  It's hard
>to support everything everyone wants to do with HTML messages, /and/
>do it in a way that's intuitive and easy to configure through the web.
>I'm not saying it's impossible, but it's a lot of work, and MM2.1 has
>to get to beta RSN.  Plus, I think there are viable options (for the
>short term) without having this functionality in Mailman proper.
>E.g. demime.

Translation systems, whether speech recognition, natural language
translation, or reformatting the content of email, are
fundamentally imperfect.  That's why worrying about making an HTML
filter intuitive and easily configurable is important -- those
attributes are exactly what make a translation system usable and
useful.

I think the various types of "filtering HTML" fit into a couple of
broad categories.  One is translation: converting HTML content into
some other format in which most or all of the semantic content is
maintained.  The other is stripping: removing HTML sections
entirely.  The former is harder, more error-prone, and open to
incompatible interpretations of what constitutes the "right"
translation.  Stripping, OTOH, is simpler and more predictable, and
can usefully be applied to other MIME types that a list admin might
deem verboten, but it is not nearly as powerful.  I'm suggesting that
both types of filtering are useful in MM's processing, but can and
probably should be presented as separate functions in the
configuration UI since their roles and characteristics are rather
different.
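
To make the distinction concrete, here is a rough sketch in Python
using only the standard library.  The names html_to_text and
strip_types are made up for illustration; they are not Mailman
functions, and a real translator would have to decide what to do with
links, tables, scripts and so on.

```python
from email.message import EmailMessage
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Collects the text nodes of an HTML document and ignores the tags."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        # Called for every run of text between tags.  Note that this also
        # receives the contents of <script> and <style>, which a real
        # translator would want to skip.
        self.chunks.append(data)


def html_to_text(html):
    """Translation (hypothetical helper): keep the words, drop the markup."""
    parser = _TextExtractor()
    parser.feed(html)
    parser.close()
    # Collapse runs of whitespace so the result reads as plain text.
    return " ".join("".join(parser.chunks).split())


def strip_types(msg, banned):
    """Stripping (hypothetical helper): remove every leaf part whose MIME
    type is in `banned`.  Works for any type, not just text/html."""
    if not msg.is_multipart():
        return
    kept = []
    for part in msg.get_payload():
        if part.is_multipart():
            strip_types(part, banned)
            kept.append(part)
        elif part.get_content_type() not in banned:
            kept.append(part)
    msg.set_payload(kept)


# Build a multipart/alternative message with plain-text and HTML parts.
msg = EmailMessage()
msg.set_content("Hello world")
msg.add_alternative("<p>Hello <b>world</b></p>", subtype="html")
```

Stripping needs only one knob (the set of banned types), e.g.
strip_types(msg, {"text/html"}), while translation hides a pile of
policy decisions inside html_to_text.  That difference is the reason
for keeping the two apart in the UI.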

  -les