[Mailman-Developers] NULL characters in e-mail.

Barry A. Warsaw bwarsaw@beopen.com
Thu, 21 Sep 2000 15:11:53 -0400 (EDT)


>>>>> "CVR" == Chuq Von Rospach <chuqui@plaidworks.com> writes:

    CVR> Because of this, I think Mailman needs to be sensitive to
    CVR> NULL characters and strip them from messages during
    CVR> processing. If they exist in a message, they need to be
    CVR> deleted.

I'm not sure what to do about this.  On the one hand, I believe
Mailman should be as transparent as possible so that what comes in is
what goes out.

On the other hand, perhaps it should try to be more friendly than
that.  On the third hand (okay, I'll start using my feet), going down
that road can be a slippery slope, and may mask problems in other
components in the chain (MTA, web server, MUA).  On the other foot,
how common is this problem?  It's hard for me to know: with the tools
I use, I have to deliberately stick a NUL in there, so I don't know
how common accidental NULs are in the Real World.  And I haven't
become aware of any /practical/ problem with any of the admittedly
geek-loaded lists on python.org.
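
For concreteness, the stripping CVR proposes would amount to something
like the following.  This is only a sketch in present-day Python 3, not
anything Mailman does; the name strip_nuls is made up for illustration,
and it treats the message as raw bytes without regard to MIME structure.

```python
def strip_nuls(raw_message: bytes) -> bytes:
    """Return raw_message with every NUL (0x00) byte removed.

    Hypothetical helper, not part of Mailman. It works on the raw bytes
    and ignores MIME structure, so it would also alter any binary
    content sent without a transfer encoding.
    """
    return raw_message.replace(b"\x00", b"")


original = b"Subject: testing NULs\n\nnull byte ==>\x00<==\n"
cleaned = strip_nuls(original)
print(len(original) - len(cleaned))  # number of bytes removed
```

The operation is trivial but lossy: once applied, what goes out is no
longer byte-for-byte what came in.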

>>>>> "AM" == Andrew McNamara <andrewm@connect.com.au> writes:

    AM> Speaking for the Postfix MTA, it attempts to be as transparent
    AM> as possible. NULL's will be passed through in the body
    AM> (although headers are a different matter). I would suggest
    AM> this is not going to change.

    AM> If we were just talking about individual postings, I would say
    AM> "fair enough - if the sender includes crap, let it through",
    AM> but digests are a different matter, as the bogosity will
    AM> affect other people's postings (for similar reasons, list
    AM> admins often want to remove attachments when building a
    AM> digest).

I did some of my own testing here, with some hand-crafted messages
containing NUL bytes, e.g. in Python:

# \000 below is the octal escape for a single NUL byte
msg = '''\
Subject: testing NULs
To: testlist

testing hello!
There is a null byte coming up ==>\000<==
Did you see it?

-Barry
'''

This message gets sent through the system and arrives in my inbox
unmangled (except of course that the \000 gets printed as ^@ in the
message -- it's still NUL).  I'm using Postfix as my MTA and XEmacs/VM
as my MUA.  So mail delivery seems fine from my POV.
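
The by-eye check can also be done mechanically.  Here is a sketch
(Python 3, with a made-up helper name find_nuls) that splits a raw
message at the first blank line and reports the offsets of NUL bytes in
the header and in the body separately, since an MTA may treat the two
differently.  It assumes LF line endings.

```python
def find_nuls(raw_message: bytes) -> dict:
    """Report byte offsets of NULs in the header and body of a message.

    Hypothetical helper. Assumes LF line endings and that the first
    blank line separates header from body; offsets are relative to the
    start of each part.
    """
    header, _sep, body = raw_message.partition(b"\n\n")
    return {
        "header": [i for i, byte in enumerate(header) if byte == 0],
        "body": [i for i, byte in enumerate(body) if byte == 0],
    }


raw = (
    b"Subject: testing NULs\n"
    b"To: testlist\n"
    b"\n"
    b"testing hello!\n"
    b"There is a null byte coming up ==>\x00<==\n"
    b"Did you see it?\n"
)
report = find_nuls(raw)
print(report)
```

Running this on the message as received, rather than as sent, would
show whether the NUL survived the trip.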

The archives are a different matter.  The first problem seems to be
that the indexes don't contain every message, although the mbox files
clearly do, and if you guess at the missing message's URL, you can
access it.

The individual message .html pages also seem complete, although what
you end up seeing depends on your browser.  Netscape appears to
truncate the <pre> tag contents (body of message) right at the NUL,
although it does start displaying again at the <hr> tag.

MSIE 5.5 doesn't display the NUL byte itself: the line shows up as
"==><==", and the rest of the message is displayed just fine.
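
Since browsers disagree about a raw NUL inside <pre>, one option for an
archiver would be to make the byte visible when generating the page.
A sketch of that idea in Python 3 follows; this is not what Pipermail
does, and render_body is a hypothetical name.

```python
import html


def render_body(body: str) -> str:
    """Wrap a message body in <pre>, showing each NUL as the text ^@.

    Hypothetical helper, not Pipermail code. The body is HTML-escaped
    first so that <, > and & in the message display literally.
    """
    escaped = html.escape(body, quote=False)
    return "<pre>" + escaped.replace("\x00", "^@") + "</pre>"


page = render_body("There is a null byte coming up ==>\x00<==\n")
print(page)
```

This only changes the HTML view; the mbox file would still hold the
original bytes.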

So I'd have to say at this point that if there's a bug anywhere, it's
in Pipermail's generation of the indexes.  Oddly enough, the subject
index seems fine, but thread, date, and author seem hosed.  I'll
submit a bug report on SF for this.

Other than that, I think Mailman is doing what it should be doing.

-Barry