[Mailman-Users] duplicate messages

Richard Barrett r.barrett at openinfo.demon.co.uk
Sun Apr 27 01:50:35 CEST 2003


At 18:55 26/04/2003, Brad Barnett wrote:
>On Sat, 26 Apr 2003 18:13:50 +0100
>Richard Barrett <r.barrett at openinfo.demon.co.uk> wrote:
>
> > At 16:05 26/04/2003, Brad Barnett wrote:
> > >On Sat, 26 Apr 2003 15:42:13 +0100
> > >Richard Barrett <r.barrett at openinfo.demon.co.uk> wrote:
> > >
> > > > At 15:22 26/04/2003, Brad Barnett wrote:
> > > >
> > > >
> > > > >Hey all,
> > > > >
> > > > >I've had a very bizarre situation occur with Mailman just last
> > > > >week. After one post to the list, Mailman started generating
> > > > >duplicate after duplicate of this same message.  It was caught in
> > > > >the process of sending the fourth message out.
> > > > >
> > > > >The list.mbox file shows 4 identical messages, even if using diff
> > > > >to check them against each other.  They all have the same message
> > > > >ID, the same send time, etc.  Each of these messages was sent to
> > > > >the over 10k users on this list, and I still don't know why.
> > > >
> > > > What version of Mailman are you using?
> > >
> > >
> > >2.0.13, on Redhat 8.0.
> > > >
> > > > Are you sure Mailman did  not have four copies of this message
> > > > delivered to it by your MTA?
> > > >
> > > > Have you checked your MTA's mail log to see if any trace of these
> > > > events are showing there?
> > >
> > >I've scoured the sendmail logs, but I can't see any indication of it.
> > >Of course, these would have to be _identical_ messages, since Mailman
> > >has four identical messages in its mbox.  Everything, even the receipt
> > >time is identical.
> > >

Just a guess but may be possible for sendmail to try and deliver the same 
message to MM multiple times. If it believes earlier attempts failed with 
some temporary error it might well retry the delivery to Mailman. Maybe 
load on the system caused sendmail to time out the delivery, and 
subsequently retry it, even though Mailman saw each attempt as being 
completed and dumped the message into its qfiles directory. My 
understanding is that sendmail sort of guarantees never to lose mail but 
may deliver it multiple times to achieve that. But I would expect something 
to show in the sendmail logs if delivery to MM failed for any reason.

> > >Is there any way to verify messages received / sent from the mailman
> > >logs?
> > >
> >
> > You should find an entry(ies) in MM's $prefix/logs/post and smtp logs as
> > it handles and sends out the message. The trouble is that both logs are
> > written as the qrunner takes the incoming message from the queue and
> > processes it rather than it being a log entry of when the message was
> > handed off from the MTA an put into the queue.
> >
>
>I was told that the post log doesn't directly reflect the posts to the
>list, but the messages it sends out?

Not quite. The post log is written as the message is taken for processing 
from the MM qfiles directory and before attempts are made to send it out to 
the list via SMTP. There may be a lag between when the mailman wrapper 
accepts delivery from the MTA and put the message into qfiles depending on 
the traffic being handled and system load.

>Apr 17 06:49:57 2003 (18609) post to ledetnews from sjledet at ledet.com,
>size=25203, 128 failures
>Apr 17 06:50:24 2003 (1724) post to ledetnews from sjledet at ledet.com,
>size=25203, 100 failures
>Apr 17 13:01:24 2003 (5671) post to ledetnews from
>ledetnews-request at lists.ledet.com, size=661, 1 failures
>Apr 17 13:03:30 2003 (5729) post to ledetnews from
>ledetnews-request at lists.ledet.com, size=670, 1 failures

If I recollect the number in brackets in each line of the post log entries, 
e.g. (18609), is the pid of the qrunner task handling the post. If I 
recollect this pid will also be in the MM $prefix/logs/smtp log entries as 
the posted message is sent out again to the MTA. The MM smtp log entries 
should show how many addresses the outgoing message was being sent to for 
that log entry.

The four log entries are intriguing. The first two are very close in time 
and the same size; same message ?? The second two seem to originate from 
one of the list's aliases and are being directed back to the list which 
seems a little odd.

>The weird thing is, the initial message was sent on the 15th.  Here it
>claims the 17th, and there are no other messages around this time (this
>list has one post or two a month).

This could be correct as the post log shows when the MM qrunner tried to 
start processing the message out to the list. When does sendmail claim it 
delivered it to MM; can you not match the message id in the sendmail log, 
presumable /var/log/mail?

>It also show failures, but it doesn't indicate if they were successful at
>all (there are 20k or so users on this list.. why doesn't it show how many
>succeeded?  why does it show two posts, when 4 were sent (or one if
>mailman duplicated the original message 4 times).  Two messages makes no
>sense at all, but one or four would....

I think MM' smtp log does show the number of outgoing addressees and with a 
large number of subscribers you may also see multiple log entries, with one 
for each batch handed off to the SMTP server; if I recollect the default is 
500 addresses per batch. But you may have tweaked that on your installation.

>I don't get it.  Looking at the SMTP logs is useless, with 80k entries for
>sends during the same time period.  All I can verify is that every user I
>pick on the list, was sent beween 3 and 4 copies of the same message ID
>with the same size (and users complained of such).  This makes sense since
>mailman was stopped during the 4th send process.

Do you mean the MM smtp log or the sendmail log when you refer to the 80k 
entries. I assume sendmail but have you set up you MM to send personalized 
mail in some fashion that MM is handing off one message at a time through 
SMTP.

> > If I remember correctly, with MM 2.0.13 no msg ids are not recorded in
> > those logs but usually the date/time helps correlate with the sendmail
> > logs, You could also see if anything correlates with anything in the MM
> > error log.
>
>Not really.. the times are off by days?!

Presumably you can see, in the sendmail log, the 'from' entry when the 
incoming message for the list is handed to sendmail and the 'to' entry when 
sendmail delivers the message to Mailman; with the matching message ids.

The times being off by days is maybe surprising. The sendmail and Mailman 
logs are being written on the same machine so presumably we can rule out 
problems with a mis-set clock. There may well be some lag between the time 
in the sendmail log that a message is handed to MM, when it will just go 
into qfiles, and the time in the MM post log when processing of the message 
from qfiles for distribution starts. That will all depend on the traffic 
the qrunner is having to handle.

> >
> > I cannot recollect experiencing any problems of this sort with using MM
> > 2.0.13 but I've long sinsce upgraded to MM 2.1.1.
> >
>
>I can't really justify an upgrade though, with the hope it fixes the
>problem.  This problem has to be debugged and verified before mailman is
>used again.  It may or may not be mailman's fault, but right now upgrading
>isn't an option, since we don't know where the bug sits.
>

I did not mean to suggest upgrading as a solution to this problem.

Let me know how you get on with this problem.

> > > >
> > > > >Mailman logging doesn't seem very helpful in determining this.  Is
> > > > >there any way I can discover just what happened, and is this a
> > > > >known bug in mailman?  I can provide any logging information
> > > > >needed...
> > > > >
> > > > >Thanks!
> > > >





More information about the Mailman-Users mailing list