[Mailman-Users] suppress duplicate when posting addressed to list and its alias name

Mark Sapiro mark at msapiro.net
Wed Nov 7 03:02:27 CET 2012


Sahil Tandon wrote:

>On Tue, 2012-11-06 at 11:26:40 -0800, Mark Sapiro wrote:
>
>> In your case, the input to the hash on which runners are sliced
>> includes all the message headers and the listname so it is likely that
>> the "equivalent but different" listname messages will be in different
>> slices of the hash space.
>> 
>> This is not a concern if IncomingRunner is not sliced. It is also not a
>> concern with a disk based cache as long as buffers are flushed after
>> writing because IncomingRunner locks the list whose message is being
>> processed which should prevent race conditions between different
>> slices of IncomingRunner.
>
>Then, would it make sense (or be overkill) to have the handler populate
>a dict of key, value = message-id, timestamp?  And, store that dict in a
>pickle whose filename is derived from mlist.internal_name()?
>
>Obviously, this would result in a lot of pickles that are constantly
>opened, edited (and, periodically cleansed), and closed.  Is the
>performance cost/benefit prohibitive?


Whether the cost is prohibitive depends on how many messages per
minute, hour, day, etc. you process through the list. I think it could
work. An in-memory dictionary would also work as long as you are
running with the default single qrunner per queue, except for the rare
case where the two duplicates are processed on opposite sides of a
restart.
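
To illustrate the in-memory variant, here is a rough sketch, not tested
inside Mailman. The names _seen, MAX_AGE and is_duplicate are
hypothetical, and the one-hour retention is only an assumed value. It
keeps a message-id -> timestamp dictionary in the runner process, so it
is lost on restart and is not shared between slices.

```python
import time

# Hypothetical names for illustration; none of this is Mailman API.
_seen = {}        # message-id -> time the id was first seen
MAX_AGE = 3600    # seconds to remember an id (assumed value)


def is_duplicate(msgid, now=None):
    """Return True if msgid was already seen within MAX_AGE seconds."""
    now = time.time() if now is None else now
    # Drop expired entries so the dictionary does not grow without bound.
    for key in [k for k, t in _seen.items() if now - t > MAX_AGE]:
        del _seen[key]
    if msgid in _seen:
        return True
    _seen[msgid] = now
    return False
```

A handler would call is_duplicate() with the Message-ID of the post and
discard the message when it returns True.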

As an implementation note, for the file name (path) derived from the
list's internal_name, I would just use a fixed file name, e.g.,
message-ids.pck, in the existing lists/internal_name()/ directory.
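
A minimal sketch of the per-list pickle idea follows, again untested in
Mailman itself. The function name seen_before, the list_dir argument
and the one-hour max_age are assumptions for illustration; in a real
handler list_dir would be the list's own directory (I believe
mlist.fullpath() returns it, but check your version). The file is
flushed and synced before being moved into place, and the sketch relies
on the list lock to keep two processes out of the same file.

```python
import os
import pickle
import time

MAX_AGE = 3600  # seconds to remember an id (assumed value)


def seen_before(list_dir, msgid, now=None, max_age=MAX_AGE):
    """Record msgid in list_dir/message-ids.pck; return True if a duplicate."""
    now = time.time() if now is None else now
    path = os.path.join(list_dir, 'message-ids.pck')
    try:
        with open(path, 'rb') as fp:
            cache = pickle.load(fp)
    except (OSError, EOFError, pickle.UnpicklingError):
        # No cache yet, or an unreadable one: start from scratch.
        cache = {}
    # Periodic cleansing: keep only entries newer than max_age.
    cache = {k: t for k, t in cache.items() if now - t <= max_age}
    duplicate = msgid in cache
    if not duplicate:
        cache[msgid] = now
    # Write to a temporary file, flush it to disk, then rename it over
    # the old cache so a reader never sees a half-written pickle.
    tmp = path + '.tmp'
    with open(tmp, 'wb') as fp:
        pickle.dump(cache, fp)
        fp.flush()
        os.fsync(fp.fileno())
    os.replace(tmp, path)
    return duplicate
```

Rewriting the whole pickle on every post is the cost being asked about;
with a short retention period the dictionary stays small, so it should
only matter on very busy lists.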


>I would also be relying on the
>fact that a handler is never concurrently called for the same list -- is
>that understanding accurate? -- which avoids the scenario in which we
>are trying to simultaneously manipulate the same pickle.


Yes, that is accurate. IncomingRunner locks the list before processing
the pipeline and doesn't unlock it until it's done, so processing of
the pipeline for a given message and list is complete before any other
runner can begin processing a message for that list.

-- 
Mark Sapiro <mark at msapiro.net>        The highway is for gamblers,
San Francisco Bay Area, California    better use your sense - B. Dylan


