From asmodai at in-nomine.org Tue Oct 4 15:31:42 2005 From: asmodai at in-nomine.org (Jeroen Ruigrok/asmodai) Date: Tue, 4 Oct 2005 15:31:42 +0200 Subject: [Mailman-Developers] Report 1235567 - apply patches please? Message-ID: <20051004133142.GM64400@nexus.ninth-circle.org> Can one of the developers PLEASE apply the patches in SF report 1235567? Having a report open for almost 3 months now WITH patches is not really appreciative of your community. =\ Thanks, -- Jeroen Ruigrok van der Werven / asmodai Free Tibet! http://www.savetibet.org/ | http://www.andf.info/ http://www.tendra.org/ | http://www.in-nomine.org/ | catcher at in-nomine.org When we blindly adopt a religion, a political system, a literary dogma, we become automatons. We cease to grow... From asmodai at in-nomine.org Tue Oct 4 15:34:41 2005 From: asmodai at in-nomine.org (Jeroen Ruigrok/asmodai) Date: Tue, 4 Oct 2005 15:34:41 +0200 Subject: [Mailman-Developers] Report 1235567 - apply patches please? In-Reply-To: <20051004133142.GM64400@nexus.ninth-circle.org> References: <20051004133142.GM64400@nexus.ninth-circle.org> Message-ID: <20051004133441.GN64400@nexus.ninth-circle.org> -On [20051004 15:31], Jeroen Ruigrok/asmodai (asmodai at in-nomine.org) wrote: >Can one of the developers PLEASE apply the patches in SF report 1235567? OK, pass the dunce cap please. 1) sf.net never sent me a response email. 2) I failed to see tkikuchi closed it. Blegh. Disregard, carry on. -- Jeroen Ruigrok van der Werven / asmodai Free Tibet! http://www.savetibet.org/ | http://www.andf.info/ http://www.tendra.org/ | http://www.in-nomine.org/ | catcher at in-nomine.org How many cares one loses when one decides not to be something but to be someone. 
From msapiro at value.net Tue Oct 4 18:37:42 2005 From: msapiro at value.net (Mark Sapiro) Date: Tue, 4 Oct 2005 09:37:42 -0700 Subject: [Mailman-Developers] Link to the "Reply-To munging considered useful" essay Message-ID: Some time ago, the link to the "Reply-To munging considered useful" essay changed from http://www.metasystema.org/essays/reply-to-useful.mhtml to http://www.metasystema.net/essays/reply-to.mhtml This was fixed by Barry in CVS for mailman/mailman/Mailman/Gui/General.py on July 1, 2005, but only in the details for reply_goes_to_list. I just noticed the link is also in the details for reply_to_address and needs to be fixed there too. -- Mark Sapiro The highway is for gamblers, San Francisco Bay Area, California better use your sense - B. Dylan From msapiro at value.net Tue Oct 4 19:22:45 2005 From: msapiro at value.net (Mark Sapiro) Date: Tue, 4 Oct 2005 10:22:45 -0700 Subject: [Mailman-Developers] Link to the "Reply-To munging considereduseful" essay In-Reply-To: Message-ID: Mark Sapiro wrote: > >This was fixed by Barry in CVS for >mailman/mailman/Mailman/Gui/General.py on July 1, 2005, but only in >the details for reply_goes_to_list. I just noticed the link is also in >the details for reply_to_address and needs to be fixed there too. Brad just reminded me that there is a FAQ article on this at http://www.python.org/cgi-bin/faqw-mm.py?req=show&file=faq03.048.htp which links to both the "Reply-To" Munging Considered Harmful and the Reply-To Munging Considered Useful articles as well as providing other information. Perhaps the details for reply_goes_to_list and reply_to_address should just link to the FAQ article as this is likely to be a stable link and is easier to maintain than links in the code. -- Mark Sapiro The highway is for gamblers, San Francisco Bay Area, California better use your sense - B. 
Dylan From brad at stop.mail-abuse.org Wed Oct 5 00:08:54 2005 From: brad at stop.mail-abuse.org (Brad Knowles) Date: Wed, 5 Oct 2005 00:08:54 +0200 Subject: [Mailman-Developers] Debugging lost messages? Message-ID: Folks, I've got another mailing list server installation (Mailman 2.1.5 and postfix 2.2-20040504), and I've just discovered that one of the lists has been broken for about a month and I'm having some problems figuring out how messages are being lost. The postfix logs are clearly showing the messages being sent to "post", but no evidence of them shows up anywhere in the Mailman logs. They're not in the post, error, or vette logs, and I can't figure out where else to be looking. I also can't figure out where I need to add some more debugging information to the log output. I'm stumped. Can someone give me some advice here? -- Brad Knowles, "Those who would give up essential Liberty, to purchase a little temporary Safety, deserve neither Liberty nor Safety." -- Benjamin Franklin (1706-1790), reply of the Pennsylvania Assembly to the Governor, November 11, 1755 SAGE member since 1995. See for more info. From msapiro at value.net Wed Oct 5 02:08:26 2005 From: msapiro at value.net (Mark Sapiro) Date: Tue, 4 Oct 2005 17:08:26 -0700 Subject: [Mailman-Developers] Debugging lost messages? In-Reply-To: Message-ID: Brad Knowles wrote: > > I've got another mailing list server installation (Mailman 2.1.5 >and postfix 2.2-20040504), and I've just discovered that one of the >lists has been broken for about a month and I'm having some problems >figuring out how messages are being lost. > > The postfix logs are clearly showing the messages being sent to >"post", but no evidence of them shows up anywhere in the Mailman >logs. They're not in the post, error, or vette logs, and I can't >figure out where else to be looking. I also can't figure out where I >need to add some more debugging information to the log output. > > > I'm stumped. Can someone give me some advice here? 
See FAQ article 3.14 (just kidding :-) Is archiving on? Are there members with delivery not disabled? If these are digest members is digestable 'yes'? Likewise, if there are regular members is nondigestable 'yes'? Since there are presumably other, working lists, much of the FAQ isn't relevant, but check for list locks in locks/ Also, if other lists are working, the wrapper and the scripts/post script are presumably working, at least assuming that the wrapper in the aliases pipe is the same one for this list as for the others that work. You could put a 'debug' in scripts/post to be sure, but all it does is put the message in qfiles/in. Check the queue or maybe all the queues, but when you say there's no evidence of the messages in Mailman, maybe you've already checked. Once the message gets to qfiles/in, processing continues with Mailman/Queue/IncomingRunner.py which basically directs the message through the handlers in Mailman/Handlers/ which are listed in the list's 'pipeline' attribute or the GLOBAL_PIPELINE. Does the list have a 'pipeline' attribute? If so, are certain critical delivery handlers such as 'ToDigest', 'ToArchive' and 'ToOutgoing' all there? Is there a lists//extend.py file? If so, what is it intended to do and does it work? It may be none of these things, but that's about all I can think of at the moment. If I couldn't find anything obvious in the above ideas, I'd try putting a debug logging statement conditional on mlist.internal_name() == listname in Mailman/Queue/IncomingRunner.py in the while loop in _dopipeline to see how far it gets. -- Mark Sapiro The highway is for gamblers, San Francisco Bay Area, California better use your sense - B. Dylan From brad at stop.mail-abuse.org Wed Oct 5 19:37:57 2005 From: brad at stop.mail-abuse.org (Brad Knowles) Date: Wed, 5 Oct 2005 19:37:57 +0200 Subject: [Mailman-Developers] Debugging lost messages? In-Reply-To: References: Message-ID: At 5:08 PM -0700 2005-10-04, Mark Sapiro wrote: > Is archiving on? 
Yes. There's nothing in the archives for this list since the 15th of September, and so far as I can tell nothing has been sent out to any of the list members since then. > Are there members with delivery not disabled? If these > are digest members is digestable 'yes'? Likewise, if there are regular > members is nondigestable 'yes'? There are ten members of the list, none of which are in digest mode, and none are disabled. > Since there are presumably other, working lists, much of the FAQ isn't > relevant, but check for list locks in locks/ Nope, no locks. Of course, I had stopped and restarted Mailman, and sent a test message through (which worked), before I checked to see if there were any locks. > Also, if other lists are working, the wrapper and the scripts/post > script are presumably working, at least assuming that the wrapper in > the aliases pipe is the same one for this list as for the others that > work. The other lists are working fine, so far as I can tell. > You could put a 'debug' in scripts/post to be sure, but all it does is > put the message in qfiles/in. Check the queue or maybe all the queues, > but when you say there's no evidence of the messages in Mailman, maybe > you've already checked. I had checked all the queues before stopping and restarting Mailman, and they were all empty. > Once the message gets to qfiles/in, processing continues with > Mailman/Queue/IncomingRunner.py which basically directs the message > through the handlers in Mailman/Handlers/ which are listed in the > list's 'pipeline' attribute or the GLOBAL_PIPELINE. Does the list have > a 'pipeline' attribute? If so, are certain critical delivery handlers > such as 'ToDigest', 'ToArchive' and 'ToOutgoing' all there? I did not set up any list-specific pipeline, no. > Is there a lists//extend.py file? If so, what is it intended > to do and does it work? No, there is no extend.py file for any of these lists. The installation is pretty plain-jane. 
About the only thing we modified was some of the templates for auto responses, telling the sender that their message was being held, etc....

> It may be none of these things, but that's about all I can think of at the moment.

I certainly couldn't think of anything else.

> If I couldn't find anything obvious in the above ideas, I'd try putting a debug logging statement conditional on mlist.internal_name() == listname in Mailman/Queue/IncomingRunner.py in the while loop in _dopipeline to see how far it gets.

I'll give that a shot. Thanks!

--
Brad Knowles,
"Those who would give up essential Liberty, to purchase a little temporary Safety, deserve neither Liberty nor Safety."
    -- Benjamin Franklin (1706-1790), reply of the Pennsylvania Assembly to the Governor, November 11, 1755
SAGE member since 1995. See for more info.

From brad at stop.mail-abuse.org Thu Oct 6 18:39:47 2005
From: brad at stop.mail-abuse.org (Brad Knowles)
Date: Thu, 6 Oct 2005 18:39:47 +0200
Subject: [Mailman-Developers] Debugging lost messages?
In-Reply-To:
References:
Message-ID:

At 5:08 PM -0700 2005-10-04, Mark Sapiro wrote:

> If I couldn't find anything obvious in the above ideas, I'd try putting a debug logging statement conditional on mlist.internal_name() == listname in Mailman/Queue/IncomingRunner.py in the while loop in _dopipeline to see how far it gets.

I added a bit more debugging, and the important section looks like this:

    except Errors.DiscardMessage:
        # Throw the message away; we need do nothing else with it.
        syslog('vette', 'Message discarded, listname: %s, msgid: %s',
               listname,
               msg.get('message-id', 'n/a'))
        return 0
    except Errors.HoldMessage:
        # Let the approval process take it from here. The message no
        # longer needs to be queued.
        return 0
    except Errors.RejectMessage, e:
        mlist.BounceMessage(msg, msgdata, e)
        syslog('vette', 'Message bounced, listname: %s, msgid: %s',
               listname,
               msg.get('message-id', 'n/a'))
        return 0

So, now I'm seeing the listname in conjunction with the "message discarded" log entries, and I'm getting a completely new "message bounced" log entry which wasn't there before at all.

But I'm not seeing any details as to why a message is being discarded or bouncing. I see two more messages that came in today to the list in question, and by matching message-ids and listnames between the Mailman "vette" log and the postfix syslog, I can tell that these two messages bounced. But not why. Maybe they were spam?

And I have yet to see any "normal" messages come into this list so far, so I can't tell what may have been happening to them and why they weren't being posted.

--
Brad Knowles,
"Those who would give up essential Liberty, to purchase a little temporary Safety, deserve neither Liberty nor Safety."
    -- Benjamin Franklin (1706-1790), reply of the Pennsylvania Assembly to the Governor, November 11, 1755
SAGE member since 1995. See for more info.

From barry at python.org Fri Oct 7 04:31:53 2005
From: barry at python.org (Barry Warsaw)
Date: Thu, 06 Oct 2005 22:31:53 -0400
Subject: [Mailman-Developers] Debugging lost messages?
In-Reply-To:
References:
Message-ID: <1128652313.28159.5.camel@geddy.wooz.org>

On Thu, 2005-10-06 at 12:39, Brad Knowles wrote:

> But I'm not seeing any details as to why a message is being discarded or bouncing. I see two more messages that came in today to the list in question, and by matching message-ids and listnames between the Mailman "vette" log and the postfix syslog, I can tell that these two messages bounced. But not why. Maybe they were spam?

Maybe they're triggering one of your content (or other) filters? Have you got this list set up to discard non-members or something else of that nature?
-Barry -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 307 bytes Desc: This is a digitally signed message part Url : http://mail.python.org/pipermail/mailman-developers/attachments/20051006/ed1184ef/attachment.pgp From brad at stop.mail-abuse.org Fri Oct 7 05:29:39 2005 From: brad at stop.mail-abuse.org (Brad Knowles) Date: Fri, 7 Oct 2005 05:29:39 +0200 Subject: [Mailman-Developers] Debugging lost messages? In-Reply-To: <1128652313.28159.5.camel@geddy.wooz.org> References: <1128652313.28159.5.camel@geddy.wooz.org> Message-ID: At 10:31 PM -0400 2005-10-06, Barry Warsaw wrote: > Maybe they're triggering one of your content (or other) filters? Have > you got this list set up to discard non-members or something else of > that nature? I thought of those. All non-member postings are supposed to be held for moderation, and there's nothing beyond the standard out-of-the-box "legacy" anti-spam filters in place for this list. I've got a lot of anti-spam filters in SpamAssassin that might also be causing these types of postings to be lost (reports from BitKeeper, which tend to run afoul of the "Chickenpox" rules), but then they shouldn't show up in the postfix log as having been delivered to the Mailman "post" process, and they certainly wouldn't trip any of the "legacy" spam filters. This is what I find so frustrating about debugging this particular process. As far as Mailman is concerned, you really can't get too much more plain-jane than what we're running. And yet, stuff has clearly been broken for about a month now, and I can't figure out why. But if you've got ideas on some places where I can put in some further debugging to see what's going on and why, I'll be glad to give that a shot. -- Brad Knowles, "Those who would give up essential Liberty, to purchase a little temporary Safety, deserve neither Liberty nor Safety." 
-- Benjamin Franklin (1706-1790), reply of the Pennsylvania Assembly to the Governor, November 11, 1755 SAGE member since 1995. See for more info. From j.e.vanbaal at uvt.nl Fri Oct 7 11:31:27 2005 From: j.e.vanbaal at uvt.nl (Joost van Baal) Date: Fri, 7 Oct 2005 11:31:27 +0200 Subject: [Mailman-Developers] Mailman and S/MIME: OpenSSL licensing vs GPGME Message-ID: <20051007093127.GB768@banach.uvt.nl> Hi, I am working on integrating PGP and S/MIME with Mailman (see http://non-gnu.uvt.nl/pub/mailman/ and my previous post http://mail.python.org/pipermail/mailman-developers/2005-March/017974.html). The PGP stuff works, I am now working with pyme (http://pyme.sourceforge.net/) and GPGME to get S/MIME stuff done. I've found out the hard way that GPGME is pretty rough on the edges and am considering moving to OpenSSL (e.g. using pyOpenSSL or M2Crypto). I would really like to get my patch used by a lot of people, and it would really rock if one day the patch could get shipped with the upstream Mailman distribution. Now, would using a Python OpenSSL library diminish my chances, e.g. because of licensing issues? And should I therefore stick with GPGME? Thanks for any insight. Bye, Joost -- Joost van Baal http://abramowitz.uvt.nl/ Tilburg University j.e.vanbaal at uvt.nl The Netherlands -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 197 bytes Desc: Digital signature Url : http://mail.python.org/pipermail/mailman-developers/attachments/20051007/ece35e53/attachment.pgp From brad at stop.mail-abuse.org Mon Oct 10 11:18:00 2005 From: brad at stop.mail-abuse.org (Brad Knowles) Date: Mon, 10 Oct 2005 11:18:00 +0200 Subject: [Mailman-Developers] Debugging lost messages? In-Reply-To: References: <1128652313.28159.5.camel@geddy.wooz.org> Message-ID: At 5:29 AM +0200 2005-10-07, Brad Knowles wrote: > This is what I find so frustrating about debugging this particular > process. 
> As far as Mailman is concerned, you really can't get too much more plain-jane than what we're running. And yet, stuff has clearly been broken for about a month now, and I can't figure out why.

One thing I've found particularly frustrating is that I can now see messages being bounced which should not be (thanks to the additional debugging that I've put into place), but I can't figure out which handler is causing the messages to be bounced.

The code in IncomingRunner.py currently looks like this:

    except Errors.RejectMessage, e:
        mlist.BounceMessage(msg, msgdata, e)
        syslog('vette', 'Message bounced, listname: %s, msgid: %s',
               listname,
               msg.get('message-id', 'n/a'))
        return 0

What I'd like to do is add the name of the handler somewhere in that line, but I'm not sure how to do that. I'm going to read up on programming in Python, but any advice or assistance that anyone can provide would be appreciated.

Once I can track down the offending handler, I can put in some more debugging code into that particular routine, and try to get a better idea of why those messages are being inappropriately bounced.

--
Brad Knowles,
"Those who would give up essential Liberty, to purchase a little temporary Safety, deserve neither Liberty nor Safety."
    -- Benjamin Franklin (1706-1790), reply of the Pennsylvania Assembly to the Governor, November 11, 1755
SAGE member since 1995. See for more info.

From brad at stop.mail-abuse.org Mon Oct 10 11:51:25 2005
From: brad at stop.mail-abuse.org (Brad Knowles)
Date: Mon, 10 Oct 2005 11:51:25 +0200
Subject: [Mailman-Developers] Debugging lost messages?
In-Reply-To:
References: <1128652313.28159.5.camel@geddy.wooz.org>
Message-ID:

At 11:18 AM +0200 2005-10-10, Brad Knowles wrote:

> What I'd like to do is add the name of the handler somewhere in that line, but I'm not sure how to do that. I'm going to read up on programming in Python, but any advice or assistance that anyone can provide would be appreciated.

Blargh.
After all that Googling and reading, what I should have done was to look earlier in that same routine. The code was already there. Sigh.... Anyway, I should now have the "modname" being printed in the "vette" log, so that I should be able to figure out which handler is causing the inappropriate bounces. I'll let you know when I find out more. -- Brad Knowles, "Those who would give up essential Liberty, to purchase a little temporary Safety, deserve neither Liberty nor Safety." -- Benjamin Franklin (1706-1790), reply of the Pennsylvania Assembly to the Governor, November 11, 1755 SAGE member since 1995. See for more info. From brad at stop.mail-abuse.org Mon Oct 10 19:09:43 2005 From: brad at stop.mail-abuse.org (Brad Knowles) Date: Mon, 10 Oct 2005 19:09:43 +0200 Subject: [Mailman-Developers] Debugging lost messages? In-Reply-To: References: <1128652313.28159.5.camel@geddy.wooz.org> Message-ID: At 11:51 AM +0200 2005-10-10, Brad Knowles wrote: > Anyway, I should now have the "modname" being printed in the > "vette" log, so that I should be able to figure out which handler is > causing the inappropriate bounces. I'll let you know when I find out > more. Okay, I think I found the offending module. Hold.py will syslog to "vette", if the message is being held. But Moderate.py will not syslog anything -- it passes a held message to Hold.py, but handles rejections and discards itself. Moreover, Moderate.py uses two different methods of handling rejections and discards -- subscribers are handled in-line, while non-subscribers are handled through the do_reject() and do_discard() subroutines. And that's the only place the do_reject() and do_discard() subroutines are used. So, I can't even just drop in a bit of logging in the do_reject() and do_discard() subroutines, since they aren't both used for subscribers and non-subscribers alike. Sigh.... 
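One way to picture the problem described above: if every reject or discard, for subscribers and non-subscribers alike, went through the same two helpers, a single logging call in each helper would cover every path. The following is a minimal, self-contained sketch of that idea, written in current Python rather than the Python 2 that Mailman 2.1 uses. The exception classes, the vette_log list, and the moderate() function are hypothetical stand-ins invented for illustration; this is not the code in Moderate.py.

```python
# Hypothetical sketch only: none of these names come from Mailman itself.

class RejectMessage(Exception):
    """Stand-in for Mailman's Errors.RejectMessage."""


class DiscardMessage(Exception):
    """Stand-in for Mailman's Errors.DiscardMessage."""


vette_log = []  # stand-in for lines written by syslog('vette', ...)


def log_vette(action, listname, sender, msgid, reason):
    """Record one line describing what happened to a message and why."""
    vette_log.append(
        'Message %s, listname: %s, sender: %s, msgid: %s, reason: %s'
        % (action, listname, sender, msgid, reason))


def do_reject(listname, sender, msgid, reason):
    """Log the rejection, then signal it to the caller."""
    log_vette('rejected', listname, sender, msgid, reason)
    raise RejectMessage(reason)


def do_discard(listname, sender, msgid, reason):
    """Log the discard, then signal it to the caller."""
    log_vette('discarded', listname, sender, msgid, reason)
    raise DiscardMessage(reason)


def moderate(listname, sender, msgid, is_member, action):
    """Apply a moderation action; members and non-members share the helpers."""
    who = 'moderated member' if is_member else 'non-member'
    if action == 'reject':
        do_reject(listname, sender, msgid, '%s, action is reject' % who)
    elif action == 'discard':
        do_discard(listname, sender, msgid, '%s, action is discard' % who)
    # Any other action falls through and the message carries on unchanged.
```

Because the logging lives inside do_reject() and do_discard(), the caller cannot reject or discard a message without leaving a log line that names the list, the sender, and the reason.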
Can a real Python programmer suggest some changes that would create syslog messages for rejects and discards for both subscribers and non-subscribers alike, and maybe re-factor the code to re-use the do_discard() and do_reject() subroutines, or do I need to try to fumble around and fix these myself?

--
Brad Knowles,
"Those who would give up essential Liberty, to purchase a little temporary Safety, deserve neither Liberty nor Safety."
    -- Benjamin Franklin (1706-1790), reply of the Pennsylvania Assembly to the Governor, November 11, 1755
SAGE member since 1995. See for more info.

From msapiro at value.net Mon Oct 10 21:32:05 2005
From: msapiro at value.net (Mark Sapiro)
Date: Mon, 10 Oct 2005 12:32:05 -0700
Subject: [Mailman-Developers] Debugging lost messages?
In-Reply-To:
Message-ID:

Brad Knowles wrote:

> Okay, I think I found the offending module. Hold.py will syslog to "vette", if the message is being held. But Moderate.py will not syslog anything -- it passes a held message to Hold.py, but handles rejections and discards itself.
>
> Moreover, Moderate.py uses two different methods of handling rejections and discards -- subscribers are handled in-line, while non-subscribers are handled through the do_reject() and do_discard() subroutines. And that's the only place the do_reject() and do_discard() subroutines are used.

In a prior post, you indicate that IncomingRunner was detecting a RejectMessage exception. You wrote:

> (Your modified) code in IncomingRunner.py currently looks like this:
>
>     except Errors.RejectMessage, e:
>         mlist.BounceMessage(msg, msgdata, e)
>         syslog('vette', 'Message bounced, listname: %s, msgid: %s',
>                listname,
>                msg.get('message-id', 'n/a'))
>         return 0

and presumably you were seeing that log message. Thus, we know it is a 'reject' and not a 'discard'.

It looks like there are only 3 paths through Moderate.py that result in a reject. These are:

Post is from a moderated member and the list's member_moderation_action is reject.

Post is from a non-member in reject_these_nonmembers and not in accept or hold _these_nonmembers.

Post is from a non-member not in *_these_nonmembers and generic_nonmember_action is reject.

There are different values for the error message, e, that can distinguish the first case from the second two, but if this isn't enough, I would add some information to the logging statement above. For example:

    syslog('vette', 'Message bounced, listname: %s, \
    msgid: %s Subject: %s, Sender: %s, Error: %s',
           listname,
           msg.get('message-id', 'n/a'),
           msg.get('subject', 'no subject'),
           msg.get_sender(),
           e)

Also, in all these cases mlist.BounceMessage(msg, msgdata, e) should be attempting to send a reject message to the poster. There should at least be something in the smtp* logs about this.

--
Mark Sapiro                           The highway is for gamblers,
San Francisco Bay Area, California    better use your sense - B. Dylan

From brad at stop.mail-abuse.org Tue Oct 11 01:22:05 2005
From: brad at stop.mail-abuse.org (Brad Knowles)
Date: Tue, 11 Oct 2005 01:22:05 +0200
Subject: [Mailman-Developers] Debugging lost messages?
In-Reply-To:
References:
Message-ID:

At 12:32 PM -0700 2005-10-10, Mark Sapiro wrote:

> In a prior post, you indicate that IncomingRunner was detecting a RejectMessage exception.

Correct.

> and presumably you were seeing that log message. Thus, we know it is a 'reject' and not a 'discard'.

I am seeing some discards as well, but I've only seen them a couple of times, and I only saw them at the very beginning of this debugging process -- before I had added the additional information to the syslog output. I have seen other discards since, but only those two for that list.

> Post is from a moderated member and the list's member_moderation_action is reject.

I confirmed that the list's member_moderation_action was to hold.

> Post is from a non-member in reject_these_nonmembers and not in accept or hold _these_nonmembers.
There are only a couple of non-members listed in reject_these_nonmembers, and I don't think that either of them were involved.

> Post is from a non-member not in *_these_nonmembers and generic_nonmember_action is reject.

For reasons I cannot comprehend, the generic_nonmember_action was indeed set to reject, which it should not have been. I know that it used to be set to hold, because I was getting lots of notices about this message from a non-member being held, or that message, etc.... As they came up, I added them to the whitelist, so they wouldn't have to go through the moderation process again.

I know I didn't change the setting for this list, so one of the other people on the project (and with access to the site admin password) must have.

> There are different values for the error message, e, that can distinguish the first case from the second two, but if this isn't enough, I would add some information to the logging statement above.

When I was trying to print the error message "e", I was getting errors sent to the console about bad formatting or somesuch. I had to remove that in order to get Mailman to run correctly.

> For example:
>
>     syslog('vette', 'Message bounced, listname: %s, \
>     msgid: %s Subject: %s, Sender: %s, Error: %s',
>            listname,
>            msg.get('message-id', 'n/a'),
>            msg.get('subject', 'no subject'),
>            msg.get_sender(),
>            e)

This is definitely the sort of error message that I'd like to see get logged in Moderate.py. Only I'd like to see this get logged for all three code paths, and without having to have the same code replicated three times.

At the very least, if someone changes the generic_nonmember_action on me again, with this much information being syslogged, I should hopefully be able to detect this situation much earlier and be able to correct it much faster.

> Also, in all these cases mlist.BounceMessage(msg, msgdata, e) should be attempting to send a reject message to the poster.
There should at > least be something in the smtp* logs about this. Ahh, that's another good idea. Thanks again! -- Brad Knowles, "Those who would give up essential Liberty, to purchase a little temporary Safety, deserve neither Liberty nor Safety." -- Benjamin Franklin (1706-1790), reply of the Pennsylvania Assembly to the Governor, November 11, 1755 SAGE member since 1995. See for more info. From barry at python.org Wed Oct 12 19:16:22 2005 From: barry at python.org (Barry Warsaw) Date: Wed, 12 Oct 2005 13:16:22 -0400 Subject: [Mailman-Developers] The Mailman FAQ wizard (temporarily unavailable) Message-ID: <1129137382.25721.9.camel@geddy.wooz.org> The Python.org administrators have moved the website to our spiffy new server, however the Mailman FAQ wizard hasn't yet been moved. It should be shortly. Please try not to make any changes to the FAQ wizard for the next day or so. Currently, until your DNS updates, you'll be hitting the wizard on the old machine -- at least until it gets disabled. When the database gets migrated, your changes may get lost. Then you may not have access to the new one until your DNS gets updated. I'll send out another message when the FAQ wizard has been successfully migrated. -Barry -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 307 bytes Desc: This is a digitally signed message part Url : http://mail.python.org/pipermail/mailman-developers/attachments/20051012/b1bb0cd8/attachment.pgp From brad at python.org Wed Oct 12 20:05:46 2005 From: brad at python.org (Brad Knowles) Date: Wed, 12 Oct 2005 20:05:46 +0200 Subject: [Mailman-Developers] Mailman FAQ Wizard going down temporarily... Message-ID: Folks, We're in the process of switching www.python.org to a new machine, and it may take a while for the DNS changes to propagate. 
In the meanwhile, we're disallowing updates to the Mailman FAQ Wizard on the old machine, so that we can make sure we've got the absolute latest material on the new system. Please bear with us. I believe that the new machine will be a huge improvement over the old creaky system, but it may take us a day or two to get all the bugs worked out. If you run into any unusual problems, please let us know. -- Brad Knowles Python.org Postmaster Team From barry at python.org Wed Oct 12 20:08:36 2005 From: barry at python.org (Barry Warsaw) Date: Wed, 12 Oct 2005 14:08:36 -0400 Subject: [Mailman-Developers] The Mailman FAQ wizard (temporarily unavailable) In-Reply-To: <1129137382.25721.9.camel@geddy.wooz.org> References: <1129137382.25721.9.camel@geddy.wooz.org> Message-ID: <1129140516.8399.3.camel@geddy.wooz.org> On Wed, 2005-10-12 at 13:16, Barry Warsaw wrote: > The Python.org administrators have moved the website to our spiffy new > server, however the Mailman FAQ wizard hasn't yet been moved. It should > be shortly. > > Please try not to make any changes to the FAQ wizard for the next day or > so. Currently, until your DNS updates, you'll be hitting the wizard on > the old machine -- at least until it gets disabled. When the database > gets migrated, your changes may get lost. Then you may not have access > to the new one until your DNS gets updated. > > I'll send out another message when the FAQ wizard has been successfully > migrated. Okay, if you get a permission denied, you're talking to the old server. Once your DNS gets updated, you'll get the new server on which the FAQ wizard is now enabled. Please feel free to use it as before. Thanks, -Barry -------------- next part -------------- A non-text attachment was scrubbed... 
Name: not available Type: application/pgp-signature Size: 307 bytes Desc: This is a digitally signed message part Url : http://mail.python.org/pipermail/mailman-developers/attachments/20051012/da2ba8c0/attachment.pgp
From jdennis at redhat.com Wed Oct 12 20:22:01 2005 From: jdennis at redhat.com (John Dennis) Date: Wed, 12 Oct 2005 14:22:01 -0400 Subject: [Mailman-Developers] structural problem with MemberAdapter In-Reply-To: <1127602236.14497.31.camel@geddy.wooz.org> References: <1127340469.1827.97.camel@finch.boston.redhat.com> <1127602236.14497.31.camel@geddy.wooz.org> Message-ID: <1129141321.2786.50.camel@finch.boston.redhat.com> Thank you Barry for the thoughtful response. Sorry for the delayed response on my end. On Sat, 2005-09-24 at 18:50 -0400, Barry Warsaw wrote: > > [ implementation issues snipped for brevity ] > > Now after having fully implemented the adapter interface I have to admit > > I don't really understand what its buying you over the existing > > OldMemberAdapter. My initial thought was to capitalize on existing user > > information at a site, but given the way mailman data is structured (a > > set of lists, each list may contain both local and unknown foreign > > users, and user properties are per list) then there seems to be little > > value in intermingling site user data and mailman list data.
> > > > Also, it was not clear how an adapter might implement just a subset of > > the methods via inheritance, I suppose it would copy the function > > pointers from the mlist._memberadapter into its own methods before > > resetting mlist._memberadapter to itself. yes/no? > > I'm not sure I understand exactly what you're trying to do here. The > intention was that using a different adapter was an all-or-nothing > proposition. I.e. if you were going to use an LDAP or MySQL adapter, > then you wouldn't use any of the config.pck based OldMemberAdapter. I > have my doubts that Mailman would be able to intermix the two. Here was my thinking: Without adding new list creation/deletion hooks previously discussed above but elided for brevity alternate MemberAdapters are of marginal value. After fully implementing the MemberAdapter interface with an alternate backend I don't see any particular advantage over the existing python pickle model. The seductive enticement of a MemberAdapter would seem to be for it to integrate with an organizations existing user database. But because mailing list members can be both local (member of the organization) or foreign (external to the organization) a MemberAdapter whose view is only of local users is restrictive. This is why I thought perhaps the per list extend.py was invented, because certain lists might be "local" only. MemberAdaptor wants to organize users as email addresses in a list. As far as I can tell the same user can belong to several lists, the user information is unique to the list, its not shared between lists (i.e. a list member does not point to a member record). This model also permits the same user having unique preferences per list and to use different email addresses per list. The data organization tends to be opposite of how organizations model their user data whereby one record exists per user sharing as much common data about that user as is possible. 
Trying to map one data model onto the other rapidly becomes awkward and does not address the issue of foreign list members. What did seem to make sense to me was a hybrid model where some per-user data (e.g. fullname, password, email address) and list membership is kept in the organization's database, but Mailman-specific list information (e.g. bounce info, digest, flags, etc.) is kept in Mailman's database (e.g. pickle). The interface to MemberAdaptor would be an object inheritance model: the MemberAdaptor is a subclass of OldMembershipAdaptor. This would allow one to pick which methods to override and then delegate to the superclass those it did not want to handle. For example, a subclassed MemberAdaptor might choose to implement getMemberName(), authenticateMember(), etc. but leave handling of bounce info, Mailman flags, etc. to the parent adaptor. I did make a go at implementing this "inherited model" and it seems to work just fine (albeit with very limited testing). What I did was to copy the function pointers from the existing member adaptor when my adaptor was instantiated. For any function I wanted to delegate to the parent I just called the saved function pointer, otherwise I called my own routine. This was a nice mix-and-match solution. However, for what it's worth, after having done all this work (a full LDAP member adaptor, and then the subclassed version) we've decided to take an entirely different direction and we no longer have need for these pieces of code. I would like to contribute what I've done as a patch on the SF site. The only problem is it's a 95% solution: the list creation/deletion hooks are the missing 5% and I doubt I'll finish that work. I wonder if 95% is useful to people.
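
[Editorial sketch] The hybrid model described above might look roughly like the following. This is an illustration under stated assumptions, not the code from the patch mentioned in the message: PickleBackedAdaptor, DirectoryAdaptor and the directory dict are hypothetical stand-ins (the real parent would be Mailman's existing pickle-based adaptor), the method signatures are simplified, and only the method names come from the discussion.

```python
class PickleBackedAdaptor:
    """Hypothetical stand-in for the existing pickle-based member adaptor."""

    def __init__(self):
        self._names = {}
        self._passwords = {}
        self._bounce_info = {}

    def getMemberName(self, member):
        return self._names.get(member)

    def authenticateMember(self, member, response):
        # Simplified: returns True/False rather than Mailman's actual value.
        return member in self._passwords and self._passwords[member] == response

    def getBounceInfo(self, member):
        return self._bounce_info.get(member)

    def setBounceInfo(self, member, info):
        self._bounce_info[member] = info


class DirectoryAdaptor(PickleBackedAdaptor):
    """Overrides name and password lookups; everything else is inherited."""

    def __init__(self, directory):
        PickleBackedAdaptor.__init__(self)
        # Hypothetical organization database: {email: (fullname, password)}
        self._directory = directory

    def getMemberName(self, member):
        entry = self._directory.get(member)
        if entry is None:
            # Foreign member: fall back to the parent's own records.
            return PickleBackedAdaptor.getMemberName(self, member)
        return entry[0]

    def authenticateMember(self, member, response):
        entry = self._directory.get(member)
        if entry is None:
            return PickleBackedAdaptor.authenticateMember(self, member, response)
        return entry[1] == response
```

Bounce handling is never mentioned in DirectoryAdaptor, so getBounceInfo() and setBounceInfo() go straight to the parent, which is the mix-and-match behaviour the message describes.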
-- John Dennis From brad at stop.mail-abuse.org Thu Oct 13 12:08:49 2005 From: brad at stop.mail-abuse.org (Brad Knowles) Date: Thu, 13 Oct 2005 12:08:49 +0200 Subject: [Mailman-Developers] [Mailman-Users] including other fields in Mailman membership info In-Reply-To: <20051013022102.GC25295@shogun2.Stanford.EDU> References: <20051013022102.GC25295@shogun2.Stanford.EDU> Message-ID: At 7:21 PM -0700 2005-10-12, Jon Dugan wrote: > Is there a way to get mailman to have more than just the name and > email fields? While I recognize this would complicate things -- I'd > really like to leverage the great interface, password mailing, > confirmation -- etc. all in a broader membership database. Mailman does not have a membership database. There are unsupported third-party patches to provide a database "member adapter", but that's not the same thing. > Separate from the above, is there a way to make a call within "Full > Personalization" to extract a custom piece of data to place into an > email? > > For example, if I do have to store the comapny name for each member in > an external database, is there a way to get it into each email when > sending? See . -- Brad Knowles, "Those who would give up essential Liberty, to purchase a little temporary Safety, deserve neither Liberty nor Safety." -- Benjamin Franklin (1706-1790), reply of the Pennsylvania Assembly to the Governor, November 11, 1755 SAGE member since 1995. See for more info. From jag at fsf.org Fri Oct 14 18:25:25 2005 From: jag at fsf.org (Joshua Ginsberg) Date: Fri, 14 Oct 2005 12:25:25 -0400 Subject: [Mailman-Developers] XMLRPC Patch Message-ID: <1129307125.6026.10.camel@localhost.localdomain> I'm developing a patch to add an XMLRPC-based management interface to Mailman. Would this be something that you would be interested in trying to incorporate in the 2.1.x branch? Thanks! 
-jag -- Joshua Ginsberg Free Software Foundation - Senior Systems Administrator -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 189 bytes Desc: This is a digitally signed message part Url : http://mail.python.org/pipermail/mailman-developers/attachments/20051014/eed8fdc8/attachment.pgp From dragonstrider at gmail.com Fri Oct 14 21:56:53 2005 From: dragonstrider at gmail.com (Joseph Tate) Date: Fri, 14 Oct 2005 15:56:53 -0400 Subject: [Mailman-Developers] XMLRPC Patch In-Reply-To: <1129307125.6026.10.camel@localhost.localdomain> References: <1129307125.6026.10.camel@localhost.localdomain> Message-ID: Before you get too far down this road, I'd suggest looking at bug #1244799 (http://sourceforge.net/tracker/index.php?func=detail&aid=1244799&group_id=103&atid=300103). This was created for 2.1.6. Not all the functionality is there, but it should be enough to do much of what you can do from the command line or CGI interfaces. On 10/14/05, Joshua Ginsberg wrote: > I'm developing a patch to add an XMLRPC-based management interface to > Mailman. Would this be something that you would be interested in trying > to incorporate in the 2.1.x branch? Thanks! 
> > -jag > > -- > Joshua Ginsberg > Free Software Foundation - Senior Systems Administrator > > > -----BEGIN PGP SIGNATURE----- > Version: GnuPG v1.4.1 (GNU/Linux) > > iD8DBQBDT9v1Z86JTdscITQRAvjcAJ9ySoi+HuRJh+7He23MioOOKVELNACfWODU > me+b/7jZudTLY09llKhKukY= > =Q79X > -----END PGP SIGNATURE----- > > > _______________________________________________ > Mailman-Developers mailing list > Mailman-Developers at python.org > http://mail.python.org/mailman/listinfo/mailman-developers > Mailman FAQ: http://www.python.org/cgi-bin/faqw-mm.py > Searchable Archives: http://www.mail-archive.com/mailman-users%40python.org/ > Unsubscribe: http://mail.python.org/mailman/options/mailman-developers/dragonstrider%40gmail.com > > Security Policy: http://www.python.org/cgi-bin/faqw-mm.py?req=show&file=faq01.027.htp > > -- Joseph Tate Personal e-mail: jtate AT dragonstrider DOT com Web: http://www.dragonstrider.com From jag at fsf.org Fri Oct 14 22:21:51 2005 From: jag at fsf.org (Joshua Ginsberg) Date: Fri, 14 Oct 2005 16:21:51 -0400 Subject: [Mailman-Developers] XMLRPC Patch In-Reply-To: References: <1129307125.6026.10.camel@localhost.localdomain> Message-ID: <1129321311.12963.19.camel@localhost.localdomain> Sonfabitch. :-) I'll attach my patch for comparison's sake. -jag On Fri, 2005-10-14 at 15:56 -0400, Joseph Tate wrote: > Before you get too far down this road, I'd suggest looking at bug > #1244799 (http://sourceforge.net/tracker/index.php?func=detail&aid=1244799&group_id=103&atid=300103). > > This was created for 2.1.6. Not all the functionality is there, but > it should be enough to do much of what you can do from the command > line or CGI interfaces. > > On 10/14/05, Joshua Ginsberg wrote: > > I'm developing a patch to add an XMLRPC-based management interface to > > Mailman. Would this be something that you would be interested in trying > > to incorporate in the 2.1.x branch? Thanks! 
> > > > -jag > > > > -- > > Joshua Ginsberg > > Free Software Foundation - Senior Systems Administrator > > > > > > -----BEGIN PGP SIGNATURE----- > > Version: GnuPG v1.4.1 (GNU/Linux) > > > > iD8DBQBDT9v1Z86JTdscITQRAvjcAJ9ySoi+HuRJh+7He23MioOOKVELNACfWODU > > me+b/7jZudTLY09llKhKukY= > > =Q79X > > -----END PGP SIGNATURE----- > > > > > > _______________________________________________ > > Mailman-Developers mailing list > > Mailman-Developers at python.org > > http://mail.python.org/mailman/listinfo/mailman-developers > > Mailman FAQ: http://www.python.org/cgi-bin/faqw-mm.py > > Searchable Archives: http://www.mail-archive.com/mailman-users%40python.org/ > > Unsubscribe: http://mail.python.org/mailman/options/mailman-developers/dragonstrider%40gmail.com > > > > Security Policy: http://www.python.org/cgi-bin/faqw-mm.py?req=show&file=faq01.027.htp > > > > > > > -- > Joseph Tate > Personal e-mail: jtate AT dragonstrider DOT com > Web: http://www.dragonstrider.com > -- Joshua Ginsberg Free Software Foundation - Senior Systems Administrator -------------- next part -------------- A non-text attachment was scrubbed... Name: mailman-xmlrpc.20051014.diff Type: text/x-patch Size: 15777 bytes Desc: not available Url : http://mail.python.org/pipermail/mailman-developers/attachments/20051014/f02740c4/mailman-xmlrpc.20051014-0001.bin -------------- next part -------------- A non-text attachment was scrubbed... 
Name: not available Type: application/pgp-signature Size: 189 bytes Desc: This is a digitally signed message part Url : http://mail.python.org/pipermail/mailman-developers/attachments/20051014/f02740c4/attachment-0001.pgp From jag at fsf.org Fri Oct 14 22:26:06 2005 From: jag at fsf.org (Joshua Ginsberg) Date: Fri, 14 Oct 2005 16:26:06 -0400 Subject: [Mailman-Developers] XMLRPC Patch In-Reply-To: <1129321311.12963.19.camel@localhost.localdomain> References: <1129307125.6026.10.camel@localhost.localdomain> <1129321311.12963.19.camel@localhost.localdomain> Message-ID: <1129321566.12963.21.camel@localhost.localdomain> Dammit -- wrong version. :-) Sorry. Try this one. -jag -- Joshua Ginsberg Free Software Foundation - Senior Systems Administrator -------------- next part -------------- A non-text attachment was scrubbed... Name: mailman-xmlrpc.20051014.diff Type: text/x-patch Size: 14863 bytes Desc: not available Url : http://mail.python.org/pipermail/mailman-developers/attachments/20051014/33f31665/mailman-xmlrpc.20051014.bin -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 189 bytes Desc: This is a digitally signed message part Url : http://mail.python.org/pipermail/mailman-developers/attachments/20051014/33f31665/attachment.pgp From barry at python.org Sat Oct 15 02:12:34 2005 From: barry at python.org (Barry Warsaw) Date: Fri, 14 Oct 2005 20:12:34 -0400 Subject: [Mailman-Developers] XMLRPC Patch In-Reply-To: <1129307125.6026.10.camel@localhost.localdomain> References: <1129307125.6026.10.camel@localhost.localdomain> Message-ID: <1129335154.32365.56.camel@geddy.wooz.org> On Fri, 2005-10-14 at 12:25, Joshua Ginsberg wrote: > I'm developing a patch to add an XMLRPC-based management interface to > Mailman. Would this be something that you would be interested in trying > to incorporate in the 2.1.x branch? Thanks! 
Josh, thanks very much for this patch (and thanks for Joseph's previous patch). Now that Tokio has whipped the trunk into shape, I think this is something that should be slated for Mailman 2.2 instead of 2.1.x. Since there is more than one XMLRPC patch out there, I'd like for there to be some convergence before we apply the patches. Would you and/or Joseph be able to review the patches and lead an effort to provide a single patch against the trunk? It would also be great if you could add documentation to the existing texinfo files. If anybody else is interested in XMLRPC interface to Mailman, this is the place to discuss it! I may not pay strict attention to the thread until there's consensus in the community about what you'd like to see. But I'll try to answer any questions that come up. Cheers, -Barry -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 307 bytes Desc: This is a digitally signed message part Url : http://mail.python.org/pipermail/mailman-developers/attachments/20051014/9fde5b60/attachment.pgp From mburton at jo.birdsense.com Sun Oct 16 14:07:15 2005 From: mburton at jo.birdsense.com (Mike Burton) Date: Sun, 16 Oct 2005 05:07:15 -0700 Subject: [Mailman-Developers] MM3? Message-ID: <1129464435.6891.12.camel@jo.birdsense.com> Hi folks, I have been watching with anticipation, but it's been a while since I have heard anything on MM3. Any noteworthy updates? Take care, Mike From barry at python.org Sat Oct 22 19:56:29 2005 From: barry at python.org (Barry Warsaw) Date: Sat, 22 Oct 2005 13:56:29 -0400 Subject: [Mailman-Developers] MM3? In-Reply-To: <1129464435.6891.12.camel@jo.birdsense.com> References: <1129464435.6891.12.camel@jo.birdsense.com> Message-ID: <1130003789.11004.75.camel@geddy.wooz.org> On Sun, 2005-10-16 at 08:07, Mike Burton wrote: > I have been watching with anticipation, but it's been a while since I > have heard anything on MM3. Any noteworthy updates? 
Sadly, no. My job has been incredibly time consuming so I just haven't had much time to work on MM3. If things go well, I have some plans for jump starting development that I'd like to share with the group around the December time frame. Cheers, -Barry -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 307 bytes Desc: This is a digitally signed message part Url : http://mail.python.org/pipermail/mailman-developers/attachments/20051022/9ef01aac/attachment.pgp From mburton at jo.birdsense.com Sun Oct 23 01:51:45 2005 From: mburton at jo.birdsense.com (Mike Burton) Date: Sat, 22 Oct 2005 16:51:45 -0700 Subject: [Mailman-Developers] MM3? In-Reply-To: <1130003789.11004.75.camel@geddy.wooz.org> References: <1129464435.6891.12.camel@jo.birdsense.com> <1130003789.11004.75.camel@geddy.wooz.org> Message-ID: <1130025105.7174.17.camel@jo.birdsense.com> Thanks for taking the time to respond, Barry. I certainly understand the issue of jobs taking excessive amounts of time and I empathize with you on that one. I look forward to you releasing your plans and will help as I can with anything to be of assistance. Take care, Mike On Sat, 2005-10-22 at 13:56 -0400, Barry Warsaw wrote: > On Sun, 2005-10-16 at 08:07, Mike Burton wrote: > > > I have been watching with anticipation, but it's been a while since I > > have heard anything on MM3. Any noteworthy updates? > > Sadly, no. My job has been incredibly time consuming so I just haven't > had much time to work on MM3. If things go well, I have some plans for > jump starting development that I'd like to share with the group around > the December time frame.
> > Cheers, > -Barry > From linux at comjet.com Sun Oct 23 08:32:16 2005 From: linux at comjet.com (Larry Howe) Date: Sun, 23 Oct 2005 02:32:16 -0400 Subject: [Mailman-Developers] Manage many subscriptions with one password Message-ID: <200510230232.16236.linux@comjet.com> Hello All, I am looking to add a feature to mailman and would appreciate some advice on how and where to get started. I run a MM server where users can subscribe to up to 8 different lists, and this number will grow. It would be nice for them not to have to deal with that many subs. I see discussion of a centralized user database, which is great. Correct me if I'm wrong, but that is an MM3 feature, and MM3 is a ways off? As an alternate, I was considering something like this: - when the user authenticates to his options page, there is a section there listing all possible mailing lists, and showing to which ones the user is currently subscribed. - user can simply check off the ones he wants and hit submit, and he will be subscribed / unsubscribed to the appropriate lists. - the button for List my Other Subscriptions is starting down this path. The differences would be: - list unsubscribed as well as subscribed lists - allow sub/unsub via this page - eventually, I could see adding a "group" field to the mailing lists so that the user would see only those lists in the current group. If I attempt this, is the project interested? If so, which branch? Is only MM3 being developed right now? I assume the centralized user database would obsolete what I'm proposing here. If this is a sane thing to try, any technical / implementation tips welcome. Larry Howe From adrian at whatifnet.com Mon Oct 24 19:56:22 2005 From: adrian at whatifnet.com (Adrian Wells) Date: Mon, 24 Oct 2005 13:56:22 -0400 Subject: [Mailman-Developers] Mysql MemberAdaptor 1.61 and Mailman 2.1.6 Message-ID: Hello all. 
I have been working with Kev Green's MysqlMemberships.py Adaptor version 1.61 (SourceForge.net Mailman patch ID 839386) and Mailman 2.1.6 with Python 2.3.4. So far the adaptor has worked fairly well (with a minor patch to deal with one of the changes made in Mailman 2.1.4, I believe, and the addition of a test to prevent the creation of unnecessary tables in the MySQL database - so this isn't a completely plain 1.61 version). However, there appears to be a bug/compatibility issue with this adaptor concerning setting bounce information for list members. This process appears to be handled by a function called setBounceInfo in the adaptor. As it works now, setBounceInfo only successfully sets initial values in the MySQL database but fails to update subsequent bounce information. The bounce log seems to support these findings. It will report that the score was set to 2.0 but the MySQL database will still show a score of 1.0 for a bouncing list member... as a result members cannot receive bounce scores higher than 1.0! I am not proficient in Python and don't completely understand how Mailman operates so I'm interested in finding some help to understand how information generated by registerBounce in Bouncer.py is supposed to reach setBounceInfo in MysqlMemberships.py. Even a general understanding of how bounce information is processed in Mailman would be helpful for investigating this. Thank you, -Adrian From a.somerville at qut.edu.au Tue Oct 25 01:52:37 2005 From: a.somerville at qut.edu.au (AE Somerville) Date: Tue, 25 Oct 2005 09:52:37 +1000 Subject: [Mailman-Developers] Issues with archiving directory and OS limitations Message-ID: <200510242352.DVO04821@mail-router01-eth0.qut.edu.au> Hello, I have recently come across a problem that prevents the creation of any new lists for our site. Problem manifests as an inability of the create list process being able to make the archiving directories.
The number appears to be when the directory count approaches 32,000 separate directories.

How did it happen?

We have close to 15,000+ lists but the archive directories houses two directories per list normally:

    /var/mailman/archives/private/
    /var/mailman/archives/private/.mbox

So the number of directories are essentially doubled and then Linux has trouble with having any more.

My temp solution:

I have altered Site.py line 52 to add the list name again into the path for the archives. This halved the number of directories in the /var/mailman/archives/private/ level and pushed the extra directories into their own named sub directory. Now we can create new lists again (in our situation we have the list population updated daily and the lists themselves are added/deleted as required)

    def get_archpath(listname, domain=None, create=False, public=False):
        if public:
            subdir = mm_cfg.PUBLIC_ARCHIVE_FILE_DIR
        else:
            subdir = mm_cfg.PRIVATE_ARCHIVE_FILE_DIR
        path = os.path.join(subdir, listname, listname)
        if create:
            _makedir(path)
        return path

Related problems (from the 'fix'):

1. The HTML links are not working for the archive site, but it would be nice to have them functioning.

2. Possible larger ramifications from the alteration of this function that I cannot see yet.

Advice from the folks who are a lot more familiar with mailman would be great to point us at a more eloquent solution.
------------------------------------ Antony Somerville Network Programmer / Project Manager: QUT AD Upgrade Project Network Applications Queensland University of Technology, Brisbane Australia Phone +61 7 38644434 Fax +61 7 38642921 From brad at stop.mail-abuse.org Tue Oct 25 02:28:40 2005 From: brad at stop.mail-abuse.org (Brad Knowles) Date: Tue, 25 Oct 2005 02:28:40 +0200 Subject: [Mailman-Developers] Issues with archiving directory and OS limitations In-Reply-To: <200510242352.DVO04821@mail-router01-eth0.qut.edu.au> References: <200510242352.DVO04821@mail-router01-eth0.qut.edu.au> Message-ID: At 9:52 AM +1000 2005-10-25, AE Somerville wrote: > Problem manifests as an inability of the create list process being able to > make the archiving directories. The number appears to be when the directory > count approaches 32,000 separate directories. Most *nix OSes have problems with too many files (or subdirectories) within a given directory structure. Frequently, you start seeing problems at much lower numbers, like 1000 or 10,000. > My temp solution: > > I have altered Site.py line 52 to add the list name again into the path for > the archives. This halved the number of directories in the > /var/mailman/archives/private/ level and pushed the extra directories into > their own named sub directory. Now we can create new lists again (in our > situation we have the list population updated daily and the lists themselves > are added/deleted as required) This just pushes the horizon out. This doesn't solve the fundamental problem. IMO, you're better off doing a quick MD5 hash of the listname and then slicing off the first few (or last) characters of the hash, then incorporating that into the path name. If you use hex characters instead of some other base, that's roughly a factor of sixteen reduction in the number of subdirectories/files for each character of hash. 
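
[Editorial sketch] The hashing idea in the paragraph above could be sketched as follows. This is a minimal illustration, not anything from Mailman itself: hashed_archpath, base_dir and prefix_len are hypothetical names, and hashlib is the modern Python spelling (the Python of that era shipped a separate md5 module).

```python
import hashlib
import os


def hashed_archpath(base_dir, listname, prefix_len=2):
    """Return base_dir/<hash prefix>/<listname>.

    The prefix is the first prefix_len hex characters of the MD5 digest
    of the list name, so there are at most 16 ** prefix_len buckets.
    """
    digest = hashlib.md5(listname.encode("utf-8")).hexdigest()
    return os.path.join(base_dir, digest[:prefix_len], listname)
```

With prefix_len=2 the parent directory holds at most 256 bucket directories, and each list's archive directory lands in the bucket chosen by its name. hexdigest() returns lowercase hex, so the bucket names use only the characters 0-9 and a-f.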
In practice, you'll get birthday collisions more frequently than you'd like, so count it as something closer to a four to eight reduction. With this technique, it doesn't take too many hash characters to greatly reduce the problem to a much more manageable size. Just three characters of a reasonably well distributed hash will result in no more than 4096 hash subdirectories at the parent, and probably something close to a factor of 64 to 512 reduction in the number of grandchild subdirectories/files within each hash subdirectory. If you go with base-32 instead, two base-32 characters would be no more than 1024 files in a single directory, and probably close to a factor of six to 32 reduction in the number of grandchild subdirectories/files per hash subdirectory. Base-64 would let you get two characters creating no more than 4096 hash subdirectories, and you can see the numbers above for the likely reduction in the number of grandchild subdirectories/files. If you need, you can take the hashing another level. It all depends on how cramped you are for space in your filenames, because there are also inode and iname caching issues to consider. -- Brad Knowles, "Those who would give up essential Liberty, to purchase a little temporary Safety, deserve neither Liberty nor Safety." -- Benjamin Franklin (1706-1790), reply of the Pennsylvania Assembly to the Governor, November 11, 1755 SAGE member since 1995. See for more info. From msapiro at value.net Tue Oct 25 05:25:06 2005 From: msapiro at value.net (Mark Sapiro) Date: Mon, 24 Oct 2005 20:25:06 -0700 Subject: [Mailman-Developers] Issues with archiving directory and OSlimitations In-Reply-To: <200510242352.DVO04821@mail-router01-eth0.qut.edu.au> Message-ID: AE Somerville wrote: > >Related problems (from the 'fix'): > >1. The HTML links are not working for the archive site, but it would be >nice to have them functioning. 
They don't work because they are constructed using the Archiver.GetBaseArchiveURL() method which doesn't use Site.get_archpath(). For public archives, assuming you haven't changed the default PUBLIC_ARCHIVE_URL = 'http://%(hostname)s/pipermail/%(listname)s' I think you can put PUBLIC_ARCHIVE_URL = 'http://%(hostname)s/pipermail/%(listname)s/%(listname)s' (watchout for wrapped line) in mm_cfg.py to fix. For private archives, you will need to edit the definition of GetBaseArchiveURL() in Mailman/Archiver/Archiver.py or possibly you can make the old URL work with a rewrite rule in your web server. >2. Possible larger ramifications from the alteration of this function >that I cannot see yet. The links in the archive itself are all relative, so that should be OK. I think you're probably OK in general if you fix the stuff in 1), but I haven't really looked hard enough to verify this. Of course, if you patch Archiver.py, you have to maintain the patch across upgrades. >Advice from the folks who are a lot more familiar with mailman would be >great to point us at a more eloquent solution. Brad has addressed your basic solution and suggested ways for further reducing the size of the archives/private and archives/public directories. Of course, you eventually have the same issue with the lists/ directory. -- Mark Sapiro The highway is for gamblers, San Francisco Bay Area, California better use your sense - B. Dylan From jwblist at olympus.net Tue Oct 25 20:08:59 2005 From: jwblist at olympus.net (John W. Baxter) Date: Tue, 25 Oct 2005 11:08:59 -0700 Subject: [Mailman-Developers] Issues with archiving directory and OS limitations In-Reply-To: Message-ID: On 10/24/05 5:28 PM, "Brad Knowles" wrote: > Base-64 would let you get two characters creating no more than > 4096 hash subdirectories, and you can see the numbers above for the > likely reduction in the number of grandchild subdirectories/files. 
Base 64 isn't a good idea for code which might run on case-insensitive file systems (eg Cygwin or Mac OS X). Base 36 would seem safer if this code is going to go into the official Mailman release sometime (which is probably a good idea). --John From adrian at whatifnet.com Tue Oct 25 23:59:12 2005 From: adrian at whatifnet.com (Adrian Wells) Date: Tue, 25 Oct 2005 17:59:12 -0400 Subject: [Mailman-Developers] Mysql MemberAdaptor 1.61 and Mailman 2.1.6 In-Reply-To: References: Message-ID: "Adrian Wells" on Monday, October 24, 2005 at 1:56 PM +0000 wrote: >I am not proficient in Python and don't completely understand how Mailman >operates so I'm interested in finding some help to understand how >information generated by registerBounce in Bouncer.py is supposed to reach >setBounceInfo in MysqlMemberships.py. Even a general understanding of how >bounce information is processed in Mailman would be helpful for >investigating this. After some time and further testing... it seems that the Mysql MemberAdaptor maybe OK after all, but it is not being fully utilized (or any other member adaptor, for that matter)... The function registerBounce only calls setBounceInfo once (line 116: "self.setBounceInfo(member, info)"). This occurs after testing whether "this is the first bounce we've seen from this member". It would seem as though setBounceInfo should be called a few more times if other conditions are met, right? For example, after determining that the bounce information for a member is valid and is not stale? As a result, I've created a patch that seems to correct the unexpected behavior mentioned in my earlier message. This patch may not cover recording when probes occur or how many probes remain (for example in sendNextNotification). 
--- Bouncer.py.10.25.2005 2005-10-25 12:21:57.000000000 -0400
+++ Bouncer.py 2005-10-25 13:21:02.000000000 -0400
@@ -137,6 +137,7 @@
             if lastbounce + self.bounce_info_stale_after < now:
                 # Information is stale, so simply reset it
                 info.reset(weight, day, self.bounce_you_are_disabled_warnings)
+                self.setBounceInfo(member, info)
                 syslog('bounce', '%s: %s has stale bounce info, resetting',
                        self.internal_name(), member)
             else:
@@ -144,6 +145,7 @@
                 # score and take any necessary action.
                 info.score += weight
                 info.date = day
+                self.setBounceInfo(member, info)
                 syslog('bounce', '%s: %s current bounce score: %s',
                        member, self.internal_name(), info.score)
             # Continue to the check phase below

Please let me know if this is not good or will otherwise cause problems down the line.

As a minor side note, I noticed the bounce log receives two differently formatted messages for the first bounce and subsequent bounces. An example:

...
Oct 25 10:50:51 2005 (2687) samplelist: falseaddresstest at somedomain.net bounce score: 1.0
Oct 25 11:06:54 2005 (2687) falseaddresstest at somedomain.net: samplelist current bounce score: 2.0
...

This is not a major issue but it is inconsistent and it is not clear why it should be this way. Is there a reason it should be different?

Finally, the Mysql MemberAdaptor has a __del__() method. However, it doesn't seem like this is utilized. Searching the Mailman developer's mailing list archives yielded comments from Barry stating that such a method is "not a reliable way to free external resources because you really don't know when Python will call it, but in this case it might work okay (and may be the only option without some hacking. ;)" . I'm curious, what kind of hacking would be required to reliably close connections?

For the sake of full disclosure, I did make a minor change to the MysqlMemberships.py but this should not have affected the issue concerning storing subsequent bounce information.
Here is a patch containing for the change made in the adaptor: --- MysqlMemberships.py.10.25.2005 2005-10-25 12:31:02.000000000 -0400 +++ MysqlMemberships.py 2005-10-25 13:14:41.000000000 -0400 @@ -969,8 +969,8 @@ except MySQLdb.Warning, e: syslog("error", "MySQL update warning setting Delivery Status info to '%s' for member '%s' in setBounceInfo()" % (status, member) ) else: - self._prodServerConnection try: + self._prodServerConnection # Hack the dates to work with MySQL. lnsql=(info.lastnotice[0],info.lastnotice[1],info.lastnotice[2],0,0,0,0,0,0) lnsql = time.strftime("%Y-%m-%d", lnsql) -Adrian From msapiro at value.net Wed Oct 26 21:45:47 2005 From: msapiro at value.net (Mark Sapiro) Date: Wed, 26 Oct 2005 12:45:47 -0700 Subject: [Mailman-Developers] Mysql MemberAdaptor 1.61 and Mailman 2.1.6 In-Reply-To: Message-ID: Adrian Wells wrote: >"Adrian Wells" on Monday, October 24, 2005 at 1:56 >PM +0000 wrote: >>I am not proficient in Python and don't completely understand how Mailman >>operates so I'm interested in finding some help to understand how >>information generated by registerBounce in Bouncer.py is supposed to reach >>setBounceInfo in MysqlMemberships.py. Even a general understanding of how >>bounce information is processed in Mailman would be helpful for >>investigating this. > > >After some time and further testing... it seems that the Mysql >MemberAdaptor maybe OK after all, but it is not being fully utilized (or >any other member adaptor, for that matter)... I think you are correct about the MysqlMemberships.py MemberAdaptor in particular, but Mailman with OldStyleMemberships.py clearly does record subsequent bounces. See below for more. >The function registerBounce only calls setBounceInfo once (line 116: >"self.setBounceInfo(member, info)"). This occurs after testing whether >"this is the first bounce we've seen from this member". It would seem as >though setBounceInfo should be called a few more times if other conditions >are met, right? 
For example, after determining that the bounce >information for a member is valid and is not stale? I've been looking at this off and on since your first post. I'm kind of "the new kid on the block" here, so even though I think I understand what's going on, I'm not clear on the best way to 'fix' it. What is happening is this. Bouncer.registerBounce calls getBounceInfo to get the bounce info for the member. If there is no bounce info for the member, getBounceInfo returns None and registerBounce creates an instance of the _BounceInfo class and calls setBounceInfo to save it. If there is existing bounce info for the member, getBounceInfo returns the appropriate _BounceInfo class instance which contains the member's info for this list. registerBounce then proceeds to update some attributes of this _BounceInfo instance. Now the tricky part is that in the OldStyleMemberships case, the member _BounceInfo instance is an item in a list of _BounceInfo instances which is the bounce_info attribute of the Mailman list itself. Thus the _BounceInfo instance returned by getBounceInfo is in a sense a pointer into the bounce_info list attribute so when registerBounce changes attributes of the _BounceInfo instance, it is also changing the lists bounce_info attribute so when Save() is ultimately called for the list, the updated bounce info is actually saved. Now, MysqlMemberships.py doesn't work in the same way. its setBounceInfo and getBounceInfo methods take the attributes out of the _BounceInfo instance and store them separately in the database and vice versa, so saving the list doesn't commit any changes that registerBounces may have made to the _BounceInfo instance. >As a result, I've created a patch that seems to correct the unexpected >behavior mentioned in my earlier message. This patch may not cover >recording when probes occur or how many probes remain (for example in >sendNextNotification). 
> >--- Bouncer.py.10.25.2005 2005-10-25 12:21:57.000000000 -0400 >+++ Bouncer.py 2005-10-25 13:21:02.000000000 -0400 >@@ -137,6 +137,7 @@ > if lastbounce + self.bounce_info_stale_after < now: > # Information is stale, so simply reset it > info.reset(weight, day, >self.bounce_you_are_disabled_warnings) >+ self.setBounceInfo(member, info) > syslog('bounce', '%s: %s has stale bounce info, >resetting', > self.internal_name(), member) > else: >@@ -144,6 +145,7 @@ > # score and take any necessary action. > info.score += weight > info.date = day >+ self.setBounceInfo(member, info) > syslog('bounce', '%s: %s current bounce score: %s', > member, self.internal_name(), info.score) > # Continue to the check phase below > >Please let me know if this is not good or will otherwise cause problems >down the line. It looks good to me, but as you recognize, it's incomplete. As I said, I'm the new kid on the block. It seems to me that this fix is the right way to go, but others may differ. I've worked up a more complete patch which is pasted to the end of this mail. It addresses the other places where the bounce info is changed. I've also searched for places outside Bouncer.py where bounce info is used, and I think they are all OK as is. >As a minor side note, I noticed the bounce log receives two different >formatted messages for the first bounce and subsequent bounces. An >example: >... >Oct 25 10:50:51 2005 (2687) samplelist: falseaddresstest at somedomain.net >bounce score: 1.0 >Oct 25 11:06:54 2005 (2687) falseaddresstest at somedomain.net: samplelist >current bounce score: 2.0 >... >This is not a major issue but it is inconsistent and it not clear why it >should be this way. Is there reason is should be different? I don't think so. All the other log messages from Bouncer are "list: member". I don't see any reason why this one shouldn't also be that way. Here's my patch - watch out for wrapped lines. 
--- mailman-2.1.6/Mailman/Bouncer.py 2004-12-03 21:01:11.000000000 -0800
+++ mailman-mas/Mailman/Bouncer.py 2005-10-26 12:41:37.984375000 -0700
@@ -146,6 +146,10 @@
                 info.date = day
                 syslog('bounce', '%s: %s current bounce score: %s',
                        member, self.internal_name(), info.score)
+            # We've changed info above. In case the MemberAdaptor
+            # stores bounce info externally to the list, we need
+            # to tell it to update
+            self.setBounceInfo(member, info)
             # Continue to the check phase below
         #
         # Now that we've adjusted the bounce score for this bounce, let's
@@ -166,6 +170,9 @@
         # first bounce, it'll expire by the time we get the disabling bounce.
         cookie = self.pend_new(Pending.RE_ENABLE, self.internal_name(), member)
         info.cookie = cookie
+        # In case the MemberAdaptor stores bounce info externally to
+        # the list, we need to tell it to save the cookie
+        self.setBounceInfo(member, info)
         # Disable them
         if mm_cfg.VERP_PROBES:
             syslog('bounce', '%s: %s disabling due to probe bounce received',
@@ -271,6 +278,9 @@
         msg.send(self)
         info.noticesleft -= 1
         info.lastnotice = time.localtime()[:3]
+        # In case the MemberAdaptor stores bounce info externally to
+        # the list, we need to tell it to update
+        self.setBounceInfo(member, info)
 
     def BounceMessage(self, msg, msgdata, e=None):
         # Bounce a message back to the sender, with an error message if

--
Mark Sapiro                           The highway is for gamblers,
San Francisco Bay Area, California    better use your sense - B. Dylan

From fil at rezo.net Wed Oct 26 21:54:01 2005
From: fil at rezo.net (Fil)
Date: Wed, 26 Oct 2005 21:54:01 +0200
Subject: [Mailman-Developers] Mysql MemberAdaptor 1.61 and Mailman 2.1.6
In-Reply-To:
References:
Message-ID: <20051026195401.GJ25708@rezo.net>

As an aside question I would ask: do you notice speed improvement by
switching to MySQL-based membership? I have a big list of ~180k subscribers
and unfortunately it is now *very* difficult to use the web interface to
unsubscribe people.
Right now I have to resort to command-line instructions and it's not very practical. I wonder if I should not go the MySQL way, but I'm a bit worried about taking this risk to my databases (I'd hate to reconstruct 70+ lists from a backup). Especially if the switch does not bring a solution. I'd be happy to get advice and maybe even some help if things turn bad, from people who know this piece of patch. -- Fil From adrian at whatifnet.com Wed Oct 26 23:55:13 2005 From: adrian at whatifnet.com (Adrian Wells) Date: Wed, 26 Oct 2005 17:55:13 -0400 Subject: [Mailman-Developers] Mysql MemberAdaptor 1.61 and Mailman 2.1.6 In-Reply-To: References: Message-ID: Mark Sapiro on Wednesday, October 26, 2005 at 3:45 PM +0000 wrote: >Adrian Wells wrote: > >>"Adrian Wells" on Monday, October 24, 2005 at 1:56 >>PM +0000 wrote: >>>I am not proficient in Python and don't completely understand how >Mailman >>>operates so I'm interested in finding some help to understand how >>>information generated by registerBounce in Bouncer.py is supposed to >reach >>>setBounceInfo in MysqlMemberships.py. Even a general understanding of >how >>>bounce information is processed in Mailman would be helpful for >>>investigating this. >> >> >>After some time and further testing... it seems that the Mysql >>MemberAdaptor maybe OK after all, but it is not being fully utilized (or >>any other member adaptor, for that matter)... > > >I think you are correct about the MysqlMemberships.py MemberAdaptor in >particular, but Mailman with OldStyleMemberships.py clearly does >record subsequent bounces. See below for more. Thank you for the reply. I had not looked much at OldStyleMemberships.py as it's replaced by the MysqlMemberships.py MemberAdaptor in MailList.py, and I'm continuing to slowly learn how Mailman operates. 
It turns out there is a bug with MysqlMemberships.py MemberAdaptor which deals with retrieving bounce info cookie which is externally stored (something I learned after reading your helpful comments). I've included the patch for MysqlMemberships.py MemberAdaptor at the end of this message. > >>The function registerBounce only calls setBounceInfo once (line 116: >>"self.setBounceInfo(member, info)"). This occurs after testing whether >>"this is the first bounce we've seen from this member". It would seem as >>though setBounceInfo should be called a few more times if other >conditions >>are met, right? For example, after determining that the bounce >>information for a member is valid and is not stale? > > >I've been looking at this off and on since your first post. I'm kind of >"the new kid on the block" here, so even though I think I understand >what's going on, I'm not clear on the best way to 'fix' it. "The new kid on the block"... this sounds like a bit of an understatement but I'll have to try to take your word for it. I'm not sure the best way to 'fix' this either hence the initial post to this list. > >What is happening is this. [ snipped helpful and detailed explanation ] Thank you for the helpful and detailed explanation. > > >Now, MysqlMemberships.py doesn't work in the same way. its >setBounceInfo and getBounceInfo methods take the attributes out of the >_BounceInfo instance and store them separately in the database and >vice versa, so saving the list doesn't commit any changes that >registerBounces may have made to the _BounceInfo instance. OK. I imagine that not much can easily done to change this. > [ snipped my first patch attempt and comments about it] > > >It looks good to me, but as you recognize, it's incomplete. As I said, >I'm the new kid on the block. It seems to me that this fix is the >right way to go, but others may differ. I've worked up a more complete >patch which is pasted to the end of this mail. 
It addresses the other
>places where the bounce info is changed. I've also searched for places
>outside Bouncer.py where bounce info is used, and I think they are all
>OK as is.

Thank you for looking over the patch and for providing a more complete
patch. Today, I also found the additional sections in which bounce info
is changed (as covered by your patch). However, I think there's a couple
additional sections that the supplied patch misses - those are when the
bounce information is reset (info.reset()). So I've included another
patch at the end of this message which seems to be even more complete.

>
>
>>As a minor side note, I noticed the bounce log receives two different
>>formatted messages for the first bounce and subsequent bounces. An
>>example:
>>...
>>Oct 25 10:50:51 2005 (2687) samplelist: falseaddresstest at somedomain.net
>>bounce score: 1.0
>>Oct 25 11:06:54 2005 (2687) falseaddresstest at somedomain.net: samplelist
>>current bounce score: 2.0
>>...
>>This is not a major issue but it is inconsistent and it not clear why it
>>should be this way. Is there reason is should be different?
>
>
>I don't think so. All the other log messages from Bouncer are "list:
>member". I don't see any reason why this one shouldn't also be that
>way.

OK. Should this be entered as a bug on SF?

>
>Here's my patch - watch out for wrapped lines.

[ snipped Mark's more complete patch for the sake of brevity (or at least
an attempt at it) ]

Here's the possibly even more complete patch (as you noted earlier, watch
for wrapped lines) for Bouncer.py:

--- Bouncer.py.10.25.2005 2005-10-25 12:21:57.000000000 -0400
+++ Bouncer.py 2005-10-26 17:28:46.000000000 -0400
@@ -137,6 +137,10 @@
             if lastbounce + self.bounce_info_stale_after < now:
                 # Information is stale, so simply reset it
                 info.reset(weight, day, self.bounce_you_are_disabled_warnings)
+                # We've changed info above. In case the MemberAdaptor
+                # stores bounce info externally to the list, we need
+                # to tell it to update
+                self.setBounceInfo(member, info)
                 syslog('bounce', '%s: %s has stale bounce info, resetting',
                        self.internal_name(), member)
             else:
@@ -144,6 +148,10 @@
                 # score and take any necessary action.
                 info.score += weight
                 info.date = day
+                # We've changed info above. In case the MemberAdaptor
+                # stores bounce info externally to the list, we need
+                # to tell it to update
+                self.setBounceInfo(member, info)
                 syslog('bounce', '%s: %s current bounce score: %s',
                        member, self.internal_name(), info.score)
             # Continue to the check phase below
@@ -158,6 +166,10 @@
                        self.bounce_score_threshold)
                 self.sendProbe(member, msg)
                 info.reset(0, info.date, info.noticesleft)
+                # We've changed info above. In case the MemberAdaptor
+                # stores bounce info externally to the list, we need
+                # to tell it to update
+                self.setBounceInfo(member, info)
             else:
                 self.disableBouncingMember(member, info, msg)
 
@@ -166,6 +178,9 @@
         # first bounce, it'll expire by the time we get the disabling bounce.
         cookie = self.pend_new(Pending.RE_ENABLE, self.internal_name(), member)
         info.cookie = cookie
+        # In case the MemberAdaptor stores bounce info externally to
+        # the list, we need to tell it to save the cookie
+        self.setBounceInfo(member, info)
         # Disable them
         if mm_cfg.VERP_PROBES:
             syslog('bounce', '%s: %s disabling due to probe bounce received',
@@ -271,6 +286,9 @@
         msg.send(self)
         info.noticesleft -= 1
         info.lastnotice = time.localtime()[:3]
+        # In case the MemberAdaptor stores bounce info externally to
+        # the list, we need to tell it to update
+        self.setBounceInfo(member, info)
 
     def BounceMessage(self, msg, msgdata, e=None):
         # Bounce a message back to the sender, with an error message if

This is the patch for the MysqlMemberships.py MemberAdaptor (note this
patch was generated against an already patched/modified version of the
Mysql MemberAdaptor 1.61):

--- MysqlMemberships.py.10.26.2005 2005-10-26 15:07:54.000000000 -0400
+++ MysqlMemberships.py 2005-10-26 15:11:48.000000000 -0400
@@ -499,7 +499,8 @@
                 DAYOFMONTH(bi_lastnotice),
                 YEAR(bi_date),
                 MONTH(bi_date),
-                DAYOFMONTH(bi_date)
+                DAYOFMONTH(bi_date),
+                bi_cookie
                 FROM mailman_mysql
                 WHERE listname='%s'
                 AND address = '%s'"""
@@ -513,7 +514,8 @@
                 DAYOFMONTH(bi_lastnotice),
                 YEAR(bi_date),
                 MONTH(bi_date),
-                DAYOFMONTH(bi_date)
+                DAYOFMONTH(bi_date),
+                bi_cookie
                 FROM %s
                 WHERE address = '%s'""" %( self.__mlist.internal_name(),
                 MySQLdb.escape_string(member) ) )
@@ -528,6 +530,7 @@
         # Otherwise, populate a bounce_info structure.
         bounce_info = _BounceInfo(member, row[0], (row[5],row[6],row[7]), row[1])
         bounce_info.lastnotice = (row[2],row[3],row[4])
+        bounce_info.cookie = row[8]
         return bounce_info
 
     #

-Adrian

From msapiro at value.net Thu Oct 27 03:06:09 2005
From: msapiro at value.net (Mark Sapiro)
Date: Wed, 26 Oct 2005 18:06:09 -0700
Subject: [Mailman-Developers] Mysql MemberAdaptor 1.61 and Mailman 2.1.6
In-Reply-To:
Message-ID:

Adrian Wells wrote:

>Mark Sapiro on Wednesday, October 26, 2005 at 3:45 PM
>+0000 wrote:
>>
>>Now, MysqlMemberships.py doesn't work in the same way. its
>>setBounceInfo and getBounceInfo methods take the attributes out of the
>>_BounceInfo instance and store them separately in the database and
>>vice versa, so saving the list doesn't commit any changes that
>>registerBounces may have made to the _BounceInfo instance.
>
>OK. I imagine that not much can easily done to change this.

I think that's right. The MemberAdaptor is not supposed to be in the
business of determining when a _BounceInfo instance has changed behind
its back and divining when to commit changes to it. The documentation
in MemberAdaptor.py says that the getBounceInfo() method returns the
info that was set with setBounceInfo(), so except for what you
discovered about the cookie, MysqlMemberships.py appears to be doing
the right thing at this level.

Actually, it is not really doing the right thing because it is not
supposed to be aware of what's in the _BounceInfo class. The info that
is passed to it is a string representation of the _BounceInfo
instance, and it should really just be saving and retrieving that.
IMO, there should be just one column in the MySQL table for this
string representation. The only possible snag I see is that the string
contains new-lines, and I don't know MySQL so I don't know if
new-lines are allowed in a string field/column.
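
The single-column idea above can be sketched roughly as follows. This is only an illustration, assuming Python 3 and using the standard-library sqlite3 module as a stand-in for MySQL; the table name bounce_info_demo, its columns, the helper roundtrip_opaque_text, and the sample string are all hypothetical and do not come from MysqlMemberships.py. With parameter binding, the driver handles quoting, so embedded newlines come back unchanged. Whether a given MySQL column type and driver behave the same way would still need to be checked against their documentation.

```python
import sqlite3


def roundtrip_opaque_text(text):
    """Store `text` in a single TEXT column and read it back unchanged."""
    conn = sqlite3.connect(":memory:")
    try:
        # Hypothetical table: one row per member, one opaque text column.
        conn.execute(
            "CREATE TABLE bounce_info_demo ("
            "address TEXT PRIMARY KEY, bounce_info TEXT)"
        )
        # The '?' placeholders let the driver do the quoting, so the
        # newlines inside `text` need no manual escaping here.
        conn.execute(
            "INSERT INTO bounce_info_demo (address, bounce_info) VALUES (?, ?)",
            ("member@example.com", text),
        )
        row = conn.execute(
            "SELECT bounce_info FROM bounce_info_demo WHERE address = ?",
            ("member@example.com",),
        ).fetchone()
        return row[0]
    finally:
        conn.close()


# A made-up multi-line string standing in for the bounce info text.
sample = "member: member@example.com\nscore: 2.0\nnoticesleft: 3\n"
print(roundtrip_opaque_text(sample) == sample)
```

The point of the sketch is that the storage layer never looks inside the text, so a change to the fields it contains would not require a schema change.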
If MysqlMemberships.py were just storing and retrieving the representation that it is passed, it wouldn't have to worry about things like the fact that the 'cookie' argument disappeared from the _BounceInfo instantiation call in Mailman 2.1.4 >>It looks good to me, but as you recognize, it's incomplete. As I said, >>I'm the new kid on the block. It seems to me that this fix is the >>right way to go, but others may differ. I've worked up a more complete >>patch which is pasted to the end of this mail. It addresses the other >>places where the bounce info is changed. I've also searched for places >>outside Bouncer.py where bounce info is used, and I think they are all >>OK as is. > > >Thank you for looking over the patch and for providing a more complete >patch. Today, I also found the additional sections in which bounce info >is changed (as covered by your patch). However, I think there's a couple >additional sections that the supplied patch misses - those are when the >bounce information is reset (info.reset()). So I've included another >patch at the end of this message which seems to be even more complete. Yes. I definitely overlooked the info.reset() two lines before the end of registerBounce. Good Catch! However in the earlier part of registerBounce, I deliberately combined your two calls to setBounceInfo() in the "if info is stale" clause and its "else" clause into a single call following the if - else but still within the containing else. I did this even though I think it is logically equivalent, because I think that all else equal, fewer lines is better. >>>As a minor side note, I noticed the bounce log receives two different >>>formatted messages for the first bounce and subsequent bounces. An >>>example: >>>... >>>Oct 25 10:50:51 2005 (2687) samplelist: falseaddresstest at somedomain.net >>>bounce score: 1.0 >>>Oct 25 11:06:54 2005 (2687) falseaddresstest at somedomain.net: samplelist >>>current bounce score: 2.0 >>>... 
>>>This is not a major issue but it is inconsistent and it not clear why it >>>should be this way. Is there reason is should be different? >> >> >>I don't think so. All the other log messages from Bouncer are "list: >>member". I don't see any reason why this one shouldn't also be that >>way. > > >OK. Should this be entered as a bug on SF? Yes, I think so, but I'd be inclined to wait a bit and see if there are more comments from the list. >This is the patch for the MysqlMemberships.py MemberAdaptor (note this >patch was generated against an already patched/modified version of the >Mysql MemberAdaptor 1.61): As I indicate above, I think the better way to fix MysqlMemberships.py is to remove its knowledge of the _BounceInfo class and just save and retrieve the string representation that it is handed. -- Mark Sapiro The highway is for gamblers, San Francisco Bay Area, California better use your sense - B. Dylan From jwblist at olympus.net Thu Oct 27 07:22:42 2005 From: jwblist at olympus.net (John W. Baxter) Date: Wed, 26 Oct 2005 22:22:42 -0700 Subject: [Mailman-Developers] Mysql MemberAdaptor 1.61 and Mailman 2.1.6 In-Reply-To: Message-ID: On 10/26/05 6:06 PM, "Mark Sapiro" wrote: > Actually, it is not really doing the right thing because it is not > supposed to be aware of what's in the _BounceInfo class. The info that > is passed to it is a string representation of the _BounceInfo > instance, and it should really just be saving and retrieving that. > IMO, there should be just one column in the MySQL table for this > string representation. The only possible snag I see is that the string > contains new-lines, and I don't know MySQL so I don't know if > new-lines are allowed in a string field/column. Based on these tests dashed off using one of Exim's debugging capabilities $ exim -be > ${quote_mysql: A\x0atest} A\ntest > ${quote_mysql: A\x0dtest} A\rtest the newlines are OK but have to be quoted (as do CR characters, and others). 
This, of course, assumes that Exim's quote_mysql operator is doing the right thing. The best thing would be to check the MySQL documentation (which I'm too lazy to do this evening). --John From adrian at whatifnet.com Thu Oct 27 17:49:40 2005 From: adrian at whatifnet.com (Adrian Wells) Date: Thu, 27 Oct 2005 11:49:40 -0400 Subject: [Mailman-Developers] Mysql MemberAdaptor 1.61 and Mailman 2.1.6 In-Reply-To: References: Message-ID: "John W. Baxter" on Thursday, October 27, 2005 at 1:22 AM +0000 wrote: >On 10/26/05 6:06 PM, "Mark Sapiro" wrote: > >> Actually, it is not really doing the right thing because it is not >> supposed to be aware of what's in the _BounceInfo class. The info that >> is passed to it is a string representation of the _BounceInfo >> instance, and it should really just be saving and retrieving that. >> IMO, there should be just one column in the MySQL table for this >> string representation. The only possible snag I see is that the string >> contains new-lines, and I don't know MySQL so I don't know if >> new-lines are allowed in a string field/column. > >Based on these tests dashed off using one of Exim's debugging capabilities >$ exim -be >> ${quote_mysql: A\x0atest} > A\ntest >> ${quote_mysql: A\x0dtest} > A\rtest >the newlines are OK but have to be quoted (as do CR characters, and >others). > >This, of course, assumes that Exim's quote_mysql operator is doing the >right >thing. > >The best thing would be to check the MySQL documentation (which I'm too >lazy >to do this evening). Thank you John for the examples and suggestion to reference the MySQL documentation. It appears as though the MySQL VARCHAR type can preserve newlines . However trailing space is removed in this data type. If trailing space must be preserved, one could use the MySQL TEXT type . 
-Adrian From adrian at whatifnet.com Thu Oct 27 18:01:19 2005 From: adrian at whatifnet.com (Adrian Wells) Date: Thu, 27 Oct 2005 12:01:19 -0400 Subject: [Mailman-Developers] Mysql MemberAdaptor 1.61 and Mailman 2.1.6 In-Reply-To: References: Message-ID: Mark Sapiro on Wednesday, October 26, 2005 at 9:06 PM +0000 wrote: >Adrian Wells wrote: > >>OK. I imagine that not much can easily done to change this. > > >I think that's right. The MemberAdaptor is not supposed to be in the >business of determining when a _BounceInfo instance has changed behind >its back and divining when to commit changes to it. The documentation >in MemberAdaptor.py says that the getBounceInfo() method returns the >info that was set with setBounceInfo(), so except for what you >discovered about the cookie, MysqlMemberships.py appears to be doing >the right thing at this level. > >Actually, it is not really doing the right thing because it is not >supposed to be aware of what's in the _BounceInfo class. The info that >is passed to it is a string representation of the _BounceInfo >instance, and it should really just be saving and retrieving that. >IMO, there should be just one column in the MySQL table for this >string representation. The only possible snag I see is that the string >contains new-lines, and I don't know MySQL so I don't know if >new-lines are allowed in a string field/column. > >If MysqlMemberships.py were just storing and retrieving the >representation that it is passed, it wouldn't have to worry about >things like the fact that the 'cookie' argument disappeared from the >_BounceInfo instantiation call in Mailman 2.1.4 So, if I understand this, ideally, Bouncer.py should not be changed to include additional calls to setBounceInfo(), right? I'm still trying to understand whether this is a hack to allow MysqlMemberships.py to work as is (more or less) OR if setBounceInfo() calls were original "missing" in Bouncer.py. I gather the answer is the former. 
I have done some searching in the mailman-developers' archives to try to understand why it was decided to separate BounceInfo. Here are some findings: : "I'm putting the "info" parameter from setBounceInfo directly into the database, which I think is an array itself, not a single value, and the above doesn't look like Python's just traversing an array, and dumping it into the database(the LHS names don't tie up with what I think are the keys for the subelements of "info"), so it looks like I'll have to take a "best guess" at how to implement this." > : "...the only changes of any import that I've made are that the Member data structures are stored in a way that fits MySQL and converted as they are loaded to the way that fits Mailman, which you'd expect..." I surmise that the rationale for storing the BounceInfo in separate columns is to provide easier access via SQL queries to the information that would otherwise be stored in this object. I can imagine where this would be desirable (e.g. quickly querying which members recently received an increased bounce score). > > >>Thank you for looking over the patch and for providing a more complete >>patch. Today, I also found the additional sections in which bounce info >>is changed (as covered by your patch). However, I think there's a couple >>additional sections that the supplied patch misses - those are when the >>bounce information is reset (info.reset()). So I've included another >>patch at the end of this message which seems to be even more complete. > > >Yes. I definitely overlooked the info.reset() two lines before the end >of registerBounce. Good Catch! > >However in the earlier part of registerBounce, I deliberately combined >your two calls to setBounceInfo() in the "if info is stale" clause and >its "else" clause into a single call following the if - else but still >within the containing else. 
> >I did this even though I think it is logically equivalent, because I >think that all else equal, fewer lines is better. Agreed. Fewer lines is preferred. I apologize for not recognizing what you had done there. > > >>>>As a minor side note, I noticed the bounce log receives two different >>>>formatted messages for the first bounce and subsequent bounces. An >>>>example: >>>>... >>>>Oct 25 10:50:51 2005 (2687) samplelist: falseaddresstest at somedomain.net >>>>bounce score: 1.0 >>>>Oct 25 11:06:54 2005 (2687) falseaddresstest at somedomain.net: samplelist >>>>current bounce score: 2.0 >>>>... >>>>This is not a major issue but it is inconsistent and it not clear why >it >>>>should be this way. Is there reason is should be different? >>> >>> >>>I don't think so. All the other log messages from Bouncer are "list: >>>member". I don't see any reason why this one shouldn't also be that >>>way. >> >> >>OK. Should this be entered as a bug on SF? > > >Yes, I think so, but I'd be inclined to wait a bit and see if there are >more comments from the list. I have no problem waiting. > > >>This is the patch for the MysqlMemberships.py MemberAdaptor (note this >>patch was generated against an already patched/modified version of the >>Mysql MemberAdaptor 1.61): > > >As I indicate above, I think the better way to fix MysqlMemberships.py >is to remove its knowledge of the _BounceInfo class and just save and >retrieve the string representation that it is handed. : "My suggestion would be to pickle the BounceInfo object on the way into the database, and unpickle it on the way out." Or pickle and unpickle this information, right? Making this change, of course, will require more effort to extract information stored in MySQL for other purposes (e.g. a custom web interface) but if it's the best way to handle this information then I would consider making these changes. I will try like to discuss this with the original author of MysqlMemberships.py. 
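
The pickle suggestion quoted above could look something like the sketch below. It assumes Python 3; DemoBounceInfo is a hypothetical stand-in whose attribute names are taken from this thread, not Mailman's actual _BounceInfo class, and dump_info/load_info are made-up helper names. As a deliberate simplification, the sketch pickles the attribute dict rather than the instance itself, so the stored bytes do not depend on where the class is defined.

```python
import pickle


class DemoBounceInfo:
    """Hypothetical stand-in for Mailman's _BounceInfo (not the real class)."""

    def __init__(self, member, score, date, noticesleft):
        self.member = member
        self.score = score
        self.date = date
        self.noticesleft = noticesleft
        self.lastnotice = (0, 0, 0)
        self.cookie = None


def dump_info(info):
    """Serialize the instance's attributes to bytes for one binary column."""
    # Only the attribute dict is pickled, so the adaptor needs no
    # knowledge of which fields exist.
    return pickle.dumps(vars(info), protocol=2)


def load_info(blob, cls=DemoBounceInfo):
    """Rebuild an instance from bytes produced by dump_info()."""
    info = cls.__new__(cls)  # skip __init__; attributes come from the blob
    info.__dict__.update(pickle.loads(blob))
    return info


original = DemoBounceInfo("member@example.com", 2.0, (2005, 10, 27), 3)
original.cookie = "abc123"
restored = load_info(dump_info(original))
print(vars(restored) == vars(original))
```

Two caveats apply to this approach: the pickled value is binary, so a BLOB-style column fits better than a text one, and unpickling should only be done on data the application wrote itself. It also gives up the ability to query individual fields such as the score with plain SQL.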
-Adrian From msapiro at value.net Thu Oct 27 18:25:22 2005 From: msapiro at value.net (Mark Sapiro) Date: Thu, 27 Oct 2005 09:25:22 -0700 Subject: [Mailman-Developers] Mysql MemberAdaptor 1.61 and Mailman 2.1.6 In-Reply-To: Message-ID: Adrian Wells wrote: > >It appears as though the MySQL VARCHAR type can preserve newlines >. However trailing >space is removed in this data type. If trailing space must be preserved, >one could use the MySQL TEXT type >. There is no trailing white space in the string representation of a _BounceInfo instance. See the __repr__ method in the _BounceInfo class in Bouncer.py for details of what it is. There is an issue however in that the representation can exceed 255 characters if there is a cookie and/or a longish member address. The first reference above seems unclear on this. It says Values in VARCHAR columns are variable-length strings. The length can be specified as 1 to 255 before MySQL 4.0.2 and 0 to 255 as of MySQL 4.0.2. but the next paragraph says In contrast to CHAR, VARCHAR values are stored using only as many characters as are needed, plus one byte to record the length (two bytes for columns that are declared with a length longer than 255). I don't know if means you can actually have VARCHAR data longer than 255 or not. -- Mark Sapiro The highway is for gamblers, San Francisco Bay Area, California better use your sense - B. Dylan From fil at rezo.net Thu Oct 27 18:41:12 2005 From: fil at rezo.net (Fil) Date: Thu, 27 Oct 2005 18:41:12 +0200 Subject: [Mailman-Developers] Mysql MemberAdaptor 1.61 and Mailman 2.1.6 In-Reply-To: References: Message-ID: <20051027164112.GD25708@rezo.net> A lot of good work on this MySQLMemberAdaptor! If you think it can make things easier and faster I'm willing to open a subversion repository to develop Mailman patches and plugins. Please tell. (I already have one running multiple projects on my server, so one more cannot hurt. 
It uses trac as a web frontend) -- Fil From jdennis at redhat.com Thu Oct 27 20:03:04 2005 From: jdennis at redhat.com (John Dennis) Date: Thu, 27 Oct 2005 14:03:04 -0400 Subject: [Mailman-Developers] Mysql MemberAdaptor 1.61 and Mailman 2.1.6 In-Reply-To: References: Message-ID: <1130436184.7801.103.camel@finch.boston.redhat.com> On Thu, 2005-10-27 at 09:25 -0700, Mark Sapiro wrote: > There is an issue however in that the representation can exceed 255 > characters if there is a cookie and/or a longish member address. > > The first reference above seems unclear on this. It says > > Values in VARCHAR columns are variable-length strings. The length can > be specified as 1 to 255 before MySQL 4.0.2 and 0 to 255 as of MySQL > 4.0.2. > > but the next paragraph says > > In contrast to CHAR, VARCHAR values are stored using only as many > characters as are needed, plus one byte to record the length (two > bytes for columns that are declared with a length longer than 255). > > I don't know if means you can actually have VARCHAR data longer than > 255 or not. VARCHAR is upper bound limited, you cannot store more than 255 characters in a VARCHAR. However the TEXT data type are virtually unlimited (as is the BLOB, or Binary Large Object, don't you love the name?) But since the bounce data is converted to text the TEXT data type makes more sense than BLOB. -- John Dennis From msapiro at value.net Thu Oct 27 20:18:58 2005 From: msapiro at value.net (Mark Sapiro) Date: Thu, 27 Oct 2005 11:18:58 -0700 Subject: [Mailman-Developers] Mysql MemberAdaptor 1.61 and Mailman 2.1.6 In-Reply-To: Message-ID: Adrian Wells wrote: > >Mark Sapiro on Wednesday, October 26, 2005 at 9:06 PM >+0000 wrote: >> >>Actually, it is not really doing the right thing because it is not >>supposed to be aware of what's in the _BounceInfo class. The info that >>is passed to it is a string representation of the _BounceInfo >>instance, and it should really just be saving and retrieving that. 
>>IMO, there should be just one column in the MySQL table for this >>string representation. The only possible snag I see is that the string >>contains new-lines, and I don't know MySQL so I don't know if >>new-lines are allowed in a string field/column. >> >>If MysqlMemberships.py were just storing and retrieving the >>representation that it is passed, it wouldn't have to worry about >>things like the fact that the 'cookie' argument disappeared from the >>_BounceInfo instantiation call in Mailman 2.1.4 > > >So, if I understand this, ideally, Bouncer.py should not be changed to >include additional calls to setBounceInfo(), right? I'm still trying to >understand whether this is a hack to allow MysqlMemberships.py to work as >is (more or less) OR if setBounceInfo() calls were original "missing" in >Bouncer.py. I gather the answer is the former. No. I think the additional calls to setBounceInfo() are required for any MemberAdaptor that doesn't store the bounce info in a list attribute. I.e., they ARE required for MysqlMemberships.py, but they should be minimized because for MysqlMemberships.py and perhaps other MemberAdaptors they involve database access which may be relatively expensive. I think the patch we arrived at does achieve the minimum. The issue with the existing MysqlMemberships.py is that it should not be burdened with knowing any details about the bounce info. As the documentation (in MemberAdaptor.py) says, bounce info is opaque to the MemberAdaptor. It is set by setBounceInfo() and returned by getBounceInfo() without modification. Obviously, the MemberAdaptor has to know enough about the bounce info it gets (e.g., maximum length) so it can store and return it without modification, and getBounceInfo() has to know to return None when there is no previous bounce info for the member, but that should be it. 
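
The "opaque" contract described above can be sketched in a few lines. This is only an illustration, not the MysqlMemberships.py code: the class name OpaqueBounceStore and the in-memory dict standing in for a database column are hypothetical, and pickle is just one possible way to serialize the object. A real adaptor would also raise NotAMemberError for unknown addresses, which is left out here.

```python
import pickle


class OpaqueBounceStore:
    """Hypothetical stand-in for a database-backed MemberAdaptor."""

    def __init__(self):
        # Stand-in for a table with a single bounce-info column per member.
        self._rows = {}

    def setBounceInfo(self, member, info):
        # Store whatever object we are handed without inspecting it.
        key = member.lower()
        if info is None:
            self._rows.pop(key, None)
        else:
            self._rows[key] = pickle.dumps(info)

    def getBounceInfo(self, member):
        # Return the stored object unchanged, or None if nothing was stored.
        blob = self._rows.get(member.lower())
        if blob is None:
            return None
        return pickle.loads(blob)
```

Because the adaptor never looks inside the object, a change to the attributes of the bounce info between Mailman versions would not require a change to the storage code.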
The lengths to which MysqlMemberships.py goes to extract attributes from the bounce info, save them separately, and construct a _BounceInfo instance to return the data only get it in trouble when aspects of the _BounceInfo class change from version to version. >I have done some searching in the mailman-developers' archives to try to >understand why it was decided to separate BounceInfo. Here are some >findings: > >: >"I'm putting the "info" parameter from setBounceInfo directly into the >database, which I think is an array itself, not a single value, and the >above doesn't look like Python's just traversing an array, and dumping it >into the database(the LHS names don't tie up with what I think are the >keys for the subelements of "info"), so it looks like I'll have to take a >"best guess" at how to implement this." >> > >: >"...the only changes of any import that I've made are that the Member data >structures are stored in a way that fits MySQL and converted as they are >loaded to the way that fits Mailman, which you'd expect..." > >I surmise that the rationale for storing the BounceInfo in separate >columns is to provide easier access via SQL queries to the information >that would otherwise be stored in this object. I can imagine where this >would be desirable (e.g. quickly querying which members recently received >an increased bounce score). I can see that would be desirable, but the price is difficulty of maintenance because then the get and set BounceInfo methods have to know things about the _BounceInfo class that may change, however see below for a compromise. >>Yes. I definitely overlooked the info.reset() two lines before the end >>of registerBounce. Good Catch! >> >>However in the earlier part of registerBounce, I deliberately combined >>your two calls to setBounceInfo() in the "if info is stale" clause and >>its "else" clause into a single call following the if - else but still >>within the containing else. 
>> >>I did this even though I think it is logically equivalent, because I >>think that all else equal, fewer lines is better. > > >Agreed. Fewer lines is preferred. I apologize for not recognizing what >you had done there. No problem. >>As I indicate above, I think the better way to fix MysqlMemberships.py >>is to remove its knowledge of the _BounceInfo class and just save and >>retrieve the string representation that it is handed. See below. > > >: >"My suggestion would be to pickle the BounceInfo object on the way into >the database, and unpickle it on the way out." > >Or pickle and unpickle this information, right? Making this change, of >course, will require more effort to extract information stored in MySQL >for other purposes (e.g. a custom web interface) but if it's the best way >to handle this information then I would consider making these changes. I >will try like to discuss this with the original author of >MysqlMemberships.py. Looking at this more deeply, I think it is not as simple as I first thought to simply save the string representation and return it. I think pickle.dump() to a StringIO file and then saving its contents in the set... method and then retrieving the string and returning pickle.loads() in the get... method is the way to go. It has the advantage that the string written to the database is a bit shorter too. If you also want to be able to see and use some of the bounce info, you could save certain attributes of the _BounceInfo instance in additional columns of the database table. I think discussing with the author is a good idea. -- Mark Sapiro The highway is for gamblers, San Francisco Bay Area, California better use your sense - B. 
Dylan From fil at rezo.net Fri Oct 28 23:28:22 2005 From: fil at rezo.net (Fil) Date: Fri, 28 Oct 2005 23:28:22 +0200 Subject: [Mailman-Developers] Mysql MemberAdaptor 1.61 and Mailman 2.1.6 In-Reply-To: References: Message-ID: <20051028212821.GS9735@rezo.net> We are currently two on irc on the #mailman channel trying to install this patch on our respective systems. Could anyone drop by and/or tell us where to find the latest series of updated files? thanks in advance -- Fil From fil at rezo.net Sat Oct 29 01:35:36 2005 From: fil at rezo.net (Fil) Date: Sat, 29 Oct 2005 01:35:36 +0200 Subject: [Mailman-Developers] Mysql MemberAdaptor 1.61 and Mailman 2.1.6 In-Reply-To: <20051028212821.GS9735@rezo.net> References: <20051028212821.GS9735@rezo.net> Message-ID: <20051028233536.GW9735@rezo.net> There is an issue with the "flat" mode, when you create the db according to the README file and choose PRIMARY KEY (listname, address) MySQL brings up the following error : ERROR 1071: Specified key was too long. 
Max key length is 500

So in fact the initial definitions of

  address varchar(255) NOT NULL,
  listname varchar(255) NOT NULL,

do not work; you need to get under the 500 limit (I chose to keep 255 for
the address, and 100 for the listname).

-- Fil

From fil at rezo.net Sat Oct 29 01:49:47 2005
From: fil at rezo.net (Fil)
Date: Sat, 29 Oct 2005 01:49:47 +0200
Subject: [Mailman-Developers] Mysql MemberAdaptor 1.61 and Mailman 2.1.6
In-Reply-To: <20051028233536.GW9735@rezo.net>
References: <20051028212821.GS9735@rezo.net> <20051028233536.GW9735@rezo.net>
Message-ID: <20051028234947.GX9735@rezo.net>

Okay I found out how to install this stuff for just one list (test list):
use the extend.py mechanism, with lists/test/extend.py containing the
following lines:

>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
# import the MySQL stuff
from Mailman.MysqlMemberships import MysqlMemberships

# override the default for this list
def extend(mlist):
    mlist._memberadaptor = MysqlMemberships(mlist)
<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<

hope this helps

-- Fil

From fil at rezo.net Sat Oct 29 16:49:56 2005
From: fil at rezo.net (Fil)
Date: Sat, 29 Oct 2005 16:49:56 +0200
Subject: [Mailman-Developers] a patch to scale Cgi/admin.py
Message-ID: <20051029144955.GG27667@rezo.net>

I'm knee-deep in Mailman/Cgi/admin.py and it really doesn't scale.

I use a test-list of 300k addresses, and it takes a bit more than 5 minutes
to get it to answer (if the connection holds that long, of course).

It's particularly true when using the MySQLMemberAdaptor, where many things
are not taken from memory but are reprocessed with MySQL queries.
For instance, the part that checks if the member is regular/digest
fetches all the data for each subscriber (more plainly said, it's O(N^2)).

Another bottleneck is the list of chunks that is computed and displayed (and
sent to the client) - this list takes quite long to compute, and as a user
it's not that useful in general.
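
The regular/digest check mentioned above can be illustrated in isolation. The sketch below is not the admin.py code: CountingAdaptor and the two helper functions are hypothetical names, and the counter only stands in for "one database query per call". Converting the key list to a set is an extra assumption on top of fetching it once.

```python
class CountingAdaptor:
    """Hypothetical adaptor that counts how often the key list is fetched."""

    def __init__(self, regular):
        self._regular = list(regular)
        self.fetches = 0

    def getRegularMemberKeys(self):
        # Each call stands in for one full query against the database.
        self.fetches += 1
        return list(self._regular)


def digest_flags_slow(mlist, members):
    # Fetches the whole key list again for every row, then scans it linearly.
    return [addr not in mlist.getRegularMemberKeys() for addr in members]


def digest_flags_fast(mlist, members):
    # Fetch once, then use a set so each membership test is constant time.
    regular = set(mlist.getRegularMemberKeys())
    return [addr not in regular for addr in members]
```

Both functions return the same flags (True meaning "digest member"); the difference is one fetch instead of one per displayed address.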
Last but not least, the search facility calls the mysql-db for each member,
in order to extract her name and regexp it; and that takes very long. I wasn't
able to find how to speed this up, and just disabled it in my system (but
not in the patch provided below).

So here are a few small changes that make a radical improvement (down to 45
seconds from 4 minutes):

--- /home/fil/src_mailman/mailman/Mailman/Cgi/admin.py 2005-02-12 21:22:55.000000000 +0100
+++ Mailman/Cgi/admin.py 2005-10-29 16:43:56.116988176 +0200
@@ -876,6 +876,7 @@ def membership_options(mlist, subcat, cg
                 doc.addError(_('Bad regular expression: ') + regexp)
             else:
                 # BAW: There's got to be a more efficient way of doing this!
+                # yes please... this doesn't scale at all
                 names = [mlist.getMemberName(s) or '' for s in all]
                 all = [a for n, a in zip(names, all)
                        if cre.search(n) or cre.search(a)]
@@ -978,6 +979,8 @@ def membership_options(mlist, subcat, cg
         MemberAdaptor.BYADMIN : _('A'),
         MemberAdaptor.BYBOUNCE: _('B'),
         }
+    # memorize the regular-or-digest list
+    regular_or_digest = mlist.getRegularMemberKeys()
     # Now populate the rows
     for addr in members:
         link = Link(mlist.GetOptionsURL(addr, obscure=1),
@@ -1021,8 +1024,8 @@ def membership_options(mlist, subcat, cg
         # This code is less efficient than the original which did a has_key on
         # the underlying dictionary attribute. This version is slower and
         # less memory efficient. It points to a new MemberAdaptor interface
-        # method.
-        if addr in mlist.getRegularMemberKeys():
+        # method. (Modified by Fil to "cache" the result - useful for MySQLMemberAdaptor)
+        if addr in regular_or_digest:
             cells.append(Center(CheckBox(addr + '_digest', 'off', 0).Format()))
         else:
             cells.append(Center(CheckBox(addr + '_digest', 'on', 1).Format()))
@@ -1113,7 +1116,7 @@ def membership_options(mlist, subcat, cg
         range listed below:''')
         chunkmembers = buckets[bucket]
         last = len(chunkmembers)
-        for i in range(numchunks):
+        for i in range(min(10,numchunks)):
             if i == chunkindex:
                 continue
             start = chunkmembers[i*chunksz]

-- Fil

From jag at fsf.org Sun Oct 30 18:47:09 2005
From: jag at fsf.org (Joshua Ginsberg)
Date: Sun, 30 Oct 2005 12:47:09 -0500
Subject: [Mailman-Developers] a patch to scale Cgi/admin.py
In-Reply-To: <20051029144955.GG27667@rezo.net>
References: <20051029144955.GG27667@rezo.net>
Message-ID: <1130694429.27687.3.camel@localhost.localdomain>

What it sounds like you really want in order to minimize database I/O is
to implement an in-memory caching system on top of the various methods
of the MemberAdaptor. So you'd have per MySQLMemberAdaptor object a
dictionary keyed the same as the database table with dictionaries for
the various fields per subscriber. If there is a KeyError when trying to
access the dictionary, hit the database. If the database returns no
rows, then you raise NotAMemberError or return whatever may be
appropriate.

True, this would only be effective per connection or per post, but it
seems to be the most efficient means of maximizing scalability. YMMV.

-jag

On Sat, 2005-10-29 at 16:49 +0200, Fil wrote:
> I'm knee-deep in Mailman/Gui/admin.py and it really doesn't scale.
>
> I use a test-list of 300k addresses, and it's a bit more than 5 minutes to
> get it to answer (if the connection holds that long, of course).
>
> It's particularly true when using the MySQLMemberAdaptor, where many things
> are not taken from memory but are reprocessed with MySQL queries.
> For instance, the part that checks if the member is regular/digest fetches > fetches all the data for each subscriber (more plainly said, it's in N^2). > > Another bottleneck is the list of chunks that is computed and displayed (and > sent to the client) - his list is quite long to compute, and as a user it's > not that useful in general. > > Last but not least, the search facility calls the mysql-db for each member, > in order to extract her name and regexp it; and that's very long. Is wasn't > able to find how to speed this up, and just disabled it in my system (but > not in the patch provided below) > > So here are a few small changes, that make a radical improvement (down to 45 > seconds from 4 minutes): > > --- /home/fil/src_mailman/mailman/Mailman/Cgi/admin.py 2005-02-12 21:22:55.000000000 +0100 > +++ Mailman/Cgi/admin.py 2005-10-29 16:43:56.116988176 +0200 > @@ -876,6 +876,7 @@ def membership_options(mlist, subcat, cg > doc.addError(_('Bad regular expression: ') + regexp) > else: > # BAW: There's got to be a more efficient way of doing this! > + # yes please... this doesn't scale at all > names = [mlist.getMemberName(s) or '' for s in all] > all = [a for n, a in zip(names, all) > if cre.search(n) or cre.search(a)] > @@ -978,6 +979,8 @@ def membership_options(mlist, subcat, cg > MemberAdaptor.BYADMIN : _('A'), > MemberAdaptor.BYBOUNCE: _('B'), > } > + # memorize the regular-or-digest list > + regular_or_digest = mlist.getRegularMemberKeys() > # Now populate the rows > for addr in members: > link = Link(mlist.GetOptionsURL(addr, obscure=1), > @@ -1021,8 +1024,8 @@ def membership_options(mlist, subcat, cg > # This code is less efficient than the original which did a has_key on > # the underlying dictionary attribute. This version is slower and > # less memory efficient. It points to a new MemberAdaptor interface > - # method. > - if addr in mlist.getRegularMemberKeys(): > + # method. 
(Modified by Fil to "cache" the result - useful for MySQLMemberAdaptor) > + if addr in regular_or_digest: > cells.append(Center(CheckBox(addr + '_digest', 'off', 0).Format())) > else: > cells.append(Center(CheckBox(addr + '_digest', 'on', 1).Format())) > @@ -1113,7 +1116,7 @@ def membership_options(mlist, subcat, cg > range listed below:''') > chunkmembers = buckets[bucket] > last = len(chunkmembers) > - for i in range(numchunks): > + for i in range(min(10,numchunks)): > if i == chunkindex: > continue > start = chunkmembers[i*chunksz] > > > -- Fil > > _______________________________________________ > Mailman-Developers mailing list > Mailman-Developers at python.org > http://mail.python.org/mailman/listinfo/mailman-developers > Mailman FAQ: http://www.python.org/cgi-bin/faqw-mm.py > Searchable Archives: http://www.mail-archive.com/mailman-users%40python.org/ > Unsubscribe: http://mail.python.org/mailman/options/mailman-developers/jag%40fsf.org > > Security Policy: http://www.python.org/cgi-bin/faqw-mm.py?req=show&file=faq01.027.htp > -- Joshua Ginsberg Free Software Foundation - Senior Systems Administrator -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 189 bytes Desc: This is a digitally signed message part Url : http://mail.python.org/pipermail/mailman-developers/attachments/20051030/658dc4e8/attachment.pgp From fil at rezo.net Sun Oct 30 20:50:43 2005 From: fil at rezo.net (Fil) Date: Sun, 30 Oct 2005 20:50:43 +0100 Subject: [Mailman-Developers] a patch to scale Cgi/admin.py In-Reply-To: <1130694429.27687.3.camel@localhost.localdomain> References: <20051029144955.GG27667@rezo.net> <1130694429.27687.3.camel@localhost.localdomain> Message-ID: <20051030195043.GB26891@rezo.net> > What it sounds like you really want in order to minimize database I/O is > to implement an in-memory caching system on top of the various methods > of the MemberAdaptor. 
So you'd have per MySQLMemberAdaptor object a
> dictionary keyed the same as the database table with dictionaries for
> the various fields per subscriber. If there is a KeyError when trying to
> access the dictionary, hit the database. If the database returns no
> rows, then you raise NotAMemberError or return whatever may be
> appropriate.

I'd love to see it happen, but you also have to be careful (speaking of
scaling, not for my own uses which are served at < 300 k lists) not to
reach a memory limit. If your lists have 10 million subscribers (say), you
don't want to load the whole list in memory just to retrieve one address.
This is the DB's job (i.e. MySQL itself, or OldStyleMemberships.py when using
the usual db), not Mailman's job per se.

In parallel I have another idea that could be somehow faster for the members
page: instead of splitting the list into "buckets", and restricting the
display to a chunk of a bucket, just get rid of buckets, and chunk wherever
in the list. And, in order to get the functionality of "buckets" back, just
add the initial letters in the list of links to the different "chunks".

The links to a specific chunk would be styled as
   /members/?start=jane at doe.com
and the display test would be
   if ( addr >= start ) { prepare the display }

Not sure if my English makes sense, I'll just post the code when I'm done
pythonizing the idea. I don't even know how to compare two strings in
python, so it might take a little while :-D

Note that this would lose no functionality: it may even be a bit more useful
as UI for medium-sized lists of ~100 subscribers -- currently if your list
holds two addresses starting with "a", two with "b" and so on, you have to
check 26 pages of subscribers, whereas you really need just two (the first
40, and the last 12).
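
The "start" idea above can be sketched as follows. This is a guess at the shape of it, not the posted patch: page_from and chunk_link are hypothetical helpers, the address list is assumed to be sorted and lowercased already, and the code uses Python 3's urllib.parse (Mailman 2.1 itself would use urllib.urlencode). Strings compare with the ordinary operators, so `addr >= start` works as written; bisect just finds the first such address without scanning the whole list.

```python
from bisect import bisect_left
from urllib.parse import urlencode


def page_from(sorted_addrs, start, chunksz):
    """Return up to chunksz addresses that sort at or after start."""
    # Index of the first address with addr >= start.
    first = bisect_left(sorted_addrs, start.lower())
    return sorted_addrs[first:first + chunksz]


def chunk_link(adminurl, addr):
    """Build the URL of the page that starts at addr."""
    return adminurl + '/members?' + urlencode({'start': addr})
```

An empty start value sorts before every address, so the first page needs no special case.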
-- Fil

From fil at rezo.net Mon Oct 31 01:19:02 2005
From: fil at rezo.net (Fil)
Date: Mon, 31 Oct 2005 01:19:02 +0100
Subject: [Mailman-Developers] a patch to scale Cgi/admin.py
In-Reply-To: <20051030195043.GB26891@rezo.net>
References: <20051029144955.GG27667@rezo.net> <1130694429.27687.3.camel@localhost.localdomain> <20051030195043.GB26891@rezo.net>
Message-ID: <20051031001902.GC7918@rezo.net>

> The links to a specific chunk would be styled as
> /members/?start=jane at doe.com
>
> Not sure if my English makes sense, I'll just post the code when I'm done
> pythonizing the idea. I don't even know how to compare two strings in
> python, so it might take a little while :-D

Okay, now it's done -- it's just a functionality rewrite, nothing is lost
except a few lines of code :)

Enclosed is the patch + the patched file. (I'm not fully in sync with the
CVS as I got an error upgrading from 2.1.6b to 2.1-Maint)

If you want to try it, it's simple and can't do much harm, as it only
affects the Web GUI - you don't have to restart Mailman, just save
Mailman/Cgi/admin.py aside (in case), and replace it with this one.

Note that I also removed the annoying "language" menu when there's only one
language available.

(BTW Something I'd like to add is a 'title="jane at doe.com"' attribute in
the element, but I couldn't find how to do it.)

* * *

We'll still need to solve the "search" issue, but that will require much
more work, I think, as the best way to do it will be to implement a new
method in the memberadaptor; and that will need discussion, as there are two
options:

- 1) add a "getMembersMatching(regexp)" method (best, I think, as it can
  leverage foreign search methods, e.g. MySQL's "SELECT WHERE name LIKE %s")

- 2) add a "getMembersWithNames()" method (not so good, but for the sake
  of the discussion I include the idea here)

Please tell me which route to take, or I'll take Route 1.

-- Fil

-------------- next part --------------
A non-text attachment was scrubbed...
Name: mailman_Cgi_admin.py Type: text/x-python Size: 62819 bytes Desc: not available Url : http://mail.python.org/pipermail/mailman-developers/attachments/20051031/2137b527/mailman_Cgi_admin-0001.py -------------- next part -------------- --- Mailman/Cgi/admin.py 2005-02-12 21:22:55.000000000 +0100 +++ /var/local/mailman/Mailman/Cgi/admin.py 2005-10-31 00:57:05.000000000 +0100 @@ -28,6 +28,7 @@ import urllib import signal from types import * from string import lowercase, digits +from urllib import urlencode from email.Utils import unquote, parseaddr, formataddr @@ -47,7 +48,6 @@ _ = i18n._ i18n.set_language(mm_cfg.DEFAULT_SERVER_LANGUAGE) NL = '\n' -OPTCOLUMNS = 11 try: True, False @@ -876,93 +876,97 @@ def membership_options(mlist, subcat, cg doc.addError(_('Bad regular expression: ') + regexp) else: # BAW: There's got to be a more efficient way of doing this! + # yes please... this doesn't scale at all names = [mlist.getMemberName(s) or '' for s in all] all = [a for n, a in zip(names, all) if cre.search(n) or cre.search(a)] - chunkindex = None - bucket = None - actionurl = None + + # Do we want to display the language menu? + langs = mlist.GetAvailableLanguages() + if len(langs) > 1: + langdescs = [_(Utils.GetLanguageDescr(lang)) for lang in langs] + OPTCOLUMNS = 11 + else: + langs = None + OPTCOLUMNS = 10 + options = [_('unsub'), _('member address
member name'), + _('mod'), _('hide'), _('nomail
[reason]'), _('ack'), + _('not metoo'), _('nodupes'), _('digest'), _('plain')] + if langs: + options.append(_('language')) + + starters = [] + + # List all members that we want to display + # maybe everyone if they are not too many if len(all) < chunksz: members = all else: - # Split them up alphabetically, and then split the alphabetical - # listing by chunks - buckets = {} - for addr in all: - members = buckets.setdefault(addr[0].lower(), []) - members.append(addr) - # Now figure out which bucket we want - bucket = None - qs = {} - # POST methods, even if their actions have a query string, don't get - # put into FieldStorage's keys :-( + # Retrieve the start address + # BAW: POST methods, even if their actions have a query string, don't + # get put into FieldStorage's keys :-( + start = '' qsenviron = os.environ.get('QUERY_STRING') if qsenviron: qs = cgi.parse_qs(qsenviron) - bucket = qs.get('letter', 'a')[0].lower() - if bucket not in digits + lowercase: - bucket = None - if not bucket or not buckets.has_key(bucket): - keys = buckets.keys() - keys.sort() - bucket = keys[0] - members = buckets[bucket] - action = adminurl + '/members?letter=%s' % bucket - if len(members) <= chunksz: - form.set_action(action) - else: - i, r = divmod(len(members), chunksz) - numchunks = i + (not not r * 1) - # Now chunk them up - chunkindex = 0 - if qs.has_key('chunk'): - try: - chunkindex = int(qs['chunk'][0]) - except ValueError: - chunkindex = 0 - if chunkindex < 0 or chunkindex > numchunks: - chunkindex = 0 - members = members[chunkindex*chunksz:(chunkindex+1)*chunksz] - # And set the action URL - form.set_action(action + '&chunk=%s' % chunkindex) - # So now members holds all the addresses we're going to display - allcnt = len(all) - if bucket: - membercnt = len(members) - usertable.AddRow([Center(Italic(_( - '%(allcnt)s members total, %(membercnt)s shown')))]) - else: - usertable.AddRow([Center(Italic(_('%(allcnt)s members total')))]) - 
usertable.AddCellInfo(usertable.GetCurrentRowIndex(), + if qs.has_key('start'): + start = qs.get('start')[0].lower() + + # Show start links for every address that is either starting a bucket + # or, inside the current bucket, starting a chunk + num = 0 + numtaken = 0 + members = [] + if all: + bucket = all[0][0].upper() # first bucket, ignore + try: + currentbucket = start[0].upper() + except: + currentbucket = '' + + for addr in all: + # If the address changes of bucket, or is in the same bucket + # as the start parameter, and a multiple of chunksize addresses + # write a link to it + reason = '' + if addr[0].upper() != bucket: + bucket = addr[0].upper() + reason = '' + bucket + '' + num = 0 + elif num % chunksz == 0 and bucket == currentbucket: + reason = '' + str(num/chunksz) + '' + if reason: + if start == addr: + link = reason + else: + url = adminurl + '/members?' + urlencode({'start':addr}) + link = Link(url, reason).Format() + starters.append(link) + num += 1 + + # If the address is after the START value, take it for display + # but don't take more than the max chunksize + if addr.lower() >= start and numtaken < chunksz: + numtaken += 1 + members.append(addr) + + + # Add the start links + if starters: + joiner = ' ' + '\n' + usertable.AddRow([Center(joiner.join(starters))]) + usertable.AddCellInfo(usertable.GetCurrentRowIndex(), usertable.GetCurrentCellIndex(), colspan=OPTCOLUMNS, bgcolor=mm_cfg.WEB_ADMINITEM_COLOR) - # Add the alphabetical links - if bucket: - cells = [] - for letter in digits + lowercase: - if not buckets.get(letter): - continue - url = adminurl + '/members?letter=%s' % letter - if letter == bucket: - show = Bold('[%s]' % letter.upper()).Format() - else: - show = letter.upper() - cells.append(Link(url, show).Format()) - joiner = ' '*2 + '\n' - usertable.AddRow([Center(joiner.join(cells))]) + + # So now members holds all the addresses we're going to display + usertable.AddRow('') usertable.AddCellInfo(usertable.GetCurrentRowIndex(), 
usertable.GetCurrentCellIndex(), colspan=OPTCOLUMNS, bgcolor=mm_cfg.WEB_ADMINITEM_COLOR) - usertable.AddRow([Center(h) for h in (_('unsub'), - _('member address
member name'), - _('mod'), _('hide'), - _('nomail
[reason]'), - _('ack'), _('not metoo'), - _('nodupes'), - _('digest'), _('plain'), - _('language'))]) + usertable.AddRow([Center(h) for h in options]) rowindex = usertable.GetCurrentRowIndex() for i in range(OPTCOLUMNS): usertable.AddCellInfo(rowindex, i, bgcolor=mm_cfg.WEB_ADMINITEM_COLOR) @@ -978,6 +982,8 @@ def membership_options(mlist, subcat, cg MemberAdaptor.BYADMIN : _('A'), MemberAdaptor.BYBOUNCE: _('B'), } + # memorize the regular-or-digest list + regular_or_digest = mlist.getRegularMemberKeys() # Now populate the rows for addr in members: link = Link(mlist.GetOptionsURL(addr, obscure=1), @@ -1021,8 +1027,9 @@ def membership_options(mlist, subcat, cg # This code is less efficient than the original which did a has_key on # the underlying dictionary attribute. This version is slower and # less memory efficient. It points to a new MemberAdaptor interface - # method. - if addr in mlist.getRegularMemberKeys(): + # method. (Modified by Fil to "cache" the result - useful for + # MySQLMemberAdaptor) + if addr in regular_or_digest: cells.append(Center(CheckBox(addr + '_digest', 'off', 0).Format())) else: cells.append(Center(CheckBox(addr + '_digest', 'on', 1).Format())) @@ -1035,13 +1042,12 @@ def membership_options(mlist, subcat, cg cells.append(Center(CheckBox('%s_plain' % addr, value, checked))) # User's preferred language langpref = mlist.getMemberLanguage(addr) - langs = mlist.GetAvailableLanguages() - langdescs = [_(Utils.GetLanguageDescr(lang)) for lang in langs] - try: - selected = langs.index(langpref) - except ValueError: - selected = 0 - cells.append(Center(SelectOptions(addr + '_language', langs, + if langs: + try: + selected = langs.index(langpref) + except ValueError: + selected = 0 + cells.append(Center(SelectOptions(addr + '_language', langs, langdescs, selected)).Format()) usertable.AddRow(cells) # Add the usertable and a legend @@ -1105,23 +1111,6 @@ def membership_options(mlist, subcat, cg _('Click here to include the legend for this table.'))) 
container.AddItem(Center(usertable)) - # There may be additional chunks - if chunkindex is not None: - buttons = [] - url = adminurl + '/members?%sletter=%s&' % (addlegend, bucket) - footer = _('''

To view more members, click on the appropriate - range listed below:''') - chunkmembers = buckets[bucket] - last = len(chunkmembers) - for i in range(numchunks): - if i == chunkindex: - continue - start = chunkmembers[i*chunksz] - end = chunkmembers[min((i+1)*chunksz, last)-1] - link = Link(url + 'chunk=%d' % i, _('from %(start)s to %(end)s')) - buttons.append(link) - buttons = UnorderedList(*buttons) - container.AddItem(footer + buttons.Format() + '

')
     return container

From fil at rezo.net Mon Oct 31 02:49:47 2005
From: fil at rezo.net (Fil)
Date: Mon, 31 Oct 2005 02:49:47 +0100
Subject: [Mailman-Developers] a patch to scale Cgi/admin.py
In-Reply-To: <20051031001902.GC7918@rezo.net>
References: <20051029144955.GG27667@rezo.net> <1130694429.27687.3.camel@localhost.localdomain> <20051030195043.GB26891@rezo.net> <20051031001902.GC7918@rezo.net>
Message-ID: <20051031014947.GE7918@rezo.net>

> - 1) add a "getMembersMatching(regexp) method (best, I think, as it can
> leverage foreign search methods, i.e. MySQL's "SELECT WHERE name LIKE %s")

I have implemented this for the MySQLMemberAdaptor, and it's a fabulous
speed improvement.

My question to Barry would be now: do I need to make this a compulsory
method for the MemberAdaptor class (and declare this function in
MemberAdaptor.py), like:

    def getMembersMatching(self, regexp):
        """Get all the members who match regexp"""
        raise NotImplementedError

or is it enough to just "fall back" to the previous algorithm in case this
method doesn't exist (and then I need to patch only admin.py and
MySQLMemberAdaptor, which I have done)?

-- Fil

From fil at rezo.net Mon Oct 31 16:49:53 2005
From: fil at rezo.net (Fil)
Date: Mon, 31 Oct 2005 16:49:53 +0100
Subject: [Mailman-Developers] an issue with MySQLMemberAdaptor
Message-ID: <20051031154953.GX7918@rezo.net>

Hi,

I'm continuing my testing of the MySQLMemberAdaptor, and I found out that,
if the database is down when someone wants to subscribe (or confirm a
subscription), for example, the messages are "shunted".

This could be a problem when doing maintenance. It would be nice if Mailman
could forward the shunted messages to the listserv administrator, so she can
decide what to do with them (i.e., reinstate the connection, and unshunt the
files -- not just forget that the shunt directory exists).

-- Fil
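
On the getMembersMatching question raised earlier in this thread, the "fall back" option might look roughly like the sketch below. It is only an illustration under stated assumptions: getMembersMatching is the proposed (not yet existing) adaptor method, members_matching and PlainList are hypothetical names, and the case-insensitive match is assumed to mirror what the existing search in admin.py does.

```python
import re


def members_matching(mlist, regexp):
    """Use the adaptor's own search if it has one, else scan in Python."""
    search = getattr(mlist, 'getMembersMatching', None)
    if search is not None:
        try:
            return search(regexp)
        except NotImplementedError:
            # Declared in the base class but not overridden: fall through.
            pass
    cre = re.compile(regexp, re.IGNORECASE)
    return [addr for addr in mlist.getMembers()
            if cre.search(mlist.getMemberName(addr) or '') or cre.search(addr)]


class PlainList:
    """Hypothetical adaptor that lacks the new method."""

    def __init__(self, names):
        self._names = names  # maps address -> real name (or None)

    def getMembers(self):
        return sorted(self._names)

    def getMemberName(self, addr):
        return self._names.get(addr)
```

Written this way, the caller behaves the same whether the method is missing entirely or is declared in MemberAdaptor.py and left raising NotImplementedError, so either answer to the question would work with it.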