From brad.knowles at skynet.be Wed Oct 1 16:02:11 2003 From: brad.knowles at skynet.be (Brad Knowles) Date: Wed Oct 1 16:06:58 2003 Subject: [Mailman-Developers] MailMan Archives and more cPanel Control In-Reply-To: <003501c38795$ee5bb410$1c0d1f0a@us.oracle.com> References: <003501c38795$ee5bb410$1c0d1f0a@us.oracle.com> Message-ID: At 5:00 PM -0400 2003/09/30, Gregory A. Clark wrote: > Is there any development initiative to allow more control of the > MailMail archiving via cPanel? Ask the cPanel people. This is a commercial tool that they have grafted onto mailman, and so far they don't seem to have demonstrated any interest in getting any help from any mailman-knowledgeable people in fixing their problems. > Is there any current way I can suppress > these, given my obvious lack of system privileges? Not likely. -- Brad Knowles, "They that can give up essential liberty to obtain a little temporary safety deserve neither liberty nor safety." -Benjamin Franklin, Historical Review of Pennsylvania. GCS/IT d+(-) s:+(++)>: a C++(+++)$ UMBSHI++++$ P+>++ L+ !E-(---) W+++(--) N+ !w--- O- M++ V PS++(+++) PE- Y+(++) PGP>+++ t+(+++) 5++(+++) X++(+++) R+(+++) tv+(+++) b+(++++) DI+(++++) D+(++) G+(++++) e++>++++ h--- r---(+++)* z(+++) From bernhard at intevation.de Thu Oct 2 03:22:19 2003 From: bernhard at intevation.de (Bernhard Reiter) Date: Thu Oct 2 03:22:27 2003 Subject: [Mailman-Developers] Bug priorities (for 2.1.3)? Message-ID: <20031002072219.GA7633@intevation.de> Congratulation for getting 2.1.3 out of the door! I have a question and a remark regarding the handling of bugs: * What attributes do bugs need to get into the next release when a fix is available. As announced on the 14th of August to this list I fixed a priority 7 bug and did not receive feedback on it. Is there a special reason why it wasn't included in the 2.1.3 release? [ 665732 ] List-Id should be one line. 
* There is a very annoying bug (reproduced with 2.1.2, 2.1.3 test scheduled) in environments where people sign their emails, because Mailman breaks the signature. [ 815297 ] Breaking signatures in message/rfc822 attachement! http://sourceforge.net/tracker/?func=detail&aid=815297&group_id=103&atid=100103 I suggest raising the priority of this bug, because it might give Mailman a bad reputation with security aware people. And once established such an image is hard to loose again. Cheers, Bernhard -- Professional Service for Free Software (intevation.net) The FreeGIS Project (freegis.org) Association for a Free Informational Infrastructure (ffii.org) FSF Europe (fsfeurope.org) -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 189 bytes Desc: not available Url : http://mail.python.org/pipermail/mailman-developers/attachments/20031002/ccedf8eb/attachment.bin From barry at python.org Thu Oct 2 07:50:16 2003 From: barry at python.org (Barry Warsaw) Date: Thu Oct 2 07:50:21 2003 Subject: [Mailman-Developers] Bug priorities (for 2.1.3)? In-Reply-To: <20031002072219.GA7633@intevation.de> References: <20031002072219.GA7633@intevation.de> Message-ID: <1065095416.21561.90.camel@anthem> On Thu, 2003-10-02 at 03:22, Bernhard Reiter wrote: > Congratulation for getting 2.1.3 out of the door! Thanks. Please understand that I no longer get to work on Mailman as part of my day job, so I fit it in between everything else that's going on. More than anything else, that drives what gets into patch releases and what doesn't. As I've mentioned before, I would like to find more time to work on the new version (be that 2.2 or 3.0 or both). I would love to find some one or a small group of people who would be willing and able to more or less own the 2.1 maintenance branch. 
Barring that, a rich uncle, winning lottery ticket, or a slightly nuts VC with bulging pockets who likes to smoke $100 bills would probably help things too. :) > * What attributes do bugs need to get into the next release > when a fix is available. As announced on the 14th of August to this > list I fixed a priority 7 bug and did not receive feedback on it. > Is there a special reason why it wasn't included in the 2.1.3 release? Lack of time is the only answer, and it's not a good one. The deal is that there were enough fixes in CVS, and enough time had gone by since 2.1.2 that I felt a new release was warranted. -Barry From bob at nleaudio.com Thu Oct 2 09:47:54 2003 From: bob at nleaudio.com (Bob Puff@NLE) Date: Thu Oct 2 09:47:58 2003 Subject: [Mailman-Developers] Bug priorities (for 2.1.3)? In-Reply-To: <20031002072219.GA7633@intevation.de> References: <20031002072219.GA7633@intevation.de> Message-ID: <20031002134754.M21639@nleaudio.com> Where is your fix for this bug? ---------- Original Message ----------- From: Bernhard Reiter To: mailman-developers@python.org Sent: Thu, 2 Oct 2003 09:22:19 +0200 Subject: [Mailman-Developers] Bug priorities (for 2.1.3)? > Congratulation for getting 2.1.3 out of the door! > > I have a question and a remark regarding the handling of bugs: > > * What attributes do bugs need to get into the next release > when a fix is available. As announced on the 14th of August to this > list I fixed a priority 7 bug and did not receive feedback on it. > Is there a special reason why it wasn't included in the 2.1.3 release? > [ 665732 ] List-Id should be one line. > > * There is a very annoying bug (reproduced with 2.1.2, 2.1.3 test > scheduled) in environments where people sign their emails, > because Mailman breaks the signature. > > [ 815297 ] Breaking signatures in message/rfc822 attachement! 
> http://sourceforge.net/tracker/?func=detail&aid=815297&group_id=103&atid=100103 > > I suggest raising the priority of this bug, > because it might give Mailman a bad reputation with > security aware people. > And once established such an image is hard to loose again. > > Cheers, > Bernhard > > -- > Professional Service for Free Software > (intevation.net) The FreeGIS Project > (freegis.org) Association for a Free Informational > Infrastructure (ffii.org) FSF Europe > (fsfeurope.org) ------- End of Original Message ------- From bernhard at intevation.de Thu Oct 2 11:48:34 2003 From: bernhard at intevation.de (Bernhard Reiter) Date: Thu Oct 2 11:48:40 2003 Subject: [Mailman-Developers] Bug priorities (for 2.1.3)? In-Reply-To: <20031002134754.M21639@nleaudio.com> References: <20031002072219.GA7633@intevation.de> <20031002134754.M21639@nleaudio.com> Message-ID: <20031002154834.GG14976@intevation.de> On Thu, Oct 02, 2003 at 09:47:54AM -0400, Bob Puff@NLE wrote: > Where is your fix for this bug? I've mentioned two bugs in my email you've quoted below. One [ 665732 ] is fixed, the other [ 815297 ] is not. The patch for the former is attached to the bug report. (Can't verify as sf is currently down for maintenance.) > ---------- Original Message ----------- > From: Bernhard Reiter > To: mailman-developers@python.org > Sent: Thu, 2 Oct 2003 09:22:19 +0200 > Subject: [Mailman-Developers] Bug priorities (for 2.1.3)? > > I have a question and a remark regarding the handling of bugs: > > > > * What attributes do bugs need to get into the next release > > when a fix is available. As announced on the 14th of August to this > > list I fixed a priority 7 bug and did not receive feedback on it. > > Is there a special reason why it wasn't included in the 2.1.3 release? > > [ 665732 ] List-Id should be one line. 
> > > > * There is a very annoying bug (reproduced with 2.1.2, 2.1.3 test > > scheduled) in environments where people sign their emails, > > because Mailman breaks the signature. > > > > [ 815297 ] Breaking signatures in message/rfc822 attachement! > > http://sourceforge.net/tracker/?func=detail&aid=815297&group_id=103&atid=100103 > > > > I suggest raising the priority of this bug, > > because it might give Mailman a bad reputation with > > security aware people. > > And once established such an image is hard to loose again. -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 189 bytes Desc: not available Url : http://mail.python.org/pipermail/mailman-developers/attachments/20031002/176f7210/attachment.bin From bernhard at intevation.de Thu Oct 2 11:53:01 2003 From: bernhard at intevation.de (Bernhard Reiter) Date: Thu Oct 2 11:53:06 2003 Subject: [Mailman-Developers] Bug priorities (for 2.1.3)? In-Reply-To: <1065095416.21561.90.camel@anthem> References: <20031002072219.GA7633@intevation.de> <1065095416.21561.90.camel@anthem> Message-ID: <20031002155301.GH14976@intevation.de> On Thu, Oct 02, 2003 at 07:50:16AM -0400, Barry Warsaw wrote: > On Thu, 2003-10-02 at 03:22, Bernhard Reiter wrote: > Please understand that I no longer get to work on Mailman as > part of my day job, so I fit it in between everything else that's going > on. More than anything else, that drives what gets into patch releases > and what doesn't. That is very understandable. I didn't know that your situation has changed in that regard. Maybe it should be noted more clearly on the Mailman and your personal pages. Somebody could get the impression that it is your job to do Mailman from your personal page. > As I've mentioned before, I would like to find more time to work on the > new version (be that 2.2 or 3.0 or both). 
I would love to find some one > or a small group of people who would be willing and able to more or less > own the 2.1 maintenance branch. That should also be annouced more clearly I believe, maybe somebody will come up then. As time tells I've done my small bit to help Mailman development, but naturally I'm in no position to take over the stable maintenance. > > * What attributes do bugs need to get into the next release > > when a fix is available. As announced on the 14th of August to this > > list I fixed a priority 7 bug and did not receive feedback on it. > > Is there a special reason why it wasn't included in the 2.1.3 release? > > Lack of time is the only answer, and it's not a good one. The deal is > that there were enough fixes in CVS, and enough time had gone by since > 2.1.2 that I felt a new release was warranted. Yes, it is always better to release what you have as opposed to not release. I hope that those two bugs are scheduled for the next release then... :) Bernhard -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 189 bytes Desc: not available Url : http://mail.python.org/pipermail/mailman-developers/attachments/20031002/51813b1f/attachment-0001.bin From mss at mawhrin.net Mon Oct 6 14:25:02 2003 From: mss at mawhrin.net (Mikhail Sobolev) Date: Mon Oct 6 14:25:06 2003 Subject: [Mailman-Developers] Strange behaviour Message-ID: <20031006182502.GA6383@mawhrin.net> One of the Russian speaking users reported that the welcome message he received, is badly formatted. The subscribeack.txt file contains: In the resulting message it looks like So it reformats the text, and does not put a space where necessary. The reported behaviour is for mailman 2.1.2. I wonder whether this was already reported? -- Misha -------------- next part -------------- A non-text attachment was scrubbed... 
Name: not available Type: application/pgp-signature Size: 189 bytes Desc: Digital signature Url : http://mail.python.org/pipermail/mailman-developers/attachments/20031006/7ebf7a95/attachment.bin From brad.knowles at skynet.be Mon Oct 6 14:46:20 2003 From: brad.knowles at skynet.be (Brad Knowles) Date: Mon Oct 6 14:57:04 2003 Subject: [Mailman-Developers] Strange behaviour In-Reply-To: <20031006182502.GA6383@mawhrin.net> References: <20031006182502.GA6383@mawhrin.net> Message-ID: At 7:25 PM +0100 2003/10/06, Mikhail Sobolev wrote: > So it reformats the text, and does not put a space where necessary. The > reported behaviour is for mailman 2.1.2. I wonder whether this was > already reported? I've seen the same sort of thing with a modified postheld.txt or refuse.txt. If you put leading white space at the beginning of the next line, I have found that the system will not re-wrap your text. -- Brad Knowles, "They that can give up essential liberty to obtain a little temporary safety deserve neither liberty nor safety." -Benjamin Franklin, Historical Review of Pennsylvania. GCS/IT d+(-) s:+(++)>: a C++(+++)$ UMBSHI++++$ P+>++ L+ !E-(---) W+++(--) N+ !w--- O- M++ V PS++(+++) PE- Y+(++) PGP>+++ t+(+++) 5++(+++) X++(+++) R+(+++) tv+(+++) b+(++++) DI+(++++) D+(++) G+(++++) e++>++++ h--- r---(+++)* z(+++) From mss at mawhrin.net Mon Oct 6 15:01:18 2003 From: mss at mawhrin.net (Mikhail Sobolev) Date: Mon Oct 6 15:01:22 2003 Subject: [Mailman-Developers] Strange behaviour In-Reply-To: References: <20031006182502.GA6383@mawhrin.net> Message-ID: <20031006190118.GA6744@mawhrin.net> On Mon, Oct 06, 2003 at 08:46:20PM +0200, Brad Knowles wrote: > > So it reformats the text, and does not put a space where necessary. The > > reported behaviour is for mailman 2.1.2. I wonder whether this was > > already reported? > > I've seen the same sort of thing with a modified postheld.txt or > refuse.txt. 
If you put leading white space at the beginning of the > next line, I have found that the system will not re-wrap your text. This sounds like a hack. And that means that the format of the translated files is going to differ from the english templates, which, to my mind, is not very good. Thanks for the idea though. Barry, probably, it's a bug in Utils.py/wrap function? Or it is an expected behaviour? -- Misha -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 189 bytes Desc: Digital signature Url : http://mail.python.org/pipermail/mailman-developers/attachments/20031006/ac7c04e6/attachment.bin From pioppo at ferrara.linux.it Mon Oct 6 15:50:23 2003 From: pioppo at ferrara.linux.it (Simone Piunno) Date: Mon Oct 6 15:42:11 2003 Subject: [Mailman-Developers] Strange behaviour In-Reply-To: <20031006190118.GA6744@mawhrin.net> References: <20031006182502.GA6383@mawhrin.net> <20031006190118.GA6744@mawhrin.net> Message-ID: <200310062150.23648.pioppo@ferrara.linux.it> On Monday 06 October 2003 21:01, Mikhail Sobolev wrote: > > > So it reformats the text, and does not put a space where necessary. > Barry, probably, it's a bug in Utils.py/wrap function? Or it is an > expected behaviour? Actually reflowing the text is expected behaviour (e.g. a feature instead of a bug) but there should be a space. Also, not reflowing rows beginning with one or more spaces is another expected behaviour (so that it's easy to mark preformatted text). In my experience as a translator, missing spaces always happened because I forgot that space while translating or because even the original english text had the problem. So, if you can see the problem only happens in russian, I'd start searching for that string in the relevant .po file and verifying everything is ok. 
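A minimal sketch of the reflow rule described above, for readers who want to experiment. This is not Mailman's wrap() code; the function name reflow is hypothetical, and it assumes only what the thread states: flush-left lines in a paragraph are joined with a single space, and lines beginning with whitespace are left alone as preformatted text. It does not re-wrap to a column width.

```python
def reflow(text):
    """Join flush-left lines with single spaces (illustrative only).

    Blank lines end a paragraph. Lines that start with a space or tab
    are treated as preformatted and copied through unchanged.
    """
    out = []    # finished output lines
    para = []   # lines of the paragraph currently being collected

    def flush():
        # Emit the collected paragraph as one line, joined by spaces.
        if para:
            out.append(" ".join(para))
            del para[:]

    for line in text.splitlines():
        if not line.strip():
            flush()
            out.append("")          # keep the paragraph break
        elif line[0] in " \t":
            flush()
            out.append(line)        # preformatted: do not reflow
        else:
            para.append(line.strip())
    flush()
    return "\n".join(out) + "\n"


print(repr(reflow("Blah blah blah\noops oops oops\n")))
# 'Blah blah blah oops oops oops\n'
```

If a template line loses its separating space under a rule like this, the joining step is the first place to look.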
-- Adde parvum parvo magnus acervus erit -- Ovidio From mss at mawhrin.net Mon Oct 6 15:48:10 2003 From: mss at mawhrin.net (Mikhail Sobolev) Date: Mon Oct 6 15:48:15 2003 Subject: [Mailman-Developers] Strange behaviour In-Reply-To: <200310062150.23648.pioppo@ferrara.linux.it> References: <20031006182502.GA6383@mawhrin.net> <20031006190118.GA6744@mawhrin.net> <200310062150.23648.pioppo@ferrara.linux.it> Message-ID: <20031006194810.GA7000@mawhrin.net> On Mon, Oct 06, 2003 at 09:50:23PM +0200, Simone Piunno wrote: > > > > So it reformats the text, and does not put a space where necessary. > > > Barry, probably, it's a bug in Utils.py/wrap function? Or it is an > > expected behaviour? > > Actually reflowing the text is expected behaviour (e.g. a feature instead of a > bug) but there should be a space. Also, not reflowing rows beginning with > one or more spaces is another expected behaviour (so that it's easy to mark > preformatted text). That's what I gathered while reading the source for wrap function. > In my experience as a translator, missing spaces always happened because I > forgot that space while translating or because even the original english text > had the problem. So, if you can see the problem only happens in russian, I'd > start searching for that string in the relevant .po file and verifying > everything is ok. It's not in a .po file. It's templates/ru/subscribeack.txt file (and probably others). As for the missing spaces, as I said while reformatting the newline was not replaced with a space, while this seems to be expected. -- Misha (who seeks some forces to understand how wrap function works) -------------- next part -------------- A non-text attachment was scrubbed... 
Name: not available
Type: application/pgp-signature
Size: 189 bytes
Desc: Digital signature
Url : http://mail.python.org/pipermail/mailman-developers/attachments/20031006/cf129ed2/attachment.bin

From barry at python.org Tue Oct 7 17:23:29 2003
From: barry at python.org (Barry Warsaw)
Date: Tue Oct 7 17:23:36 2003
Subject: [Mailman-Developers] Strange behaviour
In-Reply-To: <20031006194810.GA7000@mawhrin.net>
References: <20031006182502.GA6383@mawhrin.net> <20031006190118.GA6744@mawhrin.net> <200310062150.23648.pioppo@ferrara.linux.it> <20031006194810.GA7000@mawhrin.net>
Message-ID: <1065561808.18519.3.camel@anthem>

On Mon, 2003-10-06 at 15:48, Mikhail Sobolev wrote:

> > In my experience as a translator, missing spaces always happened because I
> > forgot that space while translating or because even the original english text
> > had the problem. So, if you can see the problem only happens in russian, I'd
> > start searching for that string in the relevant .po file and verifying
> > everything is ok.
> It's not in a .po file. It's templates/ru/subscribeack.txt file (and
> probably others).
>
> As for the missing spaces, as I said while reformatting the newline was
> not replaced with a space, while this seems to be expected.

As Simone says, I think what you're seeing is expected behavior from
wrap(). And newlines are supposed to be replaced by spaces, as in:

Python 2.3.2 (#1, Oct 3 2003, 08:18:26)
[GCC 3.2.2 20030222 (Red Hat Linux 3.2.2-5)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> from Mailman.Utils import wrap
>>> wrap("""This is some leading text
... that will get wrapped
... and reflowed.
... """)
'This is some leading text that will get wrapped and reflowed.\n'

-Barry

From mss at mawhrin.net Tue Oct 7 17:47:41 2003
From: mss at mawhrin.net (Mikhail Sobolev)
Date: Tue Oct 7 17:47:46 2003
Subject: [Mailman-Developers] Strange behaviour
In-Reply-To: <1065561808.18519.3.camel@anthem>
References: <20031006182502.GA6383@mawhrin.net> <20031006190118.GA6744@mawhrin.net> <200310062150.23648.pioppo@ferrara.linux.it> <20031006194810.GA7000@mawhrin.net> <1065561808.18519.3.camel@anthem>
Message-ID: <20031007214741.GA13614@mawhrin.net>

On Tue, Oct 07, 2003 at 05:23:29PM -0400, Barry Warsaw wrote:
> On Mon, 2003-10-06 at 15:48, Mikhail Sobolev wrote:
>
> > > In my experience as a translator, missing spaces always happened because I
> > > forgot that space while translating or because even the original english text
> > > had the problem. So, if you can see the problem only happens in russian, I'd
> > > start searching for that string in the relevant .po file and verifying
> > > everything is ok.
> > It's not in a .po file. It's templates/ru/subscribeack.txt file (and
> > probably others).
> >
> > As for the missing spaces, as I said while reformatting the newline was
> > not replaced with a space, while this seems to be expected.
>
> As Simone says, I think what you're seeing is expected behavior from
> wrap(). And newlines are supposed to be replaced by spaces, as in:

I am sorry, but I do not really see this as an expected behaviour. As I
said the newlines are replaced with nothing.

I have the text:

Blah blah blah
oops oops oops

And after the reformatting I expect:

Blah blah blah oops oops oops

while I get

Blah blah blahoops oops oops

--
Misha
-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: application/pgp-signature
Size: 189 bytes
Desc: Digital signature
Url : http://mail.python.org/pipermail/mailman-developers/attachments/20031007/0f13d21a/attachment.bin

From barry at python.org Tue Oct 7 17:55:45 2003
From: barry at python.org (Barry Warsaw)
Date: Tue Oct 7 17:55:57 2003
Subject: [Mailman-Developers] Strange behaviour
In-Reply-To: <20031007214741.GA13614@mawhrin.net>
References: <20031006182502.GA6383@mawhrin.net> <20031006190118.GA6744@mawhrin.net> <200310062150.23648.pioppo@ferrara.linux.it> <20031006194810.GA7000@mawhrin.net> <1065561808.18519.3.camel@anthem> <20031007214741.GA13614@mawhrin.net>
Message-ID: <1065563745.18519.20.camel@anthem>

On Tue, 2003-10-07 at 17:47, Mikhail Sobolev wrote:

> I am sorry, but I do not really see this as an expected behaviour. As I
> said the newlines are replaced with nothing.
>
> I have the text:
>
> Blah blah blah
> oops oops oops
>
> And after the reformatting I expect:
>
> Blah blah blah oops oops oops
>
> while I get
>
> Blah blah blahoops oops oops

Yes, the fact that the newline is getting eaten is not expected
behavior. Here's a sample session; you can try to reproduce this with
some real text by cd'ing into /usr/local/mailman and starting python
like so:

% cd /usr/local/mailman
% python
Python 2.3.2 (#1, Oct 3 2003, 08:18:26)
[GCC 3.2.2 20030222 (Red Hat Linux 3.2.2-5)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> from Mailman.Utils import wrap
>>> wrap("""\
... blah blah blah
... oops oops oops
... """)
'blah blah blah oops oops oops\n'

What do you get?
-Barry

From mss at mawhrin.net Tue Oct 7 18:23:23 2003
From: mss at mawhrin.net (Mikhail Sobolev)
Date: Tue Oct 7 18:23:27 2003
Subject: [Mailman-Developers] Strange behaviour
In-Reply-To: <1065563745.18519.20.camel@anthem>
References: <20031006182502.GA6383@mawhrin.net> <20031006190118.GA6744@mawhrin.net> <200310062150.23648.pioppo@ferrara.linux.it> <20031006194810.GA7000@mawhrin.net> <1065561808.18519.3.camel@anthem> <20031007214741.GA13614@mawhrin.net> <1065563745.18519.20.camel@anthem>
Message-ID: <20031007222323.GA13931@mawhrin.net>

Barry,

I get the correct information. I even tried to remove myself from the
mailing list and subscribe back (as the original message suggested it
was the list hosted on my computer), however I could not get that
erroneous message. I'll investigate further and come back with a better
understanding of what is wrong and when it happens.

--
Misha

On Tue, Oct 07, 2003 at 05:55:45PM -0400, Barry Warsaw wrote:
> On Tue, 2003-10-07 at 17:47, Mikhail Sobolev wrote:
>
> > I am sorry, but I do not really see this as an expected behaviour. As I
> > said the newlines are replaced with nothing.
> >
> > I have the text:
> >
> > Blah blah blah
> > oops oops oops
> >
> > And after the reformatting I expect:
> >
> > Blah blah blah oops oops oops
> >
> > while I get
> >
> > Blah blah blahoops oops oops
>
> Yes, the fact that the newline is getting eaten is not expected
> behavior. Here's a sample session; you can try to reproduce this with
> some real text by cd'ing into /usr/local/mailman and starting python
> like so:
>
> % cd /usr/local/mailman
> % python
> Python 2.3.2 (#1, Oct 3 2003, 08:18:26)
> [GCC 3.2.2 20030222 (Red Hat Linux 3.2.2-5)] on linux2
> Type "help", "copyright", "credits" or "license" for more information.
> >>> from Mailman.Utils import wrap
> >>> wrap("""\
> ... blah blah blah
> ... oops oops oops
> ... """)
> 'blah blah blah oops oops oops\n'
>
> What do you get?
> -Barry > > -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 189 bytes Desc: Digital signature Url : http://mail.python.org/pipermail/mailman-developers/attachments/20031007/b64a782f/attachment.bin From barry at python.org Tue Oct 7 18:28:24 2003 From: barry at python.org (Barry Warsaw) Date: Tue Oct 7 18:28:31 2003 Subject: [Mailman-Developers] Strange behaviour In-Reply-To: <20031007222323.GA13931@mawhrin.net> References: <20031006182502.GA6383@mawhrin.net> <20031006190118.GA6744@mawhrin.net> <200310062150.23648.pioppo@ferrara.linux.it> <20031006194810.GA7000@mawhrin.net> <1065561808.18519.3.camel@anthem> <20031007214741.GA13614@mawhrin.net> <1065563745.18519.20.camel@anthem> <20031007222323.GA13931@mawhrin.net> Message-ID: <1065565703.18519.24.camel@anthem> On Tue, 2003-10-07 at 18:23, Mikhail Sobolev wrote: > Barry, > > I get the correct information. I even tried to remove myself from the > mailing list and subscribe back (as the original message suggested it > was the list hosted on my computer), however I could not get that > errornous message. I'll investigate better and return with better > understanding what is wrong and when it happens. > > -- > Misha Okay, thanks Misha. -Barry From barry at python.org Tue Oct 7 18:36:47 2003 From: barry at python.org (Barry Warsaw) Date: Tue Oct 7 18:46:16 2003 Subject: [Mailman-Developers] Terri Oda's List Member Manual now available Message-ID: <1065566207.18519.28.camel@anthem> Terri Oda has written a nice manual for Mailman list members, and I've finally pushed it out to the web site (and mirrors). It's available both on-line and in PDF format. You can access the manual at: http://www.list.org/users.html http://mailman.sf.net/users.html http://www.gnu.org/software/mailman/users.html (As usual, the GNU mirror has a delay and is not yet up-to-date.) Thanks Terri! 
-Barry

From mlucas at rice.edu Wed Oct 8 10:59:43 2003
From: mlucas at rice.edu (Mike Lucas)
Date: Wed Oct 8 10:57:54 2003
Subject: [Mailman-Developers] A couple of questions.
Message-ID: <3F84265F.4070601@rice.edu>

I was wondering if anyone can help me with a couple questions.

1. Is it possible to turn off the email interface to mailman? We want
to make them use the web interface only, if that is possible.

2. How does mailman store stateful information like auth credentials and
passwords?

3. Can we send the archives to a remote machine without using NFS?

Thanks in advance,

Mike

From barry at python.org Wed Oct 8 11:09:56 2003
From: barry at python.org (Barry Warsaw)
Date: Wed Oct 8 11:09:55 2003
Subject: [Mailman-Developers] A couple of questions.
In-Reply-To: <3F84265F.4070601@rice.edu>
References: <3F84265F.4070601@rice.edu>
Message-ID: <1065625796.2060.12.camel@geddy>

On Wed, 2003-10-08 at 10:59, Mike Lucas wrote:
> I was wondering if anyone can help me with a couple questions.
>
> 1. Is it possible to turn off the email interface to mailman? We want
> to make them use the web interface only, if that is possible.

You have the source, so anything's possible. :) What you need to do
depends on questions like:

- do you want to turn off the entire email robot interface, including
  the leave and subscribe addresses?
- do you want to turn it off for all your lists, or just some of them?

There's no easy way to do this, but with a little hacking it's possible.
Be careful not to turn off the bounce processor, or the -confirm address
(well, maybe you want to turn that off too and just confirm via the
web). Things to look at include the CommandRunner which handles all
email robot commands, and the various aliases that normally get added.

> 2. How does mailman store stateful information like auth credentials and
> passwords?

Everything's stored in a Python pickle, specifically
lists/yourlist/config.pck.
Admin passwords are stored as sha1 hashes, user passwords are
stored in clear text (likely to change in the next release). The site
password and list creator's password are stored in separate files, also
hashed.

> 3. Can we send the archives to a remote machine without using NFS?

This isn't a supported configuration, but you can hook into the external
archiver support to create your own mechanism for sending archivable
messages to a remote process. See Defaults.py for details.

HTH,
-Barry
-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: application/pgp-signature
Size: 307 bytes
Desc: This is a digitally signed message part
Url : http://mail.python.org/pipermail/mailman-developers/attachments/20031008/38647c26/attachment.bin

From jeffh at gryphongardens.com Wed Oct 8 13:58:27 2003
From: jeffh at gryphongardens.com (Jeff Hahn)
Date: Wed Oct 8 13:58:17 2003
Subject: [Mailman-Developers] FW: [Mailman-Users] DirecPC, MailShield and VERP
Message-ID: <005001c38dc5$c66cf910$14cca8c0@internal.gryphongardens.com>

I'm still trying to figure a way around this problem... Any ideas on a
way to say "don't VERP this address"?

Thanks!

-Jeff

-----Original Message-----

In their infinite wisdom, DirecWay/DirecPC appear to be using a program
that rejects VERP'd from addresses.

The non-VERP'd messages are delivered by exim...

2003-10-03 08:38:38 1A5Q8F-0005zh-00 => xxxxxxxx@direcway.com R=lookuphost T=remote_smtp H=mx1.direcpc.com [66.82.4.72] C="250 2.5.0 Ok. (relayed by MailShield)"

The VERP'd messages are rejected...

2003-10-03 08:35:43 1A5Q5u-0005vu-02 ** xxxxxxxx@direcway.com R=lookuphost T=remote_smtp: SMTP error from remote mailer after MAIL FROM: SIZE=4682: host mx1.direcpc.com [66.82.4.72]: 550 SMTP session aborted; 2

Is there any way to disable VERP (which has proved very useful in
handling bounces) on a per recipient or per list basis?
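
For readers unfamiliar with VERP, here is a small sketch of the idea behind a VERP'd envelope sender: the recipient's address is folded into the local part of the list's bounce address, so a bounce identifies the subscriber by itself. The function names, the example addresses, and the exact "bounces + mailbox = host" layout are assumptions for illustration, not taken from Mailman's source; the delimiter is a parameter because sites may use a different plus-addressing character.

```python
def verp_encode(bounce_address, recipient, delimiter="+"):
    """Return an envelope sender that embeds the recipient's address.

    Hypothetical helper: 'list-bounces@host' plus 'user@example.com'
    becomes 'list-bounces+user=example.com@host'.
    """
    local, domain = bounce_address.rsplit("@", 1)
    mailbox, host = recipient.rsplit("@", 1)
    return "%s%s%s=%s@%s" % (local, delimiter, mailbox, host, domain)


def verp_decode(envelope_sender, bounce_address, delimiter="+"):
    """Recover the recipient from a VERP'd sender, or None if it is not one."""
    local = envelope_sender.rsplit("@", 1)[0]
    prefix = bounce_address.rsplit("@", 1)[0] + delimiter
    if not local.startswith(prefix):
        return None
    encoded = local[len(prefix):]
    if "=" not in encoded:
        return None
    # The last '=' separates the mailbox from the host name.
    mailbox, host = encoded.rsplit("=", 1)
    return "%s@%s" % (mailbox, host)


sender = verp_encode("mylist-bounces@lists.example.org", "user@example.com")
print(sender)
# mylist-bounces+user=example.com@lists.example.org
print(verp_decode(sender, "mylist-bounces@lists.example.org"))
# user@example.com
```

Only the originating system needs to decode such an address; to every other mail system it is an ordinary, if long, mailbox name.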
------------------------------------------------------ From jarrell at vt.edu Wed Oct 8 17:04:42 2003 From: jarrell at vt.edu (Ron Jarrell) Date: Wed Oct 8 17:05:39 2003 Subject: [Mailman-Developers] msgmerge Message-ID: <5.2.1.1.2.20031008170058.028b3e90@lennier.cc.vt.edu> The comment in the makefile that says something like "You shouldn't have to do this, but if you need to rebuild the catalogs you'll need gnu make and msgmerge from the gnu gettext" probably ought to include a version number. I *did* have gettext; it was part of the big solaris 9 gnu utils disk (which saves an awful lot of time of installing stuff :-)). It was, however, 0.10.35, and that version doesn't have the -U option. I downloaded 0.12.1 and built and installed it, and *that* version does... Now I can go back to running make on my cvs update :-) From r.barrett at openinfo.co.uk Wed Oct 8 18:38:29 2003 From: r.barrett at openinfo.co.uk (Richard Barrett) Date: Wed Oct 8 18:38:38 2003 Subject: [Mailman-Developers] FW: [Mailman-Users] DirecPC, MailShield and VERP In-Reply-To: <005001c38dc5$c66cf910$14cca8c0@internal.gryphongardens.com> Message-ID: <23D5D7A6-F9E0-11D7-99F5-000A957C9A50@openinfo.co.uk> On Wednesday, October 8, 2003, at 06:58 pm, Jeff Hahn wrote: > I'm still trying to figure a way around this problem... Any ideas on > a way > to say "don't VERP this address"? > Maybe if you restrict VERP'ing to personalised deliveries (in mm_cfg.py) and do not personalise mail for a list with problem subscriber addresses then the list's outgoing messages will not be VERP'ed and hence rejected. Pity to lose the benefits of personalization and VERP but it may the only solution if you cannot persuade the destination mail domain's postmaster to fix his MTA/Spam filter. > Thanks! > > -Jeff > > -----Original Message----- > > In their infinite wisdom, DirecWay/DirecPC appear to be using a > program that > rejects VERP'd from addresses. > A VERP'ed address is just a valid mail alias @ a mail domain. 
It is either a bug or misconfiguration that causes refusal to accept a perfectly valid mail address as the envelope return path or in the From: header. From the non-VERP'ed message it looks as though the problem may be with something called MailShield ??maybe this being the Lyris spam filter product - all X,000 dollar license of it??. The whole point of a VERP'ed return address is that, because it is just a valid email address, it only has to be 'understood' and 'decoded' by the originating system's MTA if it is returned. For all the MTAs between originator and recipient's MUA, the presence/absence of VERP'ing is irrelevant and not their concern. What appears to be causing your problem is a badly implemented/configured spam filter which is applying an invalid policy to reject you messages; I suspect it may be some rule about the length of the mail alias in the return path or, heaven forfend, the From: header. The VERP'ed alias may be longer than some arbitrary limit being applied. > The non-VERP'd messages are delivered by exim... > > 2003-10-03 08:38:38 1A5Q8F-0005zh-00 => xxxxxxxx@direcway.com > R=lookuphost > T=remote_smtp H=mx1.direcpc.com [66.82.4.72] C="250 2.5.0 Ok. (relayed > by > MailShield)" > > The VERP'd messages are rejected... > > 2003-10-03 08:35:43 1A5Q5u-0005vu-02 ** xxxxxxxx@direcway.com > R=lookuphost > T=remote_smtp: SMTP error from remote mailer after MAIL > FROM: > SIZE=4682: host mx1.direcpc.com [66.82.4.72]: 550 SMTP session aborted; > 2 > > Is there anyway to disable VERP (which has proved very useful in > handling > bounces) on a per recipient or per list basis???? 
> > > ------------------------------------------------------ > ----------------------------------------------------------------------- Richard Barrett http://www.openinfo.co.uk From claw at kanga.nu Wed Oct 8 22:23:37 2003 From: claw at kanga.nu (J C Lawrence) Date: Wed Oct 8 22:23:43 2003 Subject: [Mailman-Developers] FW: [Mailman-Users] DirecPC, MailShield and VERP In-Reply-To: Message from "Jeff Hahn" of "Wed, 08 Oct 2003 12:58:27 CDT." <005001c38dc5$c66cf910$14cca8c0@internal.gryphongardens.com> References: <005001c38dc5$c66cf910$14cca8c0@internal.gryphongardens.com> Message-ID: <12471.1065666217@kanga.nu> On Wed, 8 Oct 2003 12:58:27 -0500 Jeff Hahn wrote: > I'm still trying to figure a way around this problem... Any ideas on > a way to say "don't VERP this address"? Try changing the plus addressing character to a '-' instead of '+'. -- J C Lawrence ---------(*) Satan, oscillate my metallic sonatas. claw@kanga.nu He lived as a devil, eh? http://www.kanga.nu/~claw/ Evil is a name of a foeman, as I live. From barry at python.org Thu Oct 9 11:24:29 2003 From: barry at python.org (Barry Warsaw) Date: Thu Oct 9 11:24:35 2003 Subject: [Mailman-Developers] Re: Problem with MM after power outage In-Reply-To: <1063428914.19907.126.camel@anthem> References: <20030815145839.GM460@mrbill.net> <200308192117.38899.pioppo@ferrara.linux.it> <1061330802.1131.18.camel@geddy> <200308200045.23234.pioppo@ferrara.linux.it> <1063373997.19907.44.camel@anthem> <1063384531.19907.67.camel@anthem> <1063428914.19907.126.camel@anthem> Message-ID: <3F857DAD.8@python.org> Barry Warsaw wrote: > >> So that no hacking is required to make it something the user can >>see and modify. We'd still be doing the dangerous thing by leaving >>it set to default off (in the case of Linux), but at least we >>wouldn't be requiring that they hack the code in order to be able to >>tweak this option. > > > But, really, they have to hack the code either way. 
Either you're > editing the mm_cfg.py file, or you're editing the Switchboard.py file. > The former is a little more visible, since that's the file people are > trained to touch. > > But here's the thing. For a bug fix release, it seems wrong to expose > this in mm_cfg.py because that implies some higher state of blessing. > I'm not convinced that we've hit upon the ultimate right solution so I > don't want to commit to it. After folks have had a chance to test it > and see if 1) it fixes the problem, and 2) what the real world > trade-offs are, then we can decide whether it deserves higher profile, > or maybe just us choosing to hard code it to always fsync(). I've thought about this some more, and I'm going to reverse the decision not to expose SYNC_AFTER_WRITE in mm_cfg.py. Apologies for being so hard-headed about it. We have the same potential problem with the config.pck file, so I want to move the same logic into MailList.py, i.e. always flush before closing, and optionally fsync the file. That means moving the option out into Defaults.py.in. I'll do that for 2.1.4. BTW, has anybody actually turned on SYNC_AFTER_WRITE and have you 1) noticed any improvement in the robustness of the message files, and/or 2) noticed any performance degradation? -Barry
From shaikli at yahoo.com Thu Oct 9 14:14:43 2003 From: shaikli at yahoo.com (Nadim Shaikli) Date: Thu Oct 9 14:14:48 2003 Subject: [Mailman-Developers] Log file rollovers Message-ID: <20031009181443.38413.qmail@web14915.mail.yahoo.com> I've noticed some weird happenings with Mailman (v-2.1.2). As expected on the first of each month all the log files get rolled over (so vette becomes vette.1 and bounce becomes bounce.1, etc).
Every so often I see files that are created but are not used. For instance, I'm looking at my log directory and can see,

-rw-rw-r-- 1 list list       0 Oct  1 06:27 post
-rw-rw-r-- 1 list list 4819198 Oct  9 08:57 post.1
-rw-rw-r-- 1 list list  632841 Sep 10 07:59 post.2.gz

all the current posts are still making it to 'post.1' for some reason and not simply 'post'. Mind you in September they were making it to the normal 'post' file. Is this a known problem, and is there a work-around or a means to fix this sans restarts? I did notice this happening on other files as well (subscribe, error, etc) - the problem resolves itself if you restart mailman. Regards, - Nadim
From aaraines at pobox.com Thu Oct 9 14:24:17 2003 From: aaraines at pobox.com (Andrew A. Raines) Date: Thu Oct 9 14:30:16 2003 Subject: [Mailman-Developers] Multiple dicts containing subscribers within MailList pickle? Message-ID: Greetings, I'm new to Mailman development, so bear with me. I didn't see anything in the archives addressing this, but my searches weren't too complex. There's probably good reason for this design, but why have multiple dicts within the MailList infrastructure duplicating the subscriber list? In [config.pck]:

    'passwords': { 'sub1@foo.dom': 'pass',
                   'sub2@foo.dom': 'pass' }
    'members': { 'sub1@foo.dom': 0,
                 'sub2@foo.dom': 0 }
    'usernames': { 'sub1@foo.dom': u'Subscriber One!',
                   'sub2@foo.dom': u'Subscriber Two!' }
    'user_options': { 'sub1@foo.dom': 264,
                      'sub2@foo.dom': 392 }

Seems like it would be more straightforward (in design, at least; maybe not in practice) to store it with nested dicts like:

    'subscribers': { 'sub1@foo.dom': { 'password': 'pass',
                                       'digest': 0,
                                       'username': u'Subscriber One!',
                                       'options': 264 },
                     'sub2@foo.dom': { 'password': 'pass',
                                       'digest': 1,
                                       'username': u'Subscriber Two!',
                                       'options': 392 } }

There tends to be more consistency with this approach than, for instance, having a members dict *and* a digest_members dict which could each be empty. Why not just toggle a subvalue within a master subscriber database dict? Thanks, -Drew
From pioppo at ferrara.linux.it Thu Oct 9 14:39:54 2003 From: pioppo at ferrara.linux.it (Simone Piunno) Date: Thu Oct 9 14:31:26 2003 Subject: [Mailman-Developers] Re: Problem with MM after power outage In-Reply-To: <3F857DAD.8@python.org> References: <20030815145839.GM460@mrbill.net> <1063428914.19907.126.camel@anthem> <3F857DAD.8@python.org> Message-ID: <200310092039.54249.pioppo@ferrara.linux.it> On Thursday 09 October 2003 17:24, Barry Warsaw wrote: > I've thought about this some more, and I'm going to reverse the decision > not to expose SYNC_AFTER_WRITE in mm_cfg.py. Apologies for being so > hard-headed about it. While you're at it, I would expose STEALTH_MODE too (in scripts/driver), and I'd switch to Yes as default value.
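For readers following the SYNC_AFTER_WRITE part of this thread, the pattern Barry describes (always flush before closing, and optionally fsync the file) can be sketched roughly as below. This is an illustration only, written in present-day Python and not taken from Switchboard.py or MailList.py; the function name write_durably and the way the flag is passed in are assumptions made for the example.

```python
import os

# Stand-in for the setting discussed in the thread. Here it is only a
# module-level default, not Mailman's real configuration machinery.
SYNC_AFTER_WRITE = False


def write_durably(path, data, sync=SYNC_AFTER_WRITE):
    """Write bytes to path, always flushing and optionally fsync()ing."""
    with open(path, "wb") as fp:
        fp.write(data)
        # Push Python's userspace buffer down to the operating system.
        fp.flush()
        if sync:
            # Ask the OS to commit its cached pages to the storage device.
            os.fsync(fp.fileno())
```

With sync left off, written data may still sit in the operating system's cache after the file is closed, which is the window a power outage can hit; turning it on trades some write throughput for durability, which is the trade-off being asked about.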
-- Adde parvum parvo magnus acervus erit -- Ovidio From jdennis at redhat.com Thu Oct 9 14:47:33 2003 From: jdennis at redhat.com (John Dennis) Date: Thu Oct 9 14:47:39 2003 Subject: [Mailman-Developers] Log file rollovers In-Reply-To: <20031009181443.38413.qmail@web14915.mail.yahoo.com> References: <20031009181443.38413.qmail@web14915.mail.yahoo.com> Message-ID: <1065725252.28958.325.camel@finch.boston.redhat.com> You didn't say on what type and version the system is, which is important because things like log file rollovers aren't really part of mailman proper but rather are done as part of the system specific installation. I do know that for a while we (Red Hat) had a bug with log file rotation in our RPM, that was fixed about 6 months ago. However the bug I'm aware of does not seem like what you are describing. The bug we fixed would keep generating new files with an extra digit appended to it till it eventually filled the file system, e.g. the file log would become log.1, then log.1.1, then log.1.1.1 rather than log.2. Exact details may vary as I'm going from memory. Best thing for you to do is let us know the exact type of system, the contents of /etc/logrotate.conf and /etc/logrotate.d/mailman (assuming you have a system with this version of logrotate. HTH, John On Thu, 2003-10-09 at 14:14, Nadim Shaikli wrote: > I've noticed some weird happenings with Mailman (v-2.1.2). As expected > on the first of each month all the log files get rolled over (so vette > becomes vette.1 and bounce becomes bounce.1, etc). Every so often I see > files that are created but are not used. For instance, I'm looking at > my log directory and can see, > > -rw-rw-r-- 1 list list 0 Oct 1 06:27 post > -rw-rw-r-- 1 list list 4819198 Oct 9 08:57 post.1 > -rw-rw-r-- 1 list list 632841 Sep 10 07:59 post.2.gz > > all the current posts are still making it to 'post.1' for some reason > and not simply 'post'. Mind you in September they were making it to > the normal 'post' file. 
> > Is this a known problem and is there a work-around or a means to fix > this sans restarts. I did notice this happening on other files as > well (subscribe, error, etc) - the problem resolves itself if you > restart mailman. > > Regards, > > - Nadim > > > __________________________________ > Do you Yahoo!? > The New Yahoo! Shopping - with improved product search > http://shopping.yahoo.com > > _______________________________________________ > Mailman-Developers mailing list > Mailman-Developers@python.org > http://mail.python.org/mailman/listinfo/mailman-developers -- John Dennis From barry at python.org Thu Oct 9 14:49:38 2003 From: barry at python.org (Barry Warsaw) Date: Thu Oct 9 14:49:44 2003 Subject: [Mailman-Developers] Multiple dicts containing subscribers within MailList pickle? In-Reply-To: References: Message-ID: <1065725377.21979.21.camel@anthem> On Thu, 2003-10-09 at 14:24, Andrew A. Raines wrote: > There's probably good reason for this design, but why have > multiple dicts within the MailList infrastructure duplicating > the subscriber list? It's completely historical and indicates the accretive nature of the user database. E.g. we didn't grow a usernames dict until 2.1.somebeta. It's generally not too wasteful though, given that most of the dicts don't have keys if the member either has no value or a default value. Having said that, eventually the member stuff will be ripped out of the list's data structures, but that's a Mailman 3.0 thing. -Barry -------------- next part -------------- A non-text attachment was scrubbed... 
Name: not available Type: application/pgp-signature Size: 307 bytes Desc: This is a digitally signed message part Url : http://mail.python.org/pipermail/mailman-developers/attachments/20031009/eb5233c6/attachment.bin From barry at python.org Thu Oct 9 14:50:19 2003 From: barry at python.org (Barry Warsaw) Date: Thu Oct 9 14:50:25 2003 Subject: [Mailman-Developers] Re: Problem with MM after power outage In-Reply-To: <200310092039.54249.pioppo@ferrara.linux.it> References: <20030815145839.GM460@mrbill.net> <1063428914.19907.126.camel@anthem> <3F857DAD.8@python.org> <200310092039.54249.pioppo@ferrara.linux.it> Message-ID: <1065725418.21979.23.camel@anthem> On Thu, 2003-10-09 at 14:39, Simone Piunno wrote: > On Thursday 09 October 2003 17:24, Barry Warsaw wrote: > > > I've thought about this some more, and I'm going to reverse the decision > > not to expose SYNC_AFTER_WRITE in mm_cfg.py. Apologies for being so > > hard-headed about it. > > While you're at it, I would expose STEALTH_MODE too (in scripts/driver), and > I'd switch to Yes as default value. Ok. -Barry -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 307 bytes Desc: This is a digitally signed message part Url : http://mail.python.org/pipermail/mailman-developers/attachments/20031009/d5e23091/attachment.bin From trevor.wendt at aquila.com Thu Oct 2 16:25:30 2003 From: trevor.wendt at aquila.com (Wendt, Trevor) Date: Thu Oct 9 14:59:39 2003 Subject: [Mailman-Developers] HTML Emails Message-ID: I'm looking at using Mailman to distribute a multipart Text / HTML email message. The Text message is displayed for Text only email viewers and the HTML message is displayed for HTML capable email viewers. I'm using a separate CGI script to create the multipart email message which is then sent to my distribution list. This process works great. 
My question is this; When using the setup outlined above, are there any tokens (such as %(users)s) that I can use in the HTML email message content so when it is passed through Mailman, Mailman would then replace the token with a user's name, email address, or append the full url for the customer's options page? This functionality exists in Mailman because all the email templates work on this method. I guess what I need to know is if and how one can tap into this functionality with pass through emails. Any suggestions would be appreciated. Thanks, Trevor From chris at apex-internet.com Fri Oct 3 13:58:04 2003 From: chris at apex-internet.com (Chris Szilagyi) Date: Thu Oct 9 14:59:41 2003 Subject: [Mailman-Developers] maximum limit of addresses per list Message-ID: Hello, We use Mailman 2.1.2 and I am curious if there is a way to set a maximum number of addresses that are allowed in each individual list. I have looked through the documentation and have not seen anything regarding a limit like this. Or, if this is currently not supported in Mailman, has anybody found a workaround to handle something like this, or are there any provisions for this feature in any future releases of Mailman?? Thank you for all feedback. 
-- Chris From arnaud.desmons at epitech.net Fri Oct 3 16:59:10 2003 From: arnaud.desmons at epitech.net (Logarno) Date: Thu Oct 9 14:59:43 2003 Subject: [Mailman-Developers] Digested articles probleme Message-ID: <85e3cea3un5.fsf@epitech.net> Hi, When I post digested article from gnus on a mailman mailing list I get this error in : ---[/var/spool/mailman/logs/error]--- Oct 03 22:16:06 2003 (30348) Uncaught runner exception: 'list' object has no attribute 'splitlines' Oct 03 22:16:06 2003 (30348) Traceback (most recent call last): File "/usr/obj/i386/ports/mailman-2.1.1/fake-i386/usr/local/lib/mailman/Mailman/Queue/Runner.py", line 105, in _oneloop File "/usr/obj/i386/ports/mailman-2.1.1/fake-i386/usr/local/lib/mailman/Mailman/Queue/Runner.py", line 155, in _onefile File "/usr/obj/i386/ports/mailman-2.1.1/fake-i386/usr/local/lib/mailman/Mailman/Queue/IncomingRunner.py", line 130, in _dispose File "/usr/obj/i386/ports/mailman-2.1.1/fake-i386/usr/local/lib/mailman/Mailman/Queue/IncomingRunner.py", line 153, in _dopipeline File "/usr/obj/i386/ports/mailman-2.1.1/fake-i386/usr/local/lib/mailman/Mailman/Handlers/Approve.py", line 56, in process AttributeError: 'list' object has no attribute 'splitlines' Oct 03 22:16:06 2003 (30348) SHUNTING: 1065212165.167116+096f8fb0244c4a14736d6ca8d18caab933908b1d ---[/var/log/mailog]--- dytype=8BITMIME, proto=ESMTP, daemon=MTA, relay=hermes.toto.fr [42.42.42.42] Oct 3 22:48:21 parmesan sm-mta[31529]: h93KmHIW017713: to="|/usr/local/lib/mailman/mail/mailman post test", ctladdr= (1/0), delay=00:00:03, xdelay=00:00:03, mailer=prog, pri=36487, dsn=2.0.0, stat=Sent ... And nothing else after as nothing come back to suscribers... Normal article (non digested) works fine. I use python-2.2.1p1 and mailman-2.1.1 (using ports) on OpenBSD 3.3 with Sendmail. Thanks a lot ! 
-- Arnaud Desmons
From andy at nospam.com Fri Oct 3 23:59:49 2003 From: andy at nospam.com (Andy Sy) Date: Thu Oct 9 14:59:46 2003 Subject: [Mailman-Developers] rsync'able Mailman archive Message-ID: <3F7E45B5.7040708@nospam.com> For a huge list like http://www.libsdl.org/pipermail/sdl/ (104MB), being able to download the raw archive is a blessing. For those newsgroups which have (hallelujah) Mailman versions, this means no more fiddling with infernal newsreader features. Just:

1. Fetch huge archive efficiently with your favorite download client.
2. Drop the mbox format file into your mail client directory (like Mozilla Mail)
3. voila! A configurable, collapsible-threadable, searchable, completely complete (yeah!) local version of the whole list.

PROBLEM
=======

is... how do you update your local mbox copy efficiently? I can, for example, get a snapshot of the SDL archives as they stand today, subscribe to the mailing list, and basically ensure my local 'mirror' is both complete and up-to-date. BUT... what if I want to stop subscribing for a while, and then 3 months down the road I want a complete version of the mailing list again? I would then have to download the (now larger) full archive all over again. The solution is to make the archive downloads rsync'able. Which brings me to the topic of this post:

QUESTION 1
==========

Is making the archive rsync'able a responsibility of the list administrator or... wouldn't it be better to build such support in Mailman (to allow people who are OC about getting the complete list to save time and avoid wasting gobs of bandwidth)?

QUESTION 2
==========

Once I start subscribing to the list, my local mbox-format copy of the list is now being updated locally and thus will not be byte-for-byte identical to the Mailman archive. If I then rsync said mailbox with Mailman's, will the differences then be 'repaired' efficiently (like rsync is supposed to do)?
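One rough way to picture an incremental update of a local mbox copy when rsync is not available is to ask the web server only for the bytes beyond the local file's current size. The sketch below is not something Mailman provides. It assumes the archive mbox is append-only, that the server honours HTTP Range requests, and that the local file is an untouched copy of the archive (not one a mail client has rewritten, which is exactly the Question 2 concern). The names range_header and fetch_tail are made up for illustration.

```python
import os
import urllib.error
import urllib.request


def range_header(local_size):
    """Return a Range header value asking for bytes from local_size onward."""
    return "bytes=%d-" % local_size


def fetch_tail(url, local_path):
    """Append whatever the server has beyond the local copy's size.

    Returns the number of bytes written. Hypothetical helper: it trusts
    that the first local_size bytes on the server match the local file.
    """
    local_size = os.path.getsize(local_path) if os.path.exists(local_path) else 0
    request = urllib.request.Request(
        url, headers={"Range": range_header(local_size)})
    try:
        with urllib.request.urlopen(request) as response:
            # 206 means Partial Content. Any other success status means the
            # server ignored the Range header and sent the whole file.
            mode = "ab" if response.status == 206 else "wb"
            data = response.read()
    except urllib.error.HTTPError as err:
        if err.code == 416:
            # Range Not Satisfiable: nothing beyond what we already have.
            return 0
        raise
    with open(local_path, mode) as fp:
        fp.write(data)
    return len(data)
```

If the archive has been compacted or edited on the server, the size-based assumption breaks and a full download (or a real rsync) is the safer route.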
A HACKISH WORKAROUND IN THE ABSENCE OF RSYNC SUPPORT ==================================================== Deliberately truncate your local mbox copy to a size smaller than it was when you FIRST updated it locally (you have to remember what its size was), and then resume copy transferring from that point. -- ========================================= reply-to: a n d y @ n e t f x p h . c o m http://www.neotitans.com Web and Software Development From andy at nospam.com Sun Oct 5 05:10:17 2003 From: andy at nospam.com (Andy Sy) Date: Thu Oct 9 14:59:47 2003 Subject: [Mailman-Developers] Retrieving individual messages from raw Mailman mboxes via http Message-ID: <3F7FDFF9.2090206@nospam.com> I am thinking of adapting a collapsible, outlinable, no-page refresh, on-demand message-body load browser-based message thread interface I made (see http://www.neotitans.com/page.gif) to work with GNU Mailman (among other things) lists. Ideally, I would like it to function as a 'www-interface gateway' that works with all existing Mailman raw archives. From what I've been researching, one would need some kind of index into the raw mbox file, either a mail summary file format or a database which would contain file seek pointers into the raw mbox. It would then use a ranged HTTP request to retrieve only the particular message body it needs to display (would this work?). Several issues arise which I'd be glad to have input on from the experts on this list: I. Which mbox index / mail summary file format to use? The Mozilla .msf format looks like a strong candidate. Does anyone have other suggestions? Does Mailman maintain such a mail summary file and is it publicly accessible by default? II. index / mail summary file performance and maintenance Mozilla .msf files can be regenerated on the fly but for a 100MB mailbox (Python-list's is 600MB+!), it already takes fairly long (a few minutes). Assuming index file corruption is very rare, then this should not be a real problem. III. 
index / mail summary file hosting issues If an index/mail summary file is not available by default, and such a www-interface gateway were to work with no additional work on the list manager's part, then the index/mail summary file would have to be generated by the machine hosting the gateway instead. - What then, would be the mechanics of the (constant) remote reindexing that would need to be done as new messages come in? Would it be possible to just constantly poll the size of the raw mbox and if it has changed, to just reindex using data starting from the last retrieved file position? - How often do list admins compact/expunge their raw archive mboxes? Everytime they do, afaik, it would require the index / mail summary file to be regenerated. - Is it possible, then, for the www-interface gateway to automatically sense if the remotely hosted raw archive mbox has been expunged/compacted? - Also, how would the www-interface gateway machine know when its index / mail summary file has been corrupted? A second possible approach would be for the www-interface gateway machine to maintain its own copy of the raw archive and constantly rsync it with the one maintained by the list admin. This will probably only be feasible if Mailman list admins provide rsync access to the raw mailbox archive. - Would adding rsync serving of the raw mailbox to Mailman be a good idea? (If it was in Mailman, it is more likely to be enabled by default). -- ========================================= reply-to: a n d y @ n e t f x p h . c o m http://www.neotitans.com Web and Software Development From camtech at white-wolf.com Mon Oct 6 13:55:08 2003 From: camtech at white-wolf.com (Jerry Spaulding) Date: Thu Oct 9 14:59:49 2003 Subject: [Mailman-Developers] lost data files for filebase In-Reply-To: References: Message-ID: <6.0.0.22.2.20031006135210.034a5b68@mail.white-wolf.com> I saw a person post this error before, as message msg06414.html, however it was never replied to. 
I am running mailman 2.1.3, on darwin 6.6, with sendmail and python 2.3.2. EVERY message that hits a mailing list generates an error like the following: Oct 06 13:54:06 2003 (3023) lost data files for filebase: 1065441827.233952+cc8c4e6961e97c01925928c146f86bf60d5631e8 Oct 06 13:54:06 2003 (3023) lost data files for filebase: 1065441828.512211+e9303f43ad805907b98d1d6eeabc8d4115814e73 Oct 06 13:54:06 2003 (3023) lost data files for filebase: 1065441836.168941+8b36b6837fd56ddca5445a6364c8f90023a34595 Oct 06 13:54:07 2003 (3023) lost data files for filebase: 1065096079.044391+e238fc06af286fd08d452d4807dbee54368bd3b1 Should I go back to python 2.1.x? Mail is still getting through, but the log files are growing FAST. Thanks for your time. From mcicogni at siosistemi.it Tue Oct 7 13:27:20 2003 From: mcicogni at siosistemi.it (Mauro Cicognini) Date: Thu Oct 9 14:59:51 2003 Subject: [Mailman-Developers] Using pipermail on its own Message-ID: <3F82F778.60006@siosistemi.it> Hi everyone, I know someone must have asked this already but I can't find anything on the subject. Nowadays (as AMK clearly says on his page) pipermail only lives within Mailman. Is it possible, however, to use it by itself? I.e. I'd have to provide a nicer interface to a set of messages which are actually automatically retrieved by a script, and I'd like to avoid reinventing the wheel if at all possible. Since the script is in Python (as most of the development we do here is) I'd rather integrate a Python module than resort to an external archiver like MHonArc, however nice that is. Thanks in advance for any help. Mauro Cicognini From trevor.wendt at aquila.com Thu Oct 9 14:53:04 2003 From: trevor.wendt at aquila.com (Wendt, Trevor) Date: Thu Oct 9 14:59:54 2003 Subject: [Mailman-Developers] Mass Subscriptions Message-ID: Is there a way to Mass Subscribe users with email address AND names? 
-Trevor From barry at python.org Thu Oct 9 15:03:50 2003 From: barry at python.org (Barry Warsaw) Date: Thu Oct 9 15:03:56 2003 Subject: [Mailman-Developers] HTML Emails In-Reply-To: References: Message-ID: <1065726230.21979.26.camel@anthem> On Thu, 2003-10-02 at 16:25, Wendt, Trevor wrote: > I'm looking at using Mailman to distribute a multipart Text / HTML email message. The Text message is displayed for Text only email viewers and the HTML message is displayed for HTML capable email viewers. I'm using a separate CGI script to create the multipart email message which is then sent to my distribution list. This process works great. > > My question is this; When using the setup outlined above, are there any tokens (such as %(users)s) that I can use in the HTML email message content so when it is passed through Mailman, Mailman would then replace the token with a user's name, email address, or append the full url for the customer's options page? > > This functionality exists in Mailman because all the email templates work on this method. I guess what I need to know is if and how one can tap into this functionality with pass through emails. Any suggestions would be appreciated. No, not externally. IOW, Mailman doesn't mailmerge. But if you wanted to hack some Python, you could probably make it work. I plan on working on a more efficient personalization implementation for the next release. I'll keep an eye on making it extensible (but only through Python). -Barry -------------- next part -------------- A non-text attachment was scrubbed... 
Name: not available Type: application/pgp-signature Size: 307 bytes Desc: This is a digitally signed message part Url : http://mail.python.org/pipermail/mailman-developers/attachments/20031009/5eaa9bad/attachment.bin From barry at python.org Thu Oct 9 15:04:43 2003 From: barry at python.org (Barry Warsaw) Date: Thu Oct 9 15:04:47 2003 Subject: [Mailman-Developers] maximum limit of addresses per list In-Reply-To: References: Message-ID: <1065726282.21979.28.camel@anthem> On Fri, 2003-10-03 at 13:58, Chris Szilagyi wrote: > Hello, > > We use Mailman 2.1.2 and I am curious if there is a way to set a maximum > number of addresses that are allowed in each individual list. I have looked > through the documentation and have not seen anything regarding a limit like > this. Or, if this is currently not supported in Mailman, has anybody found a > workaround to handle something like this, or are there any provisions for > this feature in any future releases of Mailman?? No, but it is occasionally requested. Add it to the MM2.2 wiki (see www.list.org/dev.html for links). -Barry -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 307 bytes Desc: This is a digitally signed message part Url : http://mail.python.org/pipermail/mailman-developers/attachments/20031009/ed617010/attachment.bin From barry at python.org Thu Oct 9 15:08:36 2003 From: barry at python.org (Barry Warsaw) Date: Thu Oct 9 15:08:42 2003 Subject: [Mailman-Developers] lost data files for filebase In-Reply-To: <6.0.0.22.2.20031006135210.034a5b68@mail.white-wolf.com> References: <6.0.0.22.2.20031006135210.034a5b68@mail.white-wolf.com> Message-ID: <1065726516.21979.32.camel@anthem> On Mon, 2003-10-06 at 13:55, Jerry Spaulding wrote: > I saw a person post this error before, as message msg06414.html, however it > was never replied to. I am running mailman 2.1.3, on darwin 6.6, with > sendmail and python 2.3.2. 
EVERY message that hits a mailing list generates > an error like the following: > > Oct 06 13:54:06 2003 (3023) lost data files for filebase: > 1065441827.233952+cc8c4e6961e97c01925928c146f86bf60d5631e8 > Oct 06 13:54:06 2003 (3023) lost data files for filebase: > 1065441828.512211+e9303f43ad805907b98d1d6eeabc8d4115814e73 > Oct 06 13:54:06 2003 (3023) lost data files for filebase: > 1065441836.168941+8b36b6837fd56ddca5445a6364c8f90023a34595 > Oct 06 13:54:07 2003 (3023) lost data files for filebase: > 1065096079.044391+e238fc06af286fd08d452d4807dbee54368bd3b1 > > Should I go back to python 2.1.x? Mail is still getting through, but the > log files are growing FAST. Thanks for your time. No, I wouldn't downgrade. If you can, send me the list's config.pck file and I'll look to see if there's anything crazy. Also, you might want to set SYNC_AFTER_WRITE in Switchboard.py just to see if that fixes things (although I doubt it if this is only affecting one list). -Barry -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 307 bytes Desc: This is a digitally signed message part Url : http://mail.python.org/pipermail/mailman-developers/attachments/20031009/3c0cde5a/attachment.bin From barry at python.org Thu Oct 9 15:10:41 2003 From: barry at python.org (Barry Warsaw) Date: Thu Oct 9 15:10:47 2003 Subject: [Mailman-Developers] Mass Subscriptions In-Reply-To: References: Message-ID: <1065726641.21979.35.camel@anthem> On Thu, 2003-10-09 at 14:53, Wendt, Trevor wrote: > Is there a way to Mass Subscribe users with email address AND names? Yes, just add lines like: aperson@example.com (Anne Person) -Barry -------------- next part -------------- A non-text attachment was scrubbed... 
Name: not available Type: application/pgp-signature Size: 307 bytes Desc: This is a digitally signed message part Url : http://mail.python.org/pipermail/mailman-developers/attachments/20031009/5e8bef36/attachment.bin
From trevor.wendt at aquila.com Thu Oct 9 15:23:35 2003 From: trevor.wendt at aquila.com (Wendt, Trevor) Date: Thu Oct 9 15:53:08 2003 Subject: [Mailman-Developers] Mass Subscriptions Message-ID: Great. Thank you. -Trevor -----Original Message----- From: Barry Warsaw [mailto:barry@python.org] Sent: Thursday, October 09, 2003 2:11 PM To: Wendt, Trevor Cc: mailman-developers@python.org Subject: Re: [Mailman-Developers] Mass Subscriptions On Thu, 2003-10-09 at 14:53, Wendt, Trevor wrote: > Is there a way to Mass Subscribe users with email address AND names? Yes, just add lines like: aperson@example.com (Anne Person) -Barry
From r.barrett at ftel.co.uk Thu Oct 9 15:58:57 2003 From: r.barrett at ftel.co.uk (Richard Barrett) Date: Thu Oct 9 15:59:05 2003 Subject: [Mailman-Developers] Log file rollovers In-Reply-To: <1065725252.28958.325.camel@finch.boston.redhat.com> Message-ID: <04B07CA3-FA93-11D7-99F5-000A957C9A50@ftel.co.uk> logrotate has a problem with MM 2.1.x because the qrunners operate as daemons and as a consequence do not close the files they are logging to. This leads to the problem you are observing. This is not a problem with MM alone. Anything that runs as a daemon typically has to be HUP'ed after the rotation to prevent the problem. Try extending the logrotate configuration so that it stops mailmanctl before the rotation and starts it again once the rotation has been done, or restarts mailmanctl immediately after the rotation. On Thursday, October 9, 2003, at 07:47 pm, John Dennis wrote: > You didn't say on what type and version the system is, which is > important because things like log file rollovers aren't really part of > mailman proper but rather are done as part of the system specific > installation.
I do know that for a while we (Red Hat) had a bug with > log > file rotation in our RPM, that was fixed about 6 months ago. However > the > bug I'm aware of does not seem like what you are describing. The bug we > fixed would keep generating new files with an extra digit appended to > it > till it eventually filled the file system, e.g. the file log would > become log.1, then log.1.1, then log.1.1.1 rather than log.2. Exact > details may vary as I'm going from memory. > > Best thing for you to do is let us know the exact type of system, the > contents of /etc/logrotate.conf and /etc/logrotate.d/mailman (assuming > you have a system with this version of logrotate. > > HTH, > > John > > > On Thu, 2003-10-09 at 14:14, Nadim Shaikli wrote: >> I've noticed some weird happenings with Mailman (v-2.1.2). As >> expected >> on the first of each month all the log files get rolled over (so vette >> becomes vette.1 and bounce becomes bounce.1, etc). Every so often I >> see >> files that are created but are not used. For instance, I'm looking at >> my log directory and can see, >> >> -rw-rw-r-- 1 list list 0 Oct 1 06:27 post >> -rw-rw-r-- 1 list list 4819198 Oct 9 08:57 post.1 >> -rw-rw-r-- 1 list list 632841 Sep 10 07:59 post.2.gz >> >> all the current posts are still making it to 'post.1' for some reason >> and not simply 'post'. Mind you in September they were making it to >> the normal 'post' file. >> >> Is this a known problem and is there a work-around or a means to fix >> this sans restarts. I did notice this happening on other files as >> well (subscribe, error, etc) - the problem resolves itself if you >> restart mailman. >> >> Regards, >> >> - Nadim >> >> >> __________________________________ >> Do you Yahoo!? >> The New Yahoo! 
Shopping - with improved product search >> http://shopping.yahoo.com >> >> _______________________________________________ >> Mailman-Developers mailing list >> Mailman-Developers@python.org >> http://mail.python.org/mailman/listinfo/mailman-developers > -- > John Dennis > > > _______________________________________________ > Mailman-Developers mailing list > Mailman-Developers@python.org > http://mail.python.org/mailman/listinfo/mailman-developers > From brad.knowles at skynet.be Thu Oct 9 15:45:23 2003 From: brad.knowles at skynet.be (Brad Knowles) Date: Thu Oct 9 16:15:26 2003 Subject: [Mailman-Developers] rsync'able Mailman archive In-Reply-To: <3F7E45B5.7040708@nospam.com> References: <3F7E45B5.7040708@nospam.com> Message-ID: At 11:59 AM +0800 2003/10/04, Andy Sy wrote: > Is making the archive rsync'able a responsibility of the > list administrator Yes. It's a file. That level of control should be left to the OS and how you configure standard tools like rsync or ssync. > or... wouldn't it be better to build > such support in Mailman (to allow people who are OC > about getting the complete list to save time and avoid > wasting gobs of bandwidth)? Do you really want to include all the rsync code into mailman? I don't think so. -- Brad Knowles, "They that can give up essential liberty to obtain a little temporary safety deserve neither liberty nor safety." -Benjamin Franklin, Historical Review of Pennsylvania. 
GCS/IT d+(-) s:+(++)>: a C++(+++)$ UMBSHI++++$ P+>++ L+ !E-(---) W+++(--) N+ !w--- O- M++ V PS++(+++) PE- Y+(++) PGP>+++ t+(+++) 5++(+++) X++(+++) R+(+++) tv+(+++) b+(++++) DI+(++++) D+(++) G+(++++) e++>++++ h--- r---(+++)* z(+++) From brad.knowles at skynet.be Thu Oct 9 15:57:50 2003 From: brad.knowles at skynet.be (Brad Knowles) Date: Thu Oct 9 16:20:06 2003 Subject: [Mailman-Developers] Retrieving individual messages from raw Mailman mboxes via http In-Reply-To: <3F7FDFF9.2090206@nospam.com> References: <3F7FDFF9.2090206@nospam.com> Message-ID: At 5:10 PM +0800 2003/10/05, Andy Sy wrote: > I. Which mbox index / mail summary file format to use? Import into Berkeley DB hash tables. Fast, easy, well-supported by many languages, robust, data can easily be extracted if necessary, and they can easily be reconstructed if necessary. Failing that, use a mailbox-directory format. > II. index / mail summary file performance and maintenance > > Mozilla .msf files can be regenerated on the fly but > for a 100MB mailbox (Python-list's is 600MB+!), it already takes > fairly long (a few minutes). Assuming index file corruption is > very rare, then this should not be a real problem. I would be willing to bet that Berkeley DB files could be regenerated even faster -- much faster. -- Brad Knowles, "They that can give up essential liberty to obtain a little temporary safety deserve neither liberty nor safety." -Benjamin Franklin, Historical Review of Pennsylvania. 
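The hash-table index suggested above can be sketched with Python's standard dbm interface. This is only an illustration under stated assumptions: dbm.dumb stands in for Berkeley DB (it offers the same key/value mapping idea, not the same performance), the mbox is assumed to use "From " separator lines with body lines already escaped, and build_index, fetch and the sample addresses are made-up names, not anything from Mailman or pipermail.

```python
import dbm.dumb
import os
import tempfile


def build_index(mbox_path, index_path):
    """Map each Message-ID to the byte offset of its "From " separator line."""
    index = dbm.dumb.open(index_path, "n")   # "n": always start a fresh index
    count = 0
    offset = 0
    start = None          # offset of the message currently being scanned
    in_headers = False
    with open(mbox_path, "rb") as mbox:
        for line in mbox:
            if line.startswith(b"From "):
                start, in_headers = offset, True
            elif in_headers:
                if line in (b"\n", b"\r\n"):
                    in_headers = False        # blank line ends the headers
                elif line.lower().startswith(b"message-id:"):
                    msgid = line.split(b":", 1)[1].strip()
                    index[msgid] = str(start).encode("ascii")
                    count += 1
                    in_headers = False
            offset += len(line)
    index.close()
    return count


def fetch(mbox_path, index_path, msgid):
    """Return the raw bytes of one message, located through the index."""
    index = dbm.dumb.open(index_path, "r")
    try:
        start = int(index[msgid])
    finally:
        index.close()
    chunks = []
    with open(mbox_path, "rb") as mbox:
        mbox.seek(start)
        chunks.append(mbox.readline())        # the "From " separator itself
        for line in mbox:
            if line.startswith(b"From "):     # next message begins here
                break
            chunks.append(line)
    return b"".join(chunks)


# Hypothetical two-message mbox, just to exercise the functions.
workdir = tempfile.mkdtemp()
mbox_path = os.path.join(workdir, "list.mbox")
with open(mbox_path, "wb") as f:
    f.write(b"From a@example.com Thu Oct  9 15:45:23 2003\n"
            b"Message-ID: <one@example.com>\n"
            b"Subject: first\n\nfirst body\n\n"
            b"From b@example.com Thu Oct  9 15:57:50 2003\n"
            b"Message-ID: <two@example.com>\n"
            b"Subject: second\n\nsecond body\n")
index_path = os.path.join(workdir, "list.idx")
indexed = build_index(mbox_path, index_path)
second = fetch(mbox_path, index_path, b"<two@example.com>")
```

Because the index holds nothing but offsets derived from the mbox, it can be thrown away and rebuilt with a single pass over the file, which is the property the reply above relies on.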
GCS/IT d+(-) s:+(++)>: a C++(+++)$ UMBSHI++++$ P+>++ L+ !E-(---) W+++(--) N+ !w--- O- M++ V PS++(+++) PE- Y+(++) PGP>+++ t+(+++) 5++(+++) X++(+++) R+(+++) tv+(+++) b+(++++) DI+(++++) D+(++) G+(++++) e++>++++ h--- r---(+++)* z(+++) From r.barrett at openinfo.co.uk Thu Oct 9 16:23:30 2003 From: r.barrett at openinfo.co.uk (Richard Barrett) Date: Thu Oct 9 16:29:48 2003 Subject: [Mailman-Developers] Mailman/pipermail/MHonArc integration patch Message-ID: <7273BB74-FA96-11D7-99F5-000A957C9A50@openinfo.co.uk> I have posted a new enhancement patch for MM 2.1.3 to SourceForge. The Mailman/pipermail/MHonArc integration patch tightly integrates the MHonArc mail-to-HTML converter with Mailman and its internal pipermail archiving code. The purpose of the patch is to produce a fusion of (hopefully) the best features of pipermail and MHonArc for handling Mailman mailing list archives. For more detail see patch content: http://sourceforge.net/tracker/?func=detail&aid=820723&group_id=103&atid=300103 or: http://www.openinfo.co.uk/mailman/patches/mhonarc/index.html Any problems or comments, let me know. ----------------------------------------------------------------------- Richard Barrett http://www.openinfo.co.uk From shaikli at yahoo.com Thu Oct 9 20:34:51 2003 From: shaikli at yahoo.com (Nadim Shaikli) Date: Thu Oct 9 20:34:56 2003 Subject: [Mailman-Developers] Log file rollovers In-Reply-To: <1065725252.28958.325.camel@finch.boston.redhat.com> Message-ID: <20031010003451.83211.qmail@web14915.mail.yahoo.com> --- John Dennis wrote: > You didn't say on what type and version the system is, which is > important because things like log file rollovers aren't really part of > mailman proper but rather are done as part of the system specific Sorry about that.
I'm on a debian box running 'debian testing (sarge)' uname -a: Linux pi 2.4.21 #2 Tue Jul 15 21:49:17 PDT 2003 i686 GNU/Linux The installed debianized mailman package info, Package: mailman Status: install ok installed Installed-Size: 21352 Maintainer: Tollef Fog Heen Version: 2.1.2-6 > installation. I do know that for a while we (Red Hat) had a bug with log > file rotation in our RPM, that was fixed about 6 months ago. However the > bug I'm aware of does not seem like what you are describing. The bug we > fixed would keep generating new files with an extra digit appended to it > till it eventually filled the file system, e.g. the file log would > become log.1, then log.1.1, then log.1.1.1 rather than log.2. Exact > details may vary as I'm going from memory. > > Best thing for you to do is let us know the exact type of system, the > contents of /etc/logrotate.conf and /etc/logrotate.d/mailman (assuming > you have a system with this version of logrotate. I'm attaching the logrotate files per your suggestion. Nothing looked out of place to me. The file is being created correctly, I'm just surprised that mailman would even touch a non-existent as far as its concerned (mailman know not about post.1 and logrotate) unless the file-handler is kept in a constant open state by mailman or something. Let me know what you find out. Thanks. - Nadim __________________________________ Do you Yahoo!? The New Yahoo! Shopping - with improved product search http://shopping.yahoo.com -------------- next part -------------- A non-text attachment was scrubbed... 
Name: logrotate.tar.gz Type: application/gzip Size: 1279 bytes Desc: logrotate.tar.gz Url : http://mail.python.org/pipermail/mailman-developers/attachments/20031009/637e6762/logrotate.tar.bin From brad.knowles at skynet.be Thu Oct 9 20:43:10 2003 From: brad.knowles at skynet.be (Brad Knowles) Date: Thu Oct 9 21:10:31 2003 Subject: [Mailman-Developers] Log file rollovers In-Reply-To: <20031010003451.83211.qmail@web14915.mail.yahoo.com> References: <20031010003451.83211.qmail@web14915.mail.yahoo.com> Message-ID: At 5:34 PM -0700 2003/10/09, Nadim Shaikli wrote: > Nothing looked out of place to me. The file is being created correctly, > I'm just surprised that mailman would even touch a non-existent as far as > its concerned (mailman know not about post.1 and logrotate) unless the > file-handler is kept in a constant open state by mailman or something. When the file is renamed, the inode number and internal file handle do not change. Mailman opens the file on start, and doesn't close it. The file gets renamed, but this change is not relevant to mailman -- it just keeps using the filehandle it's already got. This is why you have to use mailmanctl to stop and restart mailman. -- Brad Knowles, "They that can give up essential liberty to obtain a little temporary safety deserve neither liberty nor safety." -Benjamin Franklin, Historical Review of Pennsylvania. 
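The rename behaviour described above can be reproduced in a few lines of Python on a Unix-like system. This is a sketch only: the temporary directory and the names post and post.1 stand in for Mailman's log directory, the rename-then-create step imitates what a rotation of this kind does, and nothing here touches Mailman or logrotate themselves.

```python
import os
import tempfile

# Hypothetical stand-in for a Mailman log directory.
logdir = tempfile.mkdtemp()
live = os.path.join(logdir, "post")
rotated = os.path.join(logdir, "post.1")

# The "daemon" opens its log once at start-up and keeps the handle.
log = open(live, "a")
log.write("before rotation\n")
log.flush()

# The rotation: rename the old file, then create a fresh empty one.
os.rename(live, rotated)
open(live, "w").close()

# The daemon writes again through the handle it already holds.
log.write("after rotation\n")
log.flush()

rotated_size = os.path.getsize(rotated)   # both lines ended up in post.1
live_size = os.path.getsize(live)         # the new post file is still empty

# Reopening by name, which is what a restart effectively does, fixes it.
log.close()
log = open(live, "a")
log.write("after reopen\n")
log.close()
reopened_size = os.path.getsize(live)

print(rotated_size, live_size, reopened_size)
```

The open handle follows the file rather than its name, so writes keep landing in post.1 until the process opens post again.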
GCS/IT d+(-) s:+(++)>: a C++(+++)$ UMBSHI++++$ P+>++ L+ !E-(---) W+++(--) N+ !w--- O- M++ V PS++(+++) PE- Y+(++) PGP>+++ t+(+++) 5++(+++) X++(+++) R+(+++) tv+(+++) b+(++++) DI+(++++) D+(++) G+(++++) e++>++++ h--- r---(+++)* z(+++) From barry at python.org Fri Oct 10 00:31:36 2003 From: barry at python.org (Barry Warsaw) Date: Fri Oct 10 00:31:45 2003 Subject: [Mailman-Developers] Problem with MM after power outage In-Reply-To: <20030915185033.GE27313@lenin.nu> References: <1061330802.1131.18.camel@geddy> <200308200045.23234.pioppo@ferrara.linux.it> <1063373997.19907.44.camel@anthem> <1063384531.19907.67.camel@anthem> <87wuce6j47.fsf@athene.jamux.com> <20030912211425.GK5719@lenin.nu> <1063403760.19907.93.camel@anthem> <20030915185033.GE27313@lenin.nu> Message-ID: <1065760296.23296.6.camel@anthem> On Mon, 2003-09-15 at 14:50, Peter C. Norton wrote: > On Fri, Sep 12, 2003 at 05:56:00PM -0400, Barry Warsaw wrote: > > On Fri, 2003-09-12 at 17:40, Harald Meland wrote: > > > Hence, I think it makes more sense to have the default be "do > > > fsync(2)", and let any performance-conscious site decide whether it > > > wants to explicitly value performance over safety. > > > > Except that when I did some very simple tests, I saw a 97% hit in > > performance with fsync turned on. This on a RH9, ext3 Linux box of the > > Dell Optiplex variety. That makes me very nervous to add in a patch > > release that won't have any beta testing. I've also never seen the bug > > on python.org, which may or may not be representative of the world at > > large. > > Wow. 97%? That's way too high. I'd expect about 50% at worst - for > the extra sync to disk when it enters mailman's queue and one more to > flush the message when its made it though the outbound queue to the > MTA. This is just a question, because I still don't know much about > the mm 2.1 internals, but is there a chance you're sync()'ing more > often then you need? It's possible -- I don't have my test script any more. 
I just added a flush before closing the config.pck file and I think that will help much more than the sync. IIUC, there's really only a narrow window of opportunity for corruption that sync will solve, and if you're worried about that, you really should be on a UPS and possibly a sync'ing file system. -Barry -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 307 bytes Desc: This is a digitally signed message part Url : http://mail.python.org/pipermail/mailman-developers/attachments/20031010/c8416863/attachment.bin From terri at zone12.com Fri Oct 10 12:46:55 2003 From: terri at zone12.com (Terri Oda) Date: Fri Oct 10 12:47:35 2003 Subject: [Mailman-Developers] Manually marking something as bouncing? Message-ID: <20031010164655.GA892@ostraya.zone12.com> I know, I should probably be able to find this, but for some reason it's eluding me... Is there some way in which people can manually be marked as bouncing so that the "are you there?" notices still go out? We get a fair number of uncaught bounces and it'd be handy to be able to do that rather than no-mailing the person. Alternatively, how hard is it to add new bounce forms to mailman? This probably won't help most of my admins, since I'm one of very few with the access to the box that I suspect is necessary, but at least I could add in the ones that we see more frequently. Terri From wheakory at isu.edu Fri Oct 10 12:59:13 2003 From: wheakory at isu.edu (Kory Wheatley) Date: Fri Oct 10 12:59:18 2003 Subject: [Mailman-Developers] Bug in mailman 2.1 with unsubscribe Message-ID: <3F86E561.7060503@isu.edu> Is there a bug in Mailman 2.1? When I try to use the global password, I cannot unsubscribe subscribers from a mailing list using an email command. I've tried the following (I was able to do this in past versions of Mailman).
I sent a request to "test-request@mm.isu.edu" with the following unsubscribe globalpassword address=emailaccount@isu.edu end -- Kory Wheatley Academic Computing Analyst Sr. Phone 282-3874 ######################################### Everything must point to him. From barry at python.org Fri Oct 10 14:35:22 2003 From: barry at python.org (Barry Warsaw) Date: Fri Oct 10 14:35:29 2003 Subject: [Mailman-Developers] Manually marking something as bouncing? In-Reply-To: <20031010164655.GA892@ostraya.zone12.com> References: <20031010164655.GA892@ostraya.zone12.com> Message-ID: <1065810921.23296.910.camel@anthem> On Fri, 2003-10-10 at 12:46, Terri Oda wrote: > I know, I should probably be able to find this, but for some reason it's > eluding me... > > Is there some way in which people can manually be marked as bouncing so that > the "are you there?" notices still go out? We get a fair number of uncaught > bounces and it'd be handy to be able to do that rather than no-mailing the > person. You can't really do this through the web, but it wouldn't be hard to hack a bin/withlist script to do it. Add this to the 2.2 wiki and I'll think about it. > Alternatively, how hard is it to add new bounce forms to mailman? This > probably won't help most of my admins, since I'm one of very few with the > access to the box that I suspect is necessary, but at least I could add in > the ones that we see more frequently. It's not that hard, but to be honest, I'm pretty unmotivated to add new bounce detectors, since verp is so sweet. I know, lots of folks have sent me uncaught bounce formats and such, and I really should add detectors for them. (Bad project leader, bad!) If you do hack some new ones out, upload them as patches to SF. I'll do a sweep of that stuff for 2.2. -Barry -------------- next part -------------- A non-text attachment was scrubbed... 
Name: not available Type: application/pgp-signature Size: 307 bytes Desc: This is a digitally signed message part Url : http://mail.python.org/pipermail/mailman-developers/attachments/20031010/812e5300/attachment.bin From barry at python.org Fri Oct 10 15:09:20 2003 From: barry at python.org (Barry Warsaw) Date: Fri Oct 10 15:09:29 2003 Subject: [Mailman-Developers] CVS re-organization for MM 2.2 Message-ID: <1065812960.23296.981.camel@anthem> I'm starting to work on straight up MM 2.2 features, which means that the head and the -maint branch are going to start diverging. This will be especially painful for the i18n'ers because I think it'll start to be nearly impossible to backport .po file changes. I've sent a separate message to mailman-i18n outlining my hope that we can join the Translation Project [1]. For these and other reasons, I'm probably going to do some reorganization of the CVS repository (probably forgoing the revision history -- ah Subversion :). One of the planned changes is to move some of the top-level directories out into a separate CVS module. I'm pretty sure I want to move admin, but there may be others. I'm probably also going to move the READMEs into doc. -Barry -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 307 bytes Desc: This is a digitally signed message part Url : http://mail.python.org/pipermail/mailman-developers/attachments/20031010/62a6ec2f/attachment.bin From mlucas at rice.edu Fri Oct 10 15:18:04 2003 From: mlucas at rice.edu (Mike Lucas) Date: Fri Oct 10 15:16:21 2003 Subject: [Mailman-Developers] question about passwords. Message-ID: <3F8705EC.7000403@rice.edu> If I make a bunch of lists using the newlist script with the -q option to not notify the owner, is there a way to send the password to the owner after the fact? The reason I am asking is that we are migrating from Lsoft to mailman and are moving all our lists over to mailman.
In the process we would like to add the lists and then use the config_list script to carry over all the settings of each list, along with all the owners/moderators. After we have it all moved over to the new system, we then want to have an email sent to all the owners/moderators that tells them their password for administering their list. The thing is that the newlist script makes the list with 1 owner. Then we use config_list to add all the other owners and moderators. I do not think that when I use the config_list script it will send an email to all the owners/moderators with a password to use. Is there a way to do this that I am not seeing? Any info would be great. Thanks, Mike From mlucas at rice.edu Fri Oct 10 15:21:53 2003 From: mlucas at rice.edu (Mike Lucas) Date: Fri Oct 10 15:20:10 2003 Subject: [Mailman-Developers] mailman and SSL. Message-ID: <3F8706D1.9000104@rice.edu> Has anyone made Mailman's web interface use SSL? Any info about how you did it would be great. Thanks, Mike From stijn at win.tue.nl Fri Oct 10 15:27:23 2003 From: stijn at win.tue.nl (Stijn Hoop) Date: Fri Oct 10 15:27:17 2003 Subject: [Mailman-Developers] mailman and SSL. In-Reply-To: <3F8706D1.9000104@rice.edu> References: <3F8706D1.9000104@rice.edu> Message-ID: <20031010192723.GD79376@pcwin002.win.tue.nl> On Fri, Oct 10, 2003 at 02:21:53PM -0500, Mike Lucas wrote: > has anyone made Mailman's web interface use SSL? > > Any info about how you did it would be great.
I have:
    DEFAULT_URL_PATTERN = 'https://%s/mailman/'
    PUBLIC_ARCHIVE_URL = 'https://%(hostname)s/archive/%(listname)s'
in mm_cfg.py, and
    ScriptAlias /mailman/ /local/mailman/mailman/cgi-bin/
    Options FollowSymLinks ExecCGI
    AllowOverride None
    Order allow,deny
    Allow from all
    Alias /archive/ /local/mailman/mailman/archives/public/
in ssl.conf for Apache 2, using the FreeBSD port, mailman installed to /local/mailman/mailman. I forgot if I had to configure more, but I think not.
HTH, --Stijn -- There are of course many problems connected with life, of which some of the most popular are 'Why are people born?', 'Why do they die?', and `Why do they spend so much of the intervening time wearing digital watches?' -- Douglas Adams, "The Hitchhikers Guide To The Galaxy" -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 187 bytes Desc: not available Url : http://mail.python.org/pipermail/mailman-developers/attachments/20031010/32a81113/attachment.bin From barry at python.org Fri Oct 10 18:10:33 2003 From: barry at python.org (Barry Warsaw) Date: Fri Oct 10 18:10:39 2003 Subject: [Mailman-Developers] question about passwords. In-Reply-To: <3F8705EC.7000403@rice.edu> References: <3F8705EC.7000403@rice.edu> Message-ID: <1065823833.23296.984.camel@anthem> On Fri, 2003-10-10 at 15:18, Mike Lucas wrote: > If I make a bunch of lists using the newlist script with the -q option > to not notify the owner. Is there a way to after the fact send the > password to the owner? The reason i am asking is we are migrating from > Lsoft to mailman and are moving all our lists over to mailman. In the > process we would like to add the lists and then use the config_list > script to add all the settings of each list over with all the > owners/moderators. After we have it all moved over to the new system we > then want to have an email sent to all the owners/moderators that tells > them thier password for administrating thier list. > > The thing is that the newlist script makes the list with 1 owner. Then > we use the Config_list to add all the other owners and moderators. I do > not think that when i use the config_list script it will send an email > to all the owners/moderators with a password to use. > > Is there a way to do this i am not seeing? any info would be great. Not really, and yes, this sucks. 
The problem is that the admin password is not kept in the clear, so there's no way to recreate it after the fact. The best you can do is to write a little bin/withlist script to reset the admin password and mail it to the list owners. You'd run this after you ran config_list. -Barry -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 307 bytes Desc: This is a digitally signed message part Url : http://mail.python.org/pipermail/mailman-developers/attachments/20031010/1e8ad758/attachment.bin From lj at mandala-designs.com Sun Oct 12 21:53:11 2003 From: lj at mandala-designs.com (ljacobs ) Date: Sun Oct 12 21:53:18 2003 Subject: [Mailman-Developers] moderator bit Message-ID: <200310122153.AA111214694@mandala-designs.com> I am trying to use Mailman 2.1.3 on a FreeBSD 4.8 system, python 2.2.2. However, with any lists I create the moderator bit is off be default. I can change the bit to "moderator on" but even after doing that, the membership page changes back to "moderator off". Is this a feature or some bug? Thanks. ________________________________________________________________ Sent via the WebMessaging system at mandala-designs.com --- [This E-mail scanned for viruses by Declude Virus] From RabbiSim at earthlink.net Mon Oct 13 14:48:20 2003 From: RabbiSim at earthlink.net (Rabbi Simcha Backman) Date: Mon Oct 13 14:48:33 2003 Subject: [Mailman-Developers] Mailman Integration Message-ID: <002801c391ba$92f4b0c0$6601a8c0@smile> Hi there, I'm searching for an experienced programmer who is familiar with mailman and its code and can integrate it entirely into our site. I'm willing to hire either on an hourly basis or for the job. 
Please contact me directly at: sbackman@askmoses.com Thanks Simcha Backman From shaikli at yahoo.com Mon Oct 13 15:20:26 2003 From: shaikli at yahoo.com (Nadim Shaikli) Date: Mon Oct 13 15:20:30 2003 Subject: [Mailman-Developers] Log file rollovers In-Reply-To: Message-ID: <20031013192026.54511.qmail@web14901.mail.yahoo.com> --- Brad Knowles wrote: > At 5:34 PM -0700 2003/10/09, Nadim Shaikli wrote: > > >Nothing looked out of place to me. The file is being created > >correctly, I'm just surprised that mailman would even touch a > >non-existent as far asits concerned (mailman know not about > >post.1 and logrotate) unless the file-handler is kept in a > >constant open state by mailman or something. > > When the file is renamed, the inode number and internal file handle > do not change. Mailman opens the file on start, and doesn't close it. > The file gets renamed, but this change is not relevant to mailman -- it > just keeps using the filehandle it's already got. This is why you have > to use mailmanctl to stop and restart mailman. Well, if the statement "[MM] opens the file on start, and doesn't close it" is true, then it sounds very worrying. Shouldn't Mailman open/close upon prompting instead (ie. upon arrival of mail and/or when prompted for action) ? Is this a big deal to change/fix ? Regards, - Nadim __________________________________ Do you Yahoo!? The New Yahoo! Shopping - with improved product search http://shopping.yahoo.com From r.barrett at openinfo.co.uk Mon Oct 13 15:35:40 2003 From: r.barrett at openinfo.co.uk (Richard Barrett) Date: Mon Oct 13 15:35:50 2003 Subject: [Mailman-Developers] Log file rollovers In-Reply-To: <20031013192026.54511.qmail@web14901.mail.yahoo.com> Message-ID: <6DDB1A48-FDB4-11D7-B141-000A957C9A50@openinfo.co.uk> On Monday, October 13, 2003, at 08:20 pm, Nadim Shaikli wrote: > --- Brad Knowles wrote: >> At 5:34 PM -0700 2003/10/09, Nadim Shaikli wrote: >> >>> Nothing looked out of place to me. 
The file is being created >>> correctly, I'm just surprised that mailman would even touch a >>> non-existent as far as its concerned (mailman know not about >>> post.1 and logrotate) unless the file-handler is kept in a >>> constant open state by mailman or something. >> >> When the file is renamed, the inode number and internal file handle >> do not change. Mailman opens the file on start, and doesn't close it. >> The file gets renamed, but this change is not relevant to mailman -- it >> just keeps using the filehandle it's already got. This is why you have >> to use mailmanctl to stop and restart mailman. > > Well, if the statement "[MM] opens the file on start, and > doesn't close it" is true, then it sounds very worrying. Why? > Shouldn't Mailman open/close upon prompting instead > (ie. upon arrival of mail and/or when prompted for action) ? > Why? > Is this a big deal to change/fix ? > Why change it? > Regards, > > - Nadim > ----------------------------------------------------------------------- Richard Barrett http://www.openinfo.co.uk From h.rauch at help.hessen.de Mon Oct 13 02:23:26 2003 From: h.rauch at help.hessen.de (Hans Rauch) Date: Tue Oct 14 11:26:57 2003 Subject: [Mailman-Developers] umlaut-problem? Message-ID: <1066026205.2591.18.camel@hrauch.intranet> Hi, you told us to send the code snippet. Here it is: Bug in Mailman version 2.1.3 We're sorry, we hit a bug! If you would like to help us identify the problem, please email a copy of this page to the webmaster for this site with a description of what happened. Thanks!
Traceback: Traceback (most recent call last): File "/usr/local/mailman/scripts/driver", line 87, in run_main main() File "/usr/local/mailman/Mailman/Cgi/admin.py", line 192, in main show_results(mlist, doc, category, subcat, cgidata) File "/usr/local/mailman/Mailman/Cgi/admin.py", line 491, in show_results form.AddItem(membership_options(mlist, subcat, cgidata, doc, form)) File "/usr/local/mailman/Mailman/Cgi/admin.py", line 799, in membership_options all = [_m.encode() for _m in mlist.getMembers()] UnicodeError: ASCII decoding error: ordinal not in range(128) ________________________________________________________________________ Python information: Variable Value sys.version 2.1.3 (#1, May 5 2002, 12:40:43) [GCC 2.95.3 20010315 (SuSE)] sys.executable /usr/local/bin/python sys.prefix /usr/local sys.exec_prefix /usr/local sys.path /usr/local sys.platform linux2 ________________________________________________________________________ Environment variables: Variable Value DOCUMENT_ROOT /home/www/doc SERVER_ADDR 192.168.4.200 HTTP_ACCEPT_ENCODING gzip,deflate,compress;q=0.9 SERVER_PORT 80 PATH_TRANSLATED /home/www/doc/medien-givb/members REMOTE_ADDR 80.128.228.79 SERVER_SOFTWARE Apache/1.3.26 (Unix) PHP/4.2.2 UNIQUE_ID P4pEIsCoBMgAAAIOCIg HTTP_ACCEPT_LANGUAGE de,en;q=0.5 REMOTE_PORT 32986 SERVER_NAME komm.bildung.hessen.de HTTP_CONNECTION keep-alive HTTP_USER_AGENT Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.3.1) Gecko/20030425 HTTP_ACCEPT_CHARSET ISO-8859-1,utf-8;q=0.7,*;q=0.7 HTTP_ACCEPT text/xml,application/xml,application/xhtml+xml,text/html;q=0.9,text/plain;q=0.8,video/x-mng,image/png,image/jpeg,image/gif;q=0.2,*/*;q=0.1 REQUEST_URI /mailman/admin/medien-givb/members QUERY_STRING SCRIPT_FILENAME /usr/local/mailman/cgi-bin/admin HTTP_KEEP_ALIVE 300 HTTP_HOST mailman.bildung.hessen.de REQUEST_METHOD GET SERVER_SIGNATURE Apache/1.3.26 Server at komm.bildung.hessen.de Port 80 SCRIPT_NAME /mailman/admin SERVER_ADMIN h.rauch@help.hessen.de GATEWAY_INTERFACE CGI/1.1 
PYTHONPATH /usr/local/mailman PATH_INFO /medien-givb/members HTTP_COOKIE test+admin=28020000006929438a3f732800000035396334333234633134353661373335326232386161656661653131333135666437396432653934; medien-givb+admin=28020000006920448a3f732800000034373262383065353230326261363035373163306636353666346635636333376438323434643561 SERVER_PROTOCOL HTTP/1.1 HTTP_REFERER http://mailman.bildung.hessen.de/mailman/admin/medien-givb
-- ----------------------------------- Hessisches Landesinstitut für Pädagogik (HeLP) Hans Rauch Stuttgarter Straße 18-24, 60329 Frankfurt/M Fon 069 / 389 89 223 Fax 069 / 389 89 222 h.rauch@help.hessen.de ----------------------------------- From mike at logomanager.co.uk Tue Oct 14 08:11:10 2003 From: mike at logomanager.co.uk (Mike Bradley) Date: Tue Oct 14 11:27:58 2003 Subject: [Mailman-Developers] Serious I/O contention issue Message-ID: <000001c3924c$41808f60$6501a8c0@Mike> I have a rather large mailing list (35000 members) which is used for announcements only, but I have been experiencing terrible performance problems when the qrunner is active. The mail server is Postfix, Mailman is version 2.1.3 (previous versions showed the same problems). When I send an outgoing mail, the qrunner processes take up all my CPU and the hard disk becomes inaccessible due to constant IO.
Even when there is very little activity in the list and the queues are cleared, the server locks up for several seconds every time an operation is performed. I have not had much experience with Python, but have tried to track down what might be causing this. It seems that the OutgoingRunner and VirginRunner in particular are causing the worst of the problems, specifically during load and save of the mlist. It appears that the list is being reloaded or resaved from disk after EVERY operation (so rendering useless the handy looking caching of the list in the main Runner class). I am not familiar with the exact way the application operates, but are the mlist arrays shared/marshalled across the qrunner processes via the _listcache? If so, then would it be possible to eliminate the mlist.Load() in the OutgoingRunner and implement delayed saving in the mailing list class so that Save() didn't cause a write to disk each time? For the moment, I have had to shut down my list completely as it is chewing my server! Mike -- Make more of your Nokia Phone with LogoManager http://www.logomanager.co.uk From brad.knowles at skynet.be Tue Oct 14 18:14:53 2003 From: brad.knowles at skynet.be (Brad Knowles) Date: Tue Oct 14 18:49:42 2003 Subject: [Mailman-Developers] Serious I/O contention issue In-Reply-To: <000001c3924c$41808f60$6501a8c0@Mike> References: <000001c3924c$41808f60$6501a8c0@Mike> Message-ID: At 1:11 PM +0100 2003/10/14, Mike Bradley wrote: > I have a rather large mailing list (35000 members) which is used for > announcements only, but I have been experiencing terrible performance > problems when the qrunner is active. The mail server is Postfix, > Mailman is version 2.1.3 (previous versions showed the same problems). Just checking, but have you seen the following FAQ entries? See: I figure you probably have already seen them, but I wanted to be sure. 
> When I send an outgoing mail, the qrunner processes take up all my CPU > and the hard disk becomes inaccessible due to constant IO. Even when > there is very little activity in the list and the queues are cleared, > the server locks up for several seconds every time an operation is > performed. What about the machine you're doing this on? The filesystem? How is postfix configured? If you read Nick Christenson's book _Sendmail Performance Tuning_, please note that many of the criteria are also applicable to other MTAs. See for more info. Disclaimer: Nick was my co-author for a talk I gave previously (see ), and I was a technical reviewer of this book. Clearly, there may well be lots of opportunity here to tune your filesystem for maximum performance. > I am not familiar with the exact way the application operates, but are > the mlist arrays shared/marshalled across the qrunner processes via the > _listcache? If so, then would it be possible to eliminate the > mlist.Load() in the OutgoingRunner and implement delayed saving in the > mailing list class so that Save() didn't cause a write to disk each > time? Depending on what your "SMTP_MAX_RCPTS" value is set to, I would imagine that this should be loaded and saved each time a message is passed from your mailman qrunner to postfix. The higher SMTP_MAX_RCPTS, the less often this process should occur. Of course, others have found that SMTP_MAX_RCPTS should typically be set somewhere between 2 and 10 (usually ~5) for best overall performance (see the FAQ entries above). -- Brad Knowles, "They that can give up essential liberty to obtain a little temporary safety deserve neither liberty nor safety." -Benjamin Franklin, Historical Review of Pennsylvania. 
GCS/IT d+(-) s:+(++)>: a C++(+++)$ UMBSHI++++$ P+>++ L+ !E-(---) W+++(--) N+ !w--- O- M++ V PS++(+++) PE- Y+(++) PGP>+++ t+(+++) 5++(+++) X++(+++) R+(+++) tv+(+++) b+(++++) DI+(++++) D+(++) G+(++++) e++>++++ h--- r---(+++)* z(+++) From shaikli at yahoo.com Tue Oct 14 20:19:20 2003 From: shaikli at yahoo.com (Nadim Shaikli) Date: Tue Oct 14 20:19:24 2003 Subject: [Mailman-Developers] Log file rollovers In-Reply-To: <6DDB1A48-FDB4-11D7-B141-000A957C9A50@openinfo.co.uk> Message-ID: <20031015001920.8804.qmail@web14903.mail.yahoo.com> --- Richard Barrett wrote: > > On Monday, October 13, 2003, at 08:20 pm, Nadim Shaikli wrote: > > --- Brad Knowles wrote: > >> At 5:34 PM -0700 2003/10/09, Nadim Shaikli wrote: > >> > >>> Nothing looked out of place to me. The file is being created > >>> correctly, I'm just surprised that mailman would even touch a > >>> non-existent as far asits concerned (mailman know not about > >>> post.1 and logrotate) unless the file-handler is kept in a > >>> constant open state by mailman or something. > >> > >> When the file is renamed, the inode number and internal file handle > >> do not change. Mailman opens the file on start, and doesn't close it. > >> The file gets renamed, but this change is not relevant to mailman -- > >> it just keeps using the filehandle it's already got. This is why you > >> have to use mailmanctl to stop and restart mailman. > > > > Well, if the statement "[MM] opens the file on start, and > > doesn't close it" is true, then it sounds very worrying. > > Why? Because it causes the problem stated earlier in this thread. > > Shouldn't Mailman open/close upon prompting instead > > (ie. upon arrival of mail and/or when prompted for action) ? > > Why? See answer to above question - logrotate would have no affect on mailman and because I don't want to kill and restart mailman every start of month (or so) to initiate a new group of log files. > > Is this a big deal to change/fix ? > > > > Why change it? 
I can see all your questions are leading to the same point. I highly recommend you read the problem statement on this thread. It seems to me that logrotate (or any other tool) should simply be able to move files for archiving reasons; mailman currently doesn't allow that because of the behaviour described in the statement above. 1. Is the "[MM] opens the file on start, and doesn't close it" statement true? 2. If so, was it done like that on purpose, and why? Regards, - Nadim From brad.knowles at skynet.be Tue Oct 14 20:42:25 2003 From: brad.knowles at skynet.be (Brad Knowles) Date: Tue Oct 14 20:42:36 2003 Subject: [Mailman-Developers] Log file rollovers In-Reply-To: <20031015001920.8804.qmail@web14903.mail.yahoo.com> References: <20031015001920.8804.qmail@web14903.mail.yahoo.com> Message-ID: At 5:19 PM -0700 2003/10/14, Nadim Shaikli wrote: >> > Well, if the statement "[MM] opens the file on start, and >> > doesn't close it" is true, then it sounds very worrying. >> >> Why? > > Because it causes the problem stated earlier in this thread. This is the standard way all programs work under Unix -- you open the file, you get a filehandle. With the filehandle open, the file can be renamed, moved, or anything else -- the original process will continue writing to the filehandle it has open, and those inodes will continue to be in use. This is a common cause of "missing" space on a filesystem -- when you use "du" to show the amount of space currently in use, it doesn't match what "df" shows, and the difference is due to filehandles that are still being held open but for which there are no known files visible from any directory listing that currently references those inodes.
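The rename behaviour described above can be reproduced with a few lines of Python. This is only a sketch under stated assumptions: it is not Mailman's logging code, the file names `post` and `post.1` are borrowed from the thread purely for illustration, and it assumes a POSIX filesystem where an open file may be renamed.

```python
import os
import tempfile


def demo_rename_keeps_handle():
    """Show that an open file handle follows the inode, not the file name."""
    workdir = tempfile.mkdtemp()
    log_path = os.path.join(workdir, "post")        # hypothetical log name
    rotated_path = os.path.join(workdir, "post.1")  # name a rotation tool might give it

    log = open(log_path, "a")
    log.write("before rotation\n")
    log.flush()

    # What a log rotation tool does: rename the file out from under the writer.
    os.rename(log_path, rotated_path)

    # The handle still refers to the same inode, so this line lands in post.1.
    log.write("after rotation, same handle\n")
    log.flush()

    # Only closing and reopening by name makes the writer use a fresh file.
    log.close()
    log = open(log_path, "a")
    log.write("after reopen\n")
    log.close()

    with open(rotated_path) as f:
        rotated = f.read().splitlines()
    with open(log_path) as f:
        current = f.read().splitlines()
    return rotated, current


rotated, current = demo_rename_keeps_handle()
```

Both lines written through the original handle end up in the renamed file; the new `post` only receives data once the writer has reopened it by name.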
> See answer to above question - logrotate would have no affect on > mailman and because I don't want to kill and restart mailman > every start of month (or so) to initiate a new group of log files. This is the way Unix works. Get used to it. > I can see all your questions are leading to the same point. I > highly recommend you read the problem statement on this thread. See above. > It seems to me that logrotate (or any other tool) should simply > be able to move files for archiving reasons, mailman currently > doesn't allow that due to the perceived statement noted above. Get to know how the OS works. Mailman could only solve this problem by opening the file for every single write and then closing it immediately thereafter. Believe me, this would kill your performance so much that it simply is not practical. -- Brad Knowles, "They that can give up essential liberty to obtain a little temporary safety deserve neither liberty nor safety." -Benjamin Franklin, Historical Review of Pennsylvania. GCS/IT d+(-) s:+(++)>: a C++(+++)$ UMBSHI++++$ P+>++ L+ !E-(---) W+++(--) N+ !w--- O- M++ V PS++(+++) PE- Y+(++) PGP>+++ t+(+++) 5++(+++) X++(+++) R+(+++) tv+(+++) b+(++++) DI+(++++) D+(++) G+(++++) e++>++++ h--- r---(+++)* z(+++) From colinp at waikato.ac.nz Tue Oct 14 20:55:29 2003 From: colinp at waikato.ac.nz (Colin Palmer) Date: Tue Oct 14 20:55:34 2003 Subject: [Mailman-Developers] Log file rollovers In-Reply-To: <20031013192026.54511.qmail@web14901.mail.yahoo.com> References: <20031013192026.54511.qmail@web14901.mail.yahoo.com> Message-ID: <1066179329.1397.53.camel@firefox.cc.waikato.ac.nz> On Tue, 2003-10-14 at 08:20, Nadim Shaikli wrote: > Well, if the statement "[MM] opens the file on start, and > doesn't close it" is true, then it sounds very worrying. > Shouldn't Mailman open/close upon prompting instead > (ie. upon arrival of mail and/or when prompted for action) ? 
It can be told to do so manually: 'mailmanctl reopen'. (Making logrotate call this after archiving mailman's logs is left as an exercise for the reader; there should be plenty of examples to copy from.) -- Colin Palmer University of Waikato, ITS Division From r.barrett at openinfo.co.uk Wed Oct 15 07:54:59 2003 From: r.barrett at openinfo.co.uk (Richard Barrett) Date: Wed Oct 15 07:55:39 2003 Subject: [Mailman-Developers] Re: [Mailman-Users] Zest as an Archiver In-Reply-To: <3F8D19E3.5070506@student.umist.ac.uk> Message-ID: <67440EFC-FF06-11D7-B141-000A957C9A50@openinfo.co.uk> Hi Iain On Wednesday, October 15, 2003, at 10:56 am, Iain Bapty wrote: > Hey, > > I'm a 3rd year Computer Science student at UMIST in Manchester, UK > just starting my 3rd year project. My project is to create a new > archiver component for Mailman based on Zest. Sounds like a project that could be of interest to the Mailman community and certainly to me. That said, I could not find out much about Zest from the sourceforge project page but presumably you have better access. > I'm posting this message to both User and Developer lists as I would > appreciate feedback from as many people as possible (for my > requirements capture stage). > > I have a number of questions > > * What problems exist with the Pipermail Archiver? I have contributed patches to tightly integrate HTdig with MM/pipermail for archive search, and MHonArc with MM/pipermail for HTML archive index and message page generation, which should give you some idea of what my interests are. These patches are no more than stop-gaps to provide a better Mailman-based solution pending a replacement archiver. But that said, you may find the installation notes associated with sourceforge patches #444879, #444884 and #820723 identify some of the deficiencies in Mailman's pipermail archiver.
You can find these patches on sourceforge or on my own site at: http://www.openinfo.co.uk/mailman/index.html The archiver class structure and code are not that bad, but they are not that good either. The whole thing is a series of interlocked classes calling back and forth, which makes code comprehension, maintenance and enhancement a real problem for me. I regularly trip over aspects of the class partitioning and if it were not written in Python (which helps my comprehension tremendously compared with Perl, C, C++, Java, etc) I would have given up a long time ago. The interfaces to allow/facilitate integration of third party elements, such as a search engine or an alternative HTML page generator to Mailman's builtin archiver, are limited to non-existent; these elements must be either very loosely associated or the core pipermail/Mailman code has to be hacked, fairly brutally in my case because I am a clumsy person. I think my primary criticism is the lack of decoupling of the elements that comprise the archiving facility as a whole; top level archive organisation and management versus archive HTML page generation for instance. The per-list options for archive organisation into yearly/monthly/weekly/daily periods appropriate for each list are a good feature of pipermail and should be a design objective for its replacement. This should be related to my comment below on archive aging and related maintenance. > * What features would you like to see in a new Archiver? A properly decoupled design based on a sound class structure which is specifically designed to allow extensibility by third party code. I am thinking here of being able to use sub-classing and/or registration of callback functions to a well defined framework to add extensions to the base capability.
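To make the callback-registration idea a little more concrete, here is a minimal sketch of what such a framework might look like. Everything in it is hypothetical: `ArchiverHooks`, the `"pre_store"` stage name and the two example callbacks are invented for illustration and do not exist in Mailman or pipermail.

```python
from collections import defaultdict


class ArchiverHooks:
    """Hypothetical registry of third-party callbacks, keyed by archiving stage."""

    def __init__(self):
        self._hooks = defaultdict(list)

    def register(self, stage, callback):
        """Add a callback to be run, in registration order, at the given stage."""
        self._hooks[stage].append(callback)

    def run(self, stage, message):
        """Pass the message through every callback registered for the stage."""
        for callback in self._hooks[stage]:
            message = callback(message)
        return message


# Two illustrative extensions: one cleans the message, one feeds a search index.
search_index = []


def normalise_subject(message):
    cleaned = dict(message)
    cleaned["subject"] = cleaned["subject"].strip()
    return cleaned


def index_for_search(message):
    search_index.append(message["subject"].lower())
    return message


hooks = ArchiverHooks()
hooks.register("pre_store", normalise_subject)
hooks.register("pre_store", index_for_search)
archived = hooks.run("pre_store", {"subject": "  Zest as an Archiver "})
```

The point of the design is that a search engine or an alternative page generator plugs in through `register()` without the core archiver code being touched.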
While it has its own issues, the model of registered callbacks handling different aspects of the transaction lifecycle used by Apache modules is interesting and demonstrably effective in allowing an open-ended and extensible solution; I am commenting on code organisation not implementation language: stick with Python. Full top level management of the archives, through extensions/additions to MM's list admin web GUI would be a win. A significant number of list owners are using Mailman through things like cPanel, in hosted environments, which deny them access to the command line. Some would say that migrating/making available all of MM's command line options through the web admin GUI would be a good thing. A frequently requested feature which is now unavailable is the ability to "edit" the archives to remove general cruft and/or inappropriate postings via the admin web GUI which, at present, can only be done using the command line and external editors. That said, I would want to see such a feature controlled on a per-list basis; some of the lists on a site I manage are effectively used for archiving email for legal purposes and such editing is thus prohibited. A coherent strategy for handling aging of archive content and deletion of material based on per-list criteria would definitely be on my wish list. The overall management of a list's archive is as important as the minutiae of archive page generation. The handling of multipart MIME messages in the HTML archive needs to be improved; MHonArc has a reputation of being better than MM/pipermail in this respect. Any new archiver must aim to produce a unique identifier, invariant over HTML archive rebuilds, for each posting so that externally held references to the mail archives are undisturbed by rebuilding, except where material has been expunged. I consider the private/public archive facility of MM/pipermail to be a 'must have' feature, which must be preserved over archive search. 
The ability to change a list from private to public and vice versa without having to rebuild the archives is important. Archiving must be fast and have sensible performance characteristics when dealing with very large archives. pipermail/MM is weak in this respect. If archives are structured by period, the threading should extend across the period boundaries. A number of people have asked for the ability to ask for an archived posting to be mailed out to them in the same manner as when it was originally distributed to subscribers. Must be a lot more things I want but I'll let you prompt for further input if you want it. You could do worse than take a look through the mailman-users archives for the last 12 months to find a fair number of criticisms/requests regarding MM archiving capability but I guess you already have that in hand. > * Would you be willing for me to email you questions in the future > (not on the groups)? > Fine by me. > Any replies are very much appreciated. > > Thanks a lot > Best of luck. Keep us posted on your progress. > Iain Bapty ----------------------------------------------------------------------- Richard Barrett http://www.openinfo.co.uk From i.bapty at student.umist.ac.uk Wed Oct 15 05:56:51 2003 From: i.bapty at student.umist.ac.uk (Iain Bapty) Date: Wed Oct 15 07:56:38 2003 Subject: [Mailman-Developers] Zest as an Archiver Message-ID: <3F8D19E3.5070506@student.umist.ac.uk> Hey, I'm a 3rd year Computer Science student at UMIST in Manchester, UK just starting my 3rd year project. My project is to create a new archiver component for Mailman based on Zest. I'm posting this message to both User and Developer lists as I would appreciate feedback from as many people as possible (for my requirements capture stage). I have a number of questions * What problems exist with the Pipermail Archiver? * What features would you like to see in a new Archiver? * Would you be willing for me to email you questions in the future (not on the groups)?
Any replies are very much appreciated. Thanks a lot Iain Bapty From barry at python.org Wed Oct 15 07:58:38 2003 From: barry at python.org (Barry Warsaw) Date: Wed Oct 15 07:58:43 2003 Subject: [Mailman-Developers] Zest as an Archiver In-Reply-To: <3F8D19E3.5070506@student.umist.ac.uk> References: <3F8D19E3.5070506@student.umist.ac.uk> Message-ID: <1066219118.17491.24.camel@anthem> On Wed, 2003-10-15 at 05:56, Iain Bapty wrote: > I'm a 3rd year Computer Science student at UMIST in Manchester, UK just > starting my 3rd year project. My project is to create a new archiver > component for Mailman based on Zest. I'm posting this message to both > User and Developer lists as I would appreciate feedback from as many > people as possible (for my requirements capture stage). Folks, please help Iain out. Thanks! -Barry -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 307 bytes Desc: This is a digitally signed message part Url : http://mail.python.org/pipermail/mailman-developers/attachments/20031015/0d00fd1d/attachment.bin From i.bapty at student.umist.ac.uk Wed Oct 15 08:14:52 2003 From: i.bapty at student.umist.ac.uk (Iain Bapty) Date: Wed Oct 15 08:14:55 2003 Subject: [Mailman-Developers] Re: [Mailman-Users] Zest as an Archiver In-Reply-To: <67440EFC-FF06-11D7-B141-000A957C9A50@openinfo.co.uk> References: <67440EFC-FF06-11D7-B141-000A957C9A50@openinfo.co.uk> Message-ID: <3F8D3A3C.8060904@student.umist.ac.uk> Richard Barrett wrote: > Sounds like an project that could be of interest to the Mailman > community and certainly to me. > > That said, I could not find out much about Zest from the sourceforge > project page but presumably you have better access. Ka-Ping Yee's paper on Zest and prototype can be found at http://zesty.ca/zest/ Thanks for you very in-depth reply. I'm sure I will have many more questions once I get into full swing. 
Iain From mike at logomanager.co.uk Wed Oct 15 08:15:35 2003 From: mike at logomanager.co.uk (Mike Bradley) Date: Wed Oct 15 08:15:38 2003 Subject: [Mailman-Developers] Serious I/O contention issue In-Reply-To: Message-ID: <000a01c39316$09ac0d60$6501a8c0@Mike> > -----Original Message----- > From: Brad Knowles [mailto:brad.knowles@skynet.be] > Sent: 14 October 2003 23:15 > > Just checking, but have you seen the following FAQ > entries? See: > > > > > > > I figure you probably have already seen them, but I > wanted to be sure. Thanks Brad, yep, I've gone through these. The problem doesn't *appear* to be with Postfix - as I mentioned, there are delays for every operation, and that includes the web interface and commands that don't involve Postfix. Doing an strace on the Qrunner showed that most of the time was being spent reading and writing the whole list from a file on disk, so I think the bottleneck is there rather than with the mail server. One thing I did notice was that disable_dns_lookups=yes is recommended for performance reasons - surely this would stop Postfix from working altogether, as DNS lookups are needed to send mail (I tried switching this option and mail delivery did indeed stop working!) > What about the machine you're doing this on? The filesystem? > How is postfix configured? > > Clearly, there may well be lots of opportunity here to > tune your > filesystem for maximum performance. This is an area that I have shied away from in the past, as our server is managed, so my knowledge of how to tune the filesystem is very sketchy! I would say that the server is quite busy with lots of database accesses, and there have been no noticable filesystem performance problems in the past, no matter how much load I have put on. It also had no problem dealing with virus scanning and bouncing 10 incoming Slapper viruses per second last month while running the rest of the stuff I have. 
> Depending on what your "SMTP_MAX_RCPTS" value is set > to, I would > imagine that this should be loaded and saved each time a message is > passed from your mailman qrunner to postfix. The higher > SMTP_MAX_RCPTS, the less often this process should occur. Of course, > others have found that SMTP_MAX_RCPTS should typically be set > somewhere between 2 and 10 (usually ~5) for best overall performance > (see the FAQ entries above). The problem is that with a list the size of ours this is causing the system to read and write to the disk almost continuously when more than a few operations of any sort are queued (whether they involve sending mails or not). I mentioned that the qrunners appear to be designed to cache instances of the mlist in memory, but then reload/save to disk every operation regardless. When I commented out the mlist.Load() in the OutgoingRunner inner loop, that particular component zipped along with no performance problems (though my lack of knowledge of Python means that I can't tell if the mlist is shared and marshalled between the qrunners, and thus I don't know if it is valid to skip this reload). If this was indeed a valid thing to do, then it might also be OK to delay Save() operations so that they didn't occur so often (i.e. every N operations, when idle, or on the final release of the mlist). As I said, it APPEARS that the caching of the lists in Runner.py is intended to work this way, but I don't know enough about it to say for sure! Mike From Dan at feld.cvut.cz Wed Oct 15 10:18:27 2003 From: Dan at feld.cvut.cz (Dan Ohnesorg) Date: Wed Oct 15 10:18:32 2003 Subject: [Mailman-Developers] Sending queued messages after a system down Message-ID: <20031015141827.GL1534@ohnesorg.cz> My user has noticed, that Mailman 2.1 sends messages in wrong order, when the messages are sent from queue. My host was down for some time, we have only accepted new messages into mailman queue, mailman process was not running. 
After starting mailman with mailmanctl -s start, the newest messages are sent first and the oldest last. This is not a big problem, but it can probably be fixed easily. cheers dan -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 189 bytes Desc: not available Url : http://mail.python.org/pipermail/mailman-developers/attachments/20031015/8fddb6c7/attachment.bin From mlucas at rice.edu Wed Oct 15 11:27:15 2003 From: mlucas at rice.edu (Mike Lucas) Date: Wed Oct 15 11:27:05 2003 Subject: [Mailman-Developers] spam filtering. Message-ID: <3F8D6753.8090600@rice.edu> I found the bounce_matching_headers option under the spam filters. My question is this: the option lets you hold messages that match the filter you put here, but what if we want it to delete the message automatically instead? We have spam tagging on our mail that puts a tag of ****spam**** in the subject line, and we want to have mailman see that tag and delete the message instead of holding it for moderation. Am I missing a setting? Is this possible? Or do we have to hack the source code? Thanks, Mike From wheakory at isu.edu Wed Oct 15 17:39:43 2003 From: wheakory at isu.edu (Kory Wheatley) Date: Wed Oct 15 17:40:03 2003 Subject: [Mailman-Developers] Bug in mailman 2.1 with unsubscribe Message-ID: <3F8DBE9F.8E1F58A7@isu.edu> I sent this message out once, but no one responded. Is there a bug in mailman 2.1? When I use the global password, I cannot unsubscribe subscribers from a mailing list using an email command, although I was able to do this in past versions of Mailman. I sent a request to "test-request@mm.isu.edu" with the following: unsubscribe globalpassword address=emailaccount@isu.edu end Is there any solution or workaround for this problem? -- Kory Wheatley Academic Computing Analyst Sr. Phone 282-3874 ######################################### Everything must point to him.
From brad.knowles at skynet.be Wed Oct 15 09:06:44 2003 From: brad.knowles at skynet.be (Brad Knowles) Date: Wed Oct 15 17:49:45 2003 Subject: [Mailman-Developers] Re: [Mailman-Users] Zest as an Archiver In-Reply-To: <67440EFC-FF06-11D7-B141-000A957C9A50@openinfo.co.uk> References: <67440EFC-FF06-11D7-B141-000A957C9A50@openinfo.co.uk> Message-ID: At 12:54 PM +0100 2003/10/15, Richard Barrett wrote: > I consider the private/public archive facility of MM/pipermail to be a > 'must have' feature, which must be preserved over archive search. The > ability to change a list from private to public and vice versa without > having to rebuild the archives is important. Today, you can have hidden private lists, but if you know the specific URL to go to, you can sign up for them. It would be nice if we could make the archives hidden but not protected by a password, so that other people could go to the archives and see them, if they were given the proper URL. In "hidden" cases like this, it would also be nice if we could make the URL have a pseudo-random component that would make it more difficult to guess. Right now, if you know the listname (or something close to it) and you know the server, that's all you need. > Must be a lot more things I want but I'll let you prompt for for further > input if you want it. Better handling of foreign languages and MIME, especially non-Roman languages? -- Brad Knowles, "They that can give up essential liberty to obtain a little temporary safety deserve neither liberty nor safety." -Benjamin Franklin, Historical Review of Pennsylvania. 
GCS/IT d+(-) s:+(++)>: a C++(+++)$ UMBSHI++++$ P+>++ L+ !E-(---) W+++(--) N+ !w--- O- M++ V PS++(+++) PE- Y+(++) PGP>+++ t+(+++) 5++(+++) X++(+++) R+(+++) tv+(+++) b+(++++) DI+(++++) D+(++) G+(++++) e++>++++ h--- r---(+++)* z(+++) From brad.knowles at skynet.be Wed Oct 15 17:29:16 2003 From: brad.knowles at skynet.be (Brad Knowles) Date: Wed Oct 15 17:49:47 2003 Subject: [Mailman-Developers] Serious I/O contention issue In-Reply-To: <000a01c39316$09ac0d60$6501a8c0@Mike> References: <000a01c39316$09ac0d60$6501a8c0@Mike> Message-ID: At 1:15 PM +0100 2003/10/15, Mike Bradley wrote: > One thing I did notice was that disable_dns_lookups=yes is recommended > for performance reasons - surely this would stop Postfix from working > altogether, as DNS lookups are needed to send mail (I tried switching > this option and mail delivery did indeed stop working!) I didn't notice that particular configuration option. I can say that you can use different copies of postfix on the server, listening to different IP address/port combinations, and configure one of them to minimize the various checks that it performs for incoming connections, and then configure mailman to use it instead of the other. Best way to do this would be to have the version that minimizes the lookups listen to port 25 on 127.0.0.1, and have the other copy listen to port 25 on the other IP address(es) on the system. This way, incoming connections are still handled correctly through the external copy of postfix, while outgoing connections are sped up by eliminating unnecessary checks. > This is an area that I have shied away from in the past, as our server > is managed, so my knowledge of how to tune the filesystem is very > sketchy! I would say that the server is quite busy with lots of > database accesses, and there have been no noticable filesystem > performance problems in the past, no matter how much load I have put on. 
> It also had no problem dealing with virus scanning and bouncing 10 > incoming Slapper viruses per second last month while running the rest of > the stuff I have. Problem is, filesystem problems with synchronous meta-data issues (as typically plague most mail servers) usually don't *appear* to be filesystem problems. Instead, they appear to be things like not having enough RAM -- you get too many programs stacked up in memory, and you page/swap yourself to death. In reality, the problem is that too many processes are bottlenecking on trying to create/delete too many temporary files in a particular directory structure, thus causing them all to slow down and stack up. There are a variety of other ways that filesystem synchronous meta-data issues will manifest themselves, but it's almost always in a manner that would lead you to think of the filesystem last. Of course, the filesystem is one of the first things you should look at in cases like this. This is the reason why I asked the questions regarding the OS, how many disks, how the filesystem is configured, etc.... There are a lot of things you can do to speed these processes up, if you have enough information and know what you're doing. I'm trying to get that additional information. > If this was indeed a valid thing to do, then it might also be OK to > delay Save() operations so that they didn't occur so often (i.e. every N > operations, when idle, or on the final release of the mlist). As I > said, it APPEARS that the caching of the lists in Runner.py is intended > to work this way, but I don't know enough about it to say for sure! I can't speak to the way the Python code runs. I can say that I've got lots of experience doing filesystem tuning for mail systems, and I know that there are a myriads of ways that this kind of problem can masquerade as something else. -- Brad Knowles, "They that can give up essential liberty to obtain a little temporary safety deserve neither liberty nor safety." 
-Benjamin Franklin, Historical Review of Pennsylvania. GCS/IT d+(-) s:+(++)>: a C++(+++)$ UMBSHI++++$ P+>++ L+ !E-(---) W+++(--) N+ !w--- O- M++ V PS++(+++) PE- Y+(++) PGP>+++ t+(+++) 5++(+++) X++(+++) R+(+++) tv+(+++) b+(++++) DI+(++++) D+(++) G+(++++) e++>++++ h--- r---(+++)* z(+++) From PieterB at gewis.nl Wed Oct 15 18:09:08 2003 From: PieterB at gewis.nl (PieterB) Date: Wed Oct 15 18:09:11 2003 Subject: [Mailman-Developers] spam filtering. In-Reply-To: <3F8D6753.8090600@rice.edu>; from mlucas@rice.edu on Wed, Oct 15, 2003 at 10:27:15AM -0500 References: <3F8D6753.8090600@rice.edu> Message-ID: <20031016000908.A61058@gewis.win.tue.nl> On Wed, Oct 15, 2003 at 10:27:15AM -0500, Mike Lucas wrote: > We have spam tagging on our mail that puts a tag in in > the subject line of ****spam**** and we want to have mailman see that > tag and delete the message instead of holding it for moderation. Am I > missing a setting? Is this possible? Or do we have to hack the source > code? Read http://www.daa.com.au/~james/articles/mailman-spamassassin/ and/or http://www.python.org/cgi-bin/faqw-mm.py?req=show&file=faq04.023.htp I use a somewhat other approach. I forward all mail to a seperate user id, and then use procmail and spamassassin to filter out spam mail. The procmail script forwards non-spam to mailman. I use James H. spam integration for messages which might be spam and should be moderated by hand. I plan to make and updated version of James Henstridge's spam integration: a) use mailheaders instead of piping the message to spamassassin to SpamAssassin.py b) make 'discard' the default option for messages tagged as 'probably spam'. Regards, Pieter -- There are no winners in life: Only survivors. 
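As a rough illustration of the header-based approach discussed in this thread, the check itself is small. This is a standalone sketch, not Mailman's handler API and not James Henstridge's integration: the function name `classify` and the "discard"/"accept" labels are made up here. The subject tag is the one quoted earlier in the thread, and `X-Spam-Flag: YES` is the header SpamAssassin adds to messages it considers spam.

```python
import email

SUBJECT_TAG = "****spam****"  # subject tag quoted in the thread


def classify(raw_message, subject_tag=SUBJECT_TAG):
    """Return "discard" for messages already tagged as spam, else "accept"."""
    msg = email.message_from_string(raw_message)
    subject = msg.get("Subject", "")
    spam_flag = msg.get("X-Spam-Flag", "")

    # Tag inserted into the subject line by an upstream filter.
    if subject_tag.lower() in subject.lower():
        return "discard"
    # Header set by SpamAssassin on messages it scores as spam.
    if spam_flag.strip().upper() == "YES":
        return "discard"
    return "accept"


tagged = "Subject: ****SPAM**** cheap pills\n\nbuy now\n"
flagged = "Subject: hello\nX-Spam-Flag: YES\n\nbody\n"
clean = "Subject: Log file rollovers\n\nbody\n"
results = [classify(tagged), classify(flagged), classify(clean)]
```

Relying on headers that an earlier filter has already set keeps the list software from having to pipe every message through the spam checker a second time.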
From jae+python at jerhard.org Wed Oct 15 06:17:14 2003 From: jae+python at jerhard.org (=?ISO-8859-1?Q?J=FCrgen?= A.Erhard) Date: Thu Oct 16 07:25:04 2003 Subject: [Mailman-Developers] Bounce Processing / Membership Disabled comments References: Message-ID: Hi Mailmen, just got such a message from python-list: python-list-request> Your membership in the mailing list Python-list has been disabled due python-list-request> to excessive bounces The last bounce received from you was dated python-list-request> 14-Oct-2003. You will not get any more messages from this list until python-list-request> you re-enable your membership. You will receive 3 more reminders like python-list-request> this before your membership in the list is deleted. Please, give me (us) some way to look at a bounce message. I feel so... powerless. It feels Microsofty... or, worse, Apple-y (sorry Chuq ;-). "You don't need to look at a bounce message, you won't understand it anyway." Well, I think I'd do (or I wouldn't be subscribed to this list ;-) I *wanna* see the (or of the) bounce message! (/me stamps foot on ground) Especially since I seem to get quite a ton of mail on that address without a glitch. (Motivation: I've been unsub'd from most Debian lists a couple months ago when GMX (big German freemailer, an address of which I was subscribed to those lists with) had problems with their DNS: Debian's servers couldn't resolve gmx.net for a couple days (or maybe only hours). In *that* case, smartlist sent a copy of a bounce message, from which I could see the DNS problem clearly) Oh, and looking at the Bounce Processing page: since the admin can modify the bounce processing parameters, the "your sub has been disabled" should at least *hint* at what those parameters are for the list. 
What I envision: the user's web page (not the notification message, it might become too wordy) has a link under the "excessive bounces" text, giving a page with

a) the last bounce (without the body, which is largely irrelevant, I think)
b) the parameters of the list
c) actual numbers telling how many bounces have been received when.

Yes, lots of details, but not too much.

Don't know how hard this is to implement... took a peek at the code, but I haven't quite grasped the relation between Bouncer.py and Bouncers/*.

Oh, and while I'm at it: the default is 5.0 bounces. The doc is unclear: if there are hard bounces and soft bounces in one day, does the count rise by 1.5, or 1, or 0.5? I'd think it goes up 1 if any hard bounces have been received, and by 0.5 if only soft bounces came in. That right? Looking at Bouncer.py, it seems the first bounce of a day determines the increase. Though... I haven't seen evidence of soft bounces.

And, come to think of it, what does bounce_info_stale_after actually do? Does it discard all bounce info (set all counters to 0) when no bounce has happened for 7 days? Which would imply that I can collect 5 bounces over a long period of time, as long as no more than 7 days pass between bounces. Worst case would then be: 5 bounces in a month... holy basscrap, Bassman, tell me that I'm wrong! ;-)

Bye, J

--
Jürgen A. Erhard
Invasion! http://invasion.jerhard.org
There's an NDA in the FSF: Free Software FouNDAtion.
-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available Type: application/pgp-signature Size: 189 bytes Desc: not available Url : http://mail.python.org/pipermail/mailman-developers/attachments/20031015/373d36d8/attachment.bin From terri at zone12.com Thu Oct 16 12:27:05 2003 From: terri at zone12.com (Terri Oda) Date: Thu Oct 16 12:26:00 2003 Subject: [Mailman-Developers] Bounce Processing / Membership Disabled comments In-Reply-To: References: Message-ID: <20031016162705.GA1024@ostraya.zone12.com> On Wed, Oct 15, 2003 at 12:17:14PM +0200, J?rgen A.Erhard wrote: > Please, give me (us) some way to look at a bounce message. I feel > so... powerless. It feels Microsofty... or, worse, Apple-y (sorry > Chuq ;-). "You don't need to look at a bounce message, you won't > understand it anyway." Well, I think I'd do (or I wouldn't be > subscribed to this list ;-) Ooh, I like this idea. I frequently end up forwarding the triggering bounce notices to my users, and it'd be great if they could look them up themselves. Make sure this idea gets to the wiki so it isn't accidentally forgotten. Which reminds me, I had something else about bounce notices to add there... From marc_news at merlins.org Fri Oct 17 12:27:10 2003 From: marc_news at merlins.org (Marc MERLIN) Date: Fri Oct 17 12:27:24 2003 Subject: [Mailman-Developers] 'empty module name' error and shunting Message-ID: <20031017162710.GA15948@merlins.org> I just wanted to report that one of my mailman servers was also hit by this bug. All mails to the list were stopped until I applied the patch (i.e. the shunted messages weren't the problem, new test messages weren't going through) The bug is described here: http://www.mail-archive.com/mailman-developers@python.org/msg06288.html And this post has the patch: http://www.mail-archive.com/mailman-developers@python.org/msg06317.html After the patch, my list became alive again and all messages were succesfully unshunted. Thanks Nadim. 
Marc -- "A mouse is a device used to point at the xterm you want to type in" - A.S.R. Microsoft is to operating systems & security .... .... what McDonalds is to gourmet cooking Home page: http://marc.merlins.org/ | Finger marc_f@merlins.org for PGP key From barry at python.org Fri Oct 17 12:31:48 2003 From: barry at python.org (Barry Warsaw) Date: Fri Oct 17 12:31:55 2003 Subject: [Mailman-Developers] Re: 'empty module name' error and shunting In-Reply-To: <20031017162710.GA15948@merlins.org> References: <20031017162710.GA15948@merlins.org> Message-ID: <1066408308.18702.118.camel@anthem> On Fri, 2003-10-17 at 12:27, Marc MERLIN wrote: > I just wanted to report that one of my mailman servers was also hit > by this bug. > All mails to the list were stopped until I applied the patch (i.e. the > shunted messages weren't the problem, new test messages weren't going > through) > > The bug is described here: > http://www.mail-archive.com/mailman-developers@python.org/msg06288.html > > And this post has the patch: > http://www.mail-archive.com/mailman-developers@python.org/msg06317.html > > After the patch, my list became alive again and all messages were > succesfully unshunted. Thanks Nadim. Hey Marc, what version of Mailman are you running? I thought I'd fixed that (a different way) in 2.1.3. -Barry -------------- next part -------------- A non-text attachment was scrubbed... 
Name: not available Type: application/pgp-signature Size: 307 bytes Desc: This is a digitally signed message part Url : http://mail.python.org/pipermail/mailman-developers/attachments/20031017/cb11c759/attachment.bin From marc_news at merlins.org Fri Oct 17 12:34:50 2003 From: marc_news at merlins.org (Marc MERLIN) Date: Fri Oct 17 12:34:55 2003 Subject: [Mailman-Developers] Re: 'empty module name' error and shunting In-Reply-To: <1066408308.18702.118.camel@anthem> References: <20031017162710.GA15948@merlins.org> <1066408308.18702.118.camel@anthem> Message-ID: <20031017163450.GA16868@merlins.org> On Fri, Oct 17, 2003 at 12:31:48PM -0400, Barry Warsaw wrote: > On Fri, 2003-10-17 at 12:27, Marc MERLIN wrote: > > I just wanted to report that one of my mailman servers was also hit > > by this bug. > > All mails to the list were stopped until I applied the patch (i.e. the > > shunted messages weren't the problem, new test messages weren't going > > through) > > > > The bug is described here: > > http://www.mail-archive.com/mailman-developers@python.org/msg06288.html > > > > And this post has the patch: > > http://www.mail-archive.com/mailman-developers@python.org/msg06317.html > > > > After the patch, my list became alive again and all messages were > > succesfully unshunted. Thanks Nadim. > > Hey Marc, what version of Mailman are you running? I thought I'd fixed > that (a different way) in 2.1.3. Oops, forgot the most important. I upgraded to cvs (2.2a0) and the problem was still there. The patch applied to 2.2a0 and fixed the problem Marc -- "A mouse is a device used to point at the xterm you want to type in" - A.S.R. Microsoft is to operating systems & security .... .... what McDonalds is to gourmet cooking Home page: http://marc.merlins.org/ | Finger marc_f@merlins.org for PGP key -------------- next part -------------- A non-text attachment was scrubbed... 
Name: not available Type: application/pgp-signature Size: 307 bytes Desc: not available Url : http://mail.python.org/pipermail/mailman-developers/attachments/20031017/dac32357/attachment.bin From barry at python.org Fri Oct 17 12:41:39 2003 From: barry at python.org (Barry Warsaw) Date: Fri Oct 17 12:41:44 2003 Subject: [Mailman-Developers] Re: 'empty module name' error and shunting In-Reply-To: <20031017163450.GA16868@merlins.org> References: <20031017162710.GA15948@merlins.org> <1066408308.18702.118.camel@anthem> <20031017163450.GA16868@merlins.org> Message-ID: <1066408898.18702.121.camel@anthem> On Fri, 2003-10-17 at 12:34, Marc MERLIN wrote: > Oops, forgot the most important. > I upgraded to cvs (2.2a0) and the problem was still there. > > The patch applied to 2.2a0 and fixed the problem Okay, that's bad. ;) BTW, I should mention that if you're running your production servers from CVS, I /highly/ recommend you move to the Release_2_1-maint branch instead of the HEAD. 2.2a0 (CVS HEAD) will likely see increased instability as I start to work on the code. -Barry -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 307 bytes Desc: This is a digitally signed message part Url : http://mail.python.org/pipermail/mailman-developers/attachments/20031017/74dcfea1/attachment.bin From skip at pobox.com Fri Oct 17 15:15:17 2003 From: skip at pobox.com (Skip Montanaro) Date: Fri Oct 17 15:16:19 2003 Subject: [Mailman-Developers] Mailman/SpamBayes integration? Message-ID: <16272.16325.231011.430782@montanaro.dyndns.org> I seem to recall that one of the original motivations for SpamBayes was to support spam filtering within Mailman (or nearby, if not within). Has anyone taken steps to more tightly integrate the two aside from doing something like running sb_filter.py in front of Mailman's processing? 
I think SpamBayes is stable enough at this point to consider integrating the SpamBayes classifier and the web interface to the POP3 proxy with Mailman's engine and administrative interface. I know Greg Ward has done some interesting stuff running SpamBayes in front of many of the mailing lists on mail.python.org. While that's a good first step, I find it a bit incomplete because as a list admin. I can't do anything to control how incoming messages are scored. Ideally, I should be able to train the classifier on mails actually sent to (for example) python-help@python.org. I'd consider taking a look at the problem on my own, but I've never even peeked at the Mailman code and don't have any idea where to plug SpamBayes into it. I do know the SpamBayes code to a certain degree and would be happy to work with someone who is familiar with the guts of Mailman to integrate the two tools. Skip From barry at python.org Fri Oct 17 15:19:45 2003 From: barry at python.org (Barry Warsaw) Date: Fri Oct 17 15:19:51 2003 Subject: [Mailman-Developers] Re: [spambayes-dev] Mailman/SpamBayes integration? In-Reply-To: <16272.16325.231011.430782@montanaro.dyndns.org> References: <16272.16325.231011.430782@montanaro.dyndns.org> Message-ID: <1066418385.18702.146.camel@anthem> On Fri, 2003-10-17 at 15:15, Skip Montanaro wrote: > I'd consider taking a look at the problem on my own, but I've never even > peeked at the Mailman code and don't have any idea where to plug SpamBayes > into it. I do know the SpamBayes code to a certain degree and would be > happy to work with someone who is familiar with the guts of Mailman to > integrate the two tools. I have a patch on SF that I wrote back in January to integrate the two. It was mostly a proof-of-concept kind of thing, but would probably serve well as a basis for a real Mailman 2.2 feature. Simone Piunno was doing some more development on the patch recently. -Barry -------------- next part -------------- A non-text attachment was scrubbed... 
Name: not available Type: application/pgp-signature Size: 307 bytes Desc: This is a digitally signed message part Url : http://mail.python.org/pipermail/mailman-developers/attachments/20031017/34fa4ee4/attachment.bin From Daniel.Buchmann at bibsys.no Fri Oct 17 16:34:34 2003 From: Daniel.Buchmann at bibsys.no (Daniel Buchmann) Date: Fri Oct 17 16:38:13 2003 Subject: A bug (was: [Mailman-Developers] Re: 'empty module name' error and shunting) In-Reply-To: <1066408898.18702.121.camel@anthem> References: <20031017162710.GA15948@merlins.org> <1066408308.18702.118.camel@anthem> <20031017163450.GA16868@merlins.org> <1066408898.18702.121.camel@anthem> Message-ID: <1066422874.2964.5.camel@fornax.hjemme.bibsys.no> I was just about to report this, so here goes... On Fri, 2003-10-17 at 18:41, Barry Warsaw wrote: > On Fri, 2003-10-17 at 12:34, Marc MERLIN wrote: > > > Oops, forgot the most important. > > I upgraded to cvs (2.2a0) and the problem was still there. > > > > The patch applied to 2.2a0 and fixed the problem > > Okay, that's bad. ;) > > BTW, I should mention that if you're running your production servers > from CVS, I /highly/ recommend you move to the Release_2_1-maint branch Unfortunately, there is a bug when upgrading lists (at least from MM 2.1.2) to the Release_2_1-maint branch. It seems to be caused by the latest SYNC_AFTER_WRITE patch: Upgrading from version 0x20102f0 to 0x20103f0 getting rid of old source files Updating mailing list: kpf Updating the held requests database. - updating old private mbox file - updating old public mbox file Traceback (most recent call last): File "bin/update", line 570, in ? 
errors = main() File "bin/update", line 447, in main errors = errors + dolist(listname) File "bin/update", line 334, in dolist mlist.Save() File "/home/mailman/Mailman/MailList.py", line 522, in Save self.__save(dict) File "/home/mailman/Mailman/MailList.py", line 481, in __save if mm_cfg.SYNC_AFTER_WRITE: AttributeError: 'module' object has no attribute 'SYNC_AFTER_WRITE' make: *** [update] Error 1 After upgrading to Release_2_1-maint, none of my lists would save, so I had to use the Release_2_1_3 branch instead. :) > instead of the HEAD. 2.2a0 (CVS HEAD) will likely see increased > instability as I start to work on the code. > > -Barry > -Daniel -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 232 bytes Desc: This is a digitally signed message part Url : http://mail.python.org/pipermail/mailman-developers/attachments/20031017/20f1fb48/attachment.bin From barry at python.org Fri Oct 17 16:46:10 2003 From: barry at python.org (Barry Warsaw) Date: Fri Oct 17 16:46:17 2003 Subject: A bug (was: [Mailman-Developers] Re: 'empty module name' error and shunting) In-Reply-To: <1066422874.2964.5.camel@fornax.hjemme.bibsys.no> References: <20031017162710.GA15948@merlins.org> <1066408308.18702.118.camel@anthem> <20031017163450.GA16868@merlins.org> <1066408898.18702.121.camel@anthem> <1066422874.2964.5.camel@fornax.hjemme.bibsys.no> Message-ID: <1066423569.18702.148.camel@anthem> On Fri, 2003-10-17 at 16:34, Daniel Buchmann wrote: > Unfortunately, there is a bug when upgrading lists (at least from MM > 2.1.2) to the Release_2_1-maint branch. > It seems to be caused by the latest SYNC_AFTER_WRITE patch: Whenever a file ending in .in changes (e.g. Defaults.py.in) you must at least re-run config.status to regenerate the right files. -Barry -------------- next part -------------- A non-text attachment was scrubbed... 
Name: not available Type: application/pgp-signature Size: 307 bytes Desc: This is a digitally signed message part Url : http://mail.python.org/pipermail/mailman-developers/attachments/20031017/91d5a858/attachment.bin From Daniel.Buchmann at bibsys.no Fri Oct 17 18:47:59 2003 From: Daniel.Buchmann at bibsys.no (Daniel Buchmann) Date: Fri Oct 17 18:52:53 2003 Subject: A bug (was: [Mailman-Developers] Re: 'empty module name' error and shunting) In-Reply-To: <1066423569.18702.148.camel@anthem> References: <20031017162710.GA15948@merlins.org> <20031017163450.GA16868@merlins.org> <1066408898.18702.121.camel@anthem> <1066422874.2964.5.camel@fornax.hjemme.bibsys.no> <1066423569.18702.148.camel@anthem> Message-ID: <1066430879.3013.157.camel@fornax.hjemme.bibsys.no> On Fri, 2003-10-17 at 22:46, Barry Warsaw wrote: > On Fri, 2003-10-17 at 16:34, Daniel Buchmann wrote: > > > Unfortunately, there is a bug when upgrading lists (at least from MM > > 2.1.2) to the Release_2_1-maint branch. > > It seems to be caused by the latest SYNC_AFTER_WRITE patch: > > Whenever a file ending in .in changes (e.g. Defaults.py.in) you must at > least re-run config.status to regenerate the right files. Oh, sorry, of course I have to.. *blush* /me hides in a corner. -Daniel -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 232 bytes Desc: This is a digitally signed message part Url : http://mail.python.org/pipermail/mailman-developers/attachments/20031018/702f215b/attachment.bin From bob at nleaudio.com Sat Oct 18 00:33:58 2003 From: bob at nleaudio.com (Bob Puff@NLE) Date: Sat Oct 18 00:34:01 2003 Subject: [Mailman-Developers] Re: [spambayes-dev] Mailman/SpamBayes integration? In-Reply-To: <1066418385.18702.146.camel@anthem> References: <16272.16325.231011.430782@montanaro.dyndns.org> <1066418385.18702.146.camel@anthem> Message-ID: <20031018043358.M17737@nleaudio.com> That reminds me... 
I have a couple lists that would benefit from TMDA-fronting them. Is there any info on how to do this (especially with Postfix)? Bob ---------- Original Message ----------- From: Barry Warsaw To: skip@pobox.com Sent: Fri, 17 Oct 2003 15:19:45 -0400 Subject: [Mailman-Developers] Re: [spambayes-dev] Mailman/SpamBayes integration? > On Fri, 2003-10-17 at 15:15, Skip Montanaro wrote: > > > I'd consider taking a look at the problem on my own, but I've never even > > peeked at the Mailman code and don't have any idea where to plug SpamBayes > > into it. I do know the SpamBayes code to a certain degree and would be > > happy to work with someone who is familiar with the guts of Mailman to > > integrate the two tools. > > I have a patch on SF that I wrote back in January to integrate the > two. It was mostly a proof-of-concept kind of thing, but would > probably serve well as a basis for a real Mailman 2.2 feature. > Simone Piunno was doing some more development on the patch recently. > > -Barry ------- End of Original Message ------- From shaikli at yahoo.com Sat Oct 18 18:04:25 2003 From: shaikli at yahoo.com (Nadim Shaikli) Date: Sat Oct 18 18:04:30 2003 Subject: [Mailman-Developers] Re: Bug tracker (was - 'empty module ...) In-Reply-To: <1066408308.18702.118.camel@anthem> Message-ID: <20031018220425.25680.qmail@web14902.mail.yahoo.com> --- Barry Warsaw wrote: > On Fri, 2003-10-17 at 12:27, Marc MERLIN wrote: > > I just wanted to report that one of my mailman servers was also hit > > by this bug. All mails to the list were stopped until I applied the > > patch (i.e. the shunted messages weren't the problem, new test messages > > weren't going through) > > > > The bug is described here: > > http://www.mail-archive.com/mailman-developers@python.org/msg06288.html > > Hey Marc, what version of Mailman are you running? I thought I'd fixed > that (a different way) in 2.1.3. The 'empty module name' bug and patch are noted in Mailman's bug tracking system as #796950. 
The bugs listed in, http://sourceforge.net/tracker/?group_id=103&atid=100103 are these bugs being looked into and/or inspected. There seem to be a plethora of bugs reported which have not been inspected and/or assigned and/or followedup/commented upon (which is rather unfortunate). Is this bug tracker being actively used by the developers ? - Nadim __________________________________ Do you Yahoo!? The New Yahoo! Shopping - with improved product search http://shopping.yahoo.com From barry at python.org Sat Oct 18 21:42:50 2003 From: barry at python.org (Barry Warsaw) Date: Sat Oct 18 21:43:01 2003 Subject: [Mailman-Developers] Re: Bug tracker (was - 'empty module ...) In-Reply-To: <20031018220425.25680.qmail@web14902.mail.yahoo.com> References: <20031018220425.25680.qmail@web14902.mail.yahoo.com> Message-ID: <1066527770.18702.184.camel@anthem> On Sat, 2003-10-18 at 18:04, Nadim Shaikli wrote: > Is this bug tracker being actively used by the developers ? Yes, or rather, it's a good thing they're all in the trackers. I've been very busy with other things lately, but I'm starting to find some time to get back into Mailman hacking. So having them in the trackers is crucial. -Barry -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 307 bytes Desc: This is a digitally signed message part Url : http://mail.python.org/pipermail/mailman-developers/attachments/20031018/dde9aa0e/attachment.bin From Dan at ohnesorg.cz Wed Oct 15 08:06:37 2003 From: Dan at ohnesorg.cz (Dan Ohnesorg) Date: Sun Oct 19 22:14:39 2003 Subject: [Mailman-Developers] Sending queued messages after a system down Message-ID: <20031015120637.GH1534@ohnesorg.cz> My user has noticed, that Mailman 2.1 sends messages in wrong order, when the messages are sent from queue. My host was down for some time, we have only accepted new messages into mailman queue, mailman process was not running. 
After starting Mailman with mailmanctl -s start, the newest messages are sent first, the oldest last.

This is not a big problem, but it can probably be fixed easily.

cheers
dan
-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: application/pgp-signature
Size: 189 bytes
Desc: not available
Url : http://mail.python.org/pipermail/mailman-developers/attachments/20031015/28da7629/attachment.bin

From mike at is.rice.edu Wed Oct 15 11:06:40 2003
From: mike at is.rice.edu (Mike Lucas)
Date: Sun Oct 19 22:14:42 2003
Subject: [Mailman-Developers] question about adding lists to a list.
Message-ID: <3F8D6280.9070403@is.rice.edu>

I have a list that has other lists subscribed to it, and these other lists send to more lists, which finally send to users. My problem is that these are all showing up to be moderated with this message:

"Blind carbon copies or other implicit destinations are not allowed. Try reposting your message by explicitly including the list address in the To: or Cc: fields."

We have an RE in the accept_these_nonmembers field that lets all nonmembers from our domain send to the lists, but when a top-level list sends to the other lists it must be using a BCC field.

Any info would help a lot.

Thanks,
Mike

From jwblist at olympus.net Mon Oct 20 13:09:47 2003
From: jwblist at olympus.net (John W. Baxter)
Date: Mon Oct 20 13:10:04 2003
Subject: [Mailman-Developers] Text for the mass subscribe upload a file option.
Message-ID:

In Mailman 2.1.1, the mass subscription page has this text under the textbox, introducing the "Choose File" button

    ....or specify a file to upload:

I just talked with a not-dumb list owner who selected an Excel file, producing a wonderful list of 22 non-deletable, non-modifiable entries, which had addresses jumbled together with strings like %0F%00%00.
After that conversation, I would suggest that the text read ....or specify a plain text file to upload: Fortunately, the list in question was new, so we elected to blow it away and try again. --John From aaraines at pobox.com Tue Oct 21 11:31:09 2003 From: aaraines at pobox.com (Andrew A. Raines) Date: Tue Oct 21 11:33:13 2003 Subject: [Mailman-Developers] qmail VERP patch changes Message-ID: I'm not sure if Colin still reads this list, but I had to make a couple of changes to the qmail VERP DELIVERY_MODULE on Sourceforge[1]. Namely, the default QVERP_FORMAT didn't make sense, but a docstring addition seemed warranted as well. A patch to Qmail.py is attached. -Drew Footnotes: [1] https://sourceforge.net/tracker/?func=detail&atid=300103&aid=645513&group_id=103 -------------- next part -------------- --- /tmp/Qmail-bad.py 2003-10-21 10:00:40.000000000 -0500 +++ /tmp/Qmail.py 2003-10-21 10:20:34.000000000 -0500 @@ -25,6 +25,8 @@ Set QMAIL_CMD = '/var/qmail/bin/qmail-inject' and DO_QMAIL_VERP = 1 in mm_cfg.py to enable this behaviour. You can also set QVERP_FORMAT to change the format of the VERP header before it's passed to qmail for interpolation. +You have to set QVERP_FORMAT to something; if you don't want to change the +default VERP format, set it to None. qmail-inject is unfortunatly more sensitive than Mailman about the format of messages passed to it, and will sometimes refuse to deliver one. If @@ -138,7 +140,7 @@ if mm_cfg.QVERP_FORMAT: qverp_format = mm_cfg.QVERP_FORMAT else: - qverp_format = '%(bounces)s-%(listhost)s-@[]' + qverp_format = '%(bounces)s-@%(listhost)s-@[]' bmailbox, bdomain = Utils.ParseEmail(envsender) d = {'bounces' : bmailbox, 'listhost': DOT.join(bdomain), From jon at latchkey.com Tue Oct 21 18:22:56 2003 From: jon at latchkey.com (Jon Scott Stevens) Date: Tue Oct 21 18:23:27 2003 Subject: [Mailman-Developers] Bug in mailman 2.1.3 Message-ID: Hello everyone. 
I just got the stack trace below from one of my list admins...not sure how he screwed things up...but I figured I would let you know...based on the traceback, it seems that he somehow got a strange character into the list of members... [share] 3:16pm ~ > ./bin/check_db -v sfindie-fence List: sfindie-fence /usr/local/mailman/lists/sfindie-fence/config.pck: okay /usr/local/mailman/lists/sfindie-fence/config.pck.last: okay /usr/local/mailman/lists/sfindie-fence/config.db: okay /usr/local/mailman/lists/sfindie-fence/config.db.last: okay [share] 3:17pm ~ > ./bin/list_members -u sfindie-fence [no output] [share] 3:18pm ~ > ./bin/list_members -i sfindie-fence ?knights_bishop@yahoo.com [share] 3:18pm ~ > ./bin/list_members -i sfindie-fence > bad.txt [share] 3:19pm ~ > ./bin/remove_members -f bad.txt sfindie-fence Yup...removing that address solved the problem. Note: this list admin only has access to the web interface. So, somehow he was able to send corrupt data to the server which caused this problem. Probably not a good thing. =) --------------------------------------------------------------------------- Bug in Mailman version 2.1.3 We're sorry, we hit a bug! If you would like to help us identify the problem, please email a copy of this page to the webmaster for this site with a description of what happened. Thanks! 
Traceback:

Traceback (most recent call last):
  File "/usr/local/mailman/scripts/driver", line 87, in run_main
    main()
  File "/usr/local/mailman/Mailman/Cgi/admin.py", line 192, in main
    show_results(mlist, doc, category, subcat, cgidata)
  File "/usr/local/mailman/Mailman/Cgi/admin.py", line 491, in show_results
    form.AddItem(membership_options(mlist, subcat, cgidata, doc, form))
  File "/usr/local/mailman/Mailman/Cgi/admin.py", line 799, in membership_options
    all = [_m.encode() for _m in mlist.getMembers()]
UnicodeError: ASCII decoding error: ordinal not in range(128)

From john at momsview.com Tue Oct 21 21:02:27 2003
From: john at momsview.com (John)
Date: Tue Oct 21 21:02:38 2003
Subject: [Mailman-Developers] Bounce processing observations 2.1.3 versus 2.1.1 for large lists
Message-ID: <03c001c39838$2da73830$0201a8c0@daddy>

Greetings:

I currently manage a mailman installation on a Redhat system with a total of 229,000 subs spread fairly evenly over 72 announce only lists (averaging 3k users each). Recently I upgraded from 2.1.1 to 2.1.3 primarily because of the fix for the cross site scripting bug but also for the bounce processing improvements.

Prior to the update I would only run BounceRunner every 8 hours because of the large CPU and I/O load it would put on my system (90% CPU, and LOTS of disk I/O).

So far it appears that the 2.1.3 bounce processing software is MUCH faster. In many cases it's able to process up to 15 bounces per SECOND. Fantastic. That is versus 6 bounces per second on 2.1.1 (This is a 2GHZ P4 with ATA100 IDE drives)

However, with this increased performance (no doubt due to the fact that BounceRunner registers MANY bounces for one list at a time) comes a problem best illustrated by the following excerpt from my bounce log:

Oct 20 12:29:54 2003 (3706) Processing 1211 queued bounces
Oct 20 12:33:04 2003 (3706) bouncingsubscriber1@isp.com: momsviewf current bounce score: 5.0
..... dozens and dozens more entries for the momsviewf list .....
Oct 20 12:36:10 2003 (3706) bouncingsubscriber2@isp2.com: momsviewf current bounce score: 2.0

If I understand how the processing is done correctly, was the momsviewf list indeed locked for a period of 3 minutes and 6 seconds? I noticed also that I could not access the momsviewf list via the web admin interface; it would hang on the admin password entry screen. The momsviewf list has 4374 subs.

I did not notice this issue on 2.1.1, probably since it releases the list lock after every bounce.

While increasing the performance of bounce processing significantly, the changes in 2.1.3 appear to have created a lock contention issue as a side effect.

My suggested fix, as I previously mentioned in my Jan 30, 2003 posting to this list (reproduced below), is to purposely LIMIT the number of bounces processed (preferably per list) so that the lock is released in a reasonable time period.

I would be interested in hearing other suggestions or observations about this issue.

Thanks

John
Co-webmaster momsview.com

---------------------------------------------------------------------------

Initialize x to number of bounces to process on each pass

While Forever
    Initialize Python list structure to hold bounces
    (Process x emails in the bounce queue)
    For x emails in queue
        Dequeue the message
        Extract addresses to bounce
        SAVE address and Listname in Python list structure
    If Python List structure contains emails
        For all mailing lists in Python structure
            REREAD list from disk
            LOCK the LIST
            For all addresses that bounced for this list
                Register Bounce
            SAVE the list to disk
            UNLOCK the list
    SLEEP for SLEEPTIME
CLEANUP on exit

Advantages to this method:

(1) We process a number of bounces before writing out the list reducing I/O (the real bottleneck) by factor x.
    When x is one the algorithm almost degenerates to the current method

(2) Since we always sleep on each pass it gives other processes (like the Web gui) a chance to read the list.

(3) By increasing x we control the number of bounces that get processed on each pass. The time it takes to extract the addresses gives other processes time to acquire the list lock and avoid "lockout"

(4) Since "in memory" bounce registration is very fast we can do a lot of them while the list is locked without adding significantly to the already long lock time on a big list (I believe the I/O is the limiting factor)

From lrosa at mail.hypertrek.info Wed Oct 22 00:31:37 2003
From: lrosa at mail.hypertrek.info (Luigi Rosa)
Date: Wed Oct 22 00:31:38 2003
Subject: [Mailman-Developers] Archive list by date without the date field
Message-ID: <30338289062.20031022063137@mail.hypertrek.info>

Hello,

selecting the date sort in the archive list, the message list that appears does not show the date field.

If I am searching a message with a known date, I have to open some messages to "guess" the date.

I think that the archive message list should contain the date, at least if the user selects to sort it by date.

--
Best regards,
Luigi

From chuqui at plaidworks.com Wed Oct 22 00:35:42 2003
From: chuqui at plaidworks.com (Chuq Von Rospach)
Date: Wed Oct 22 00:36:44 2003
Subject: [Mailman-Developers] oh, sigh.. It's back.
Message-ID: <3235F80A-0449-11D8-8CCB-0003934516A8@plaidworks.com>

Hey, barry? remember this? well, I just did a CVS update to take my system to 2.1.3, and... It's back.

make[1]: Nothing to be done for `all'.
/usr/bin/python ../build/bin/msgfmt.py -o cs/LC_MESSAGES/mailman.mo cs/LC_MESSAGES/mailman.po
Merging new template file with existing translations
msgmerge -U da/LC_MESSAGES/mailman.po mailman.pot
msgmerge: invalid option -- U
Try `msgmerge --help' for more information.
make[1]: *** [da/LC_MESSAGES/mailman.po] Error 1
make[1]: Nothing to be done for `all'.
7.050u 4.120s 0:11.85 94.2% 0+0k 0+27io 0pf+0w this is after: 232 21:19 cvs -q up -d -P -A 239 21:32 make distclean 240 21:32 ./configure --with-mail-gid=daemon 241 21:33 make From davidb at chelsea.net Wed Oct 22 08:50:14 2003 From: davidb at chelsea.net (David Birnbaum) Date: Wed Oct 22 08:50:18 2003 Subject: [Mailman-Developers] Welcome Message Override In-Reply-To: References: Message-ID: Folks, I have a user who wants to be able to replace the standard welcome message in it's entirety - some of the standard header sent out by Mailman doesn't apply in this case. I found these references: http://mail.python.org/pipermail/mailman-users/2002-April/018955.html http://mail.python.org/pipermail/mailman-users/2003-June/029676.html But I was wondering if a more "standard" method was around, or if it had been integrated more tightly recently. Thanks, David. From colinp at waikato.ac.nz Wed Oct 22 16:13:10 2003 From: colinp at waikato.ac.nz (Colin Palmer) Date: Wed Oct 22 16:13:18 2003 Subject: [Mailman-Developers] qmail VERP patch changes In-Reply-To: References: Message-ID: <1066853590.14170.77.camel@firefox.cc.waikato.ac.nz> On Wed, 2003-10-22 at 04:31, Andrew A. Raines wrote: > I'm not sure if Colin still reads this list, but I had to make a > couple of changes to the qmail VERP DELIVERY_MODULE on > Sourceforge[1]. Namely, the default QVERP_FORMAT didn't make > sense, but a docstring addition seemed warranted as well. A > patch to Qmail.py is attached. Thanks! I'll include that in the next version. 
--
Colin Palmer
University of Waikato, ITS Division

From tanner at real-time.com  Wed Oct 22 16:58:08 2003
From: tanner at real-time.com (Bob Tanner)
Date: Wed Oct 22 16:58:46 2003
Subject: [Mailman-Developers] ImportError: No module named korean revisted
Message-ID: <200310221558.08646@Twin.Cities.Linux.Users.Group-www.mn-linux.org>

% cat /etc/redhat-release
Red Hat Linux release 7.3 (Valhalla)

$ rpm -qa | grep python
python-popt-0.8.8-7.x.2
---->python2-devel-2.2.2-11.7.3
python-clap-1.0.0-3
python-xmlrpc-1.5.1-7.x.3
rpm-python-4.0.4-7x.18
python-devel-1.5.2-43.73
python-1.5.2-43.73
---->python2-2.2.2-11.7.3

$ pwd
/usr/src/redhat/BUILD/mailman-2.1.3

$ make install
Compiling /var/tmp/ZZZZ/Mailman/versions.py ...
Traceback (most recent call last):
  File "bin/update", line 45, in ?
    import paths
  File "bin/paths.py", line 59, in ?
    import korean
ImportError: No module named korean
make: *** [update] Error 1

I followed the thread here:

  http://mail.python.org/pipermail/mailman-users/2002-December/024814.html

It looks like this issue was "fixed", but I'm still getting it on a RH73
build.

I have -devel packages installed. Do I need a "newer" version of python?

--
Bob Tanner                                | Phone : (952)943-8700
http://www.mn-linux.org, Minnesota, Linux | Fax   : (952)943-8500
Key fingerprint = AB15 0BDF BCDE 4369 5B42 1973 7CF1 A709 2CC1 B288

From gsstark at mit.edu  Thu Oct 23 10:20:37 2003
From: gsstark at mit.edu (Greg Stark)
Date: Thu Oct 23 10:21:01 2003
Subject: [Mailman-Developers] Re: Bounce removal parameters default values
In-Reply-To: <87fzijv50a.fsf@athene.jamux.com>
References: <871y32qzae.fsf@stark.dyndns.tv> <87eky6kwlo.fsf@stark.dyndns.tv>
 <87isnfh817.fsf@stark.dyndns.tv> <1064591505.30783.28.camel@anthem>
 <87d6dnh51h.fsf@stark.dyndns.tv> <87fzijv50a.fsf@athene.jamux.com>
Message-ID: <87d6co2fyi.fsf@stark.dyndns.tv>

"John A.
Martin" writes:

> If any mail is rejected or bounced (ie, initially accepted for
> delivery but later a DSN is returned indicating a delivery failure)
> then that is a delivery failure. If you do not like what your
> receiving mail systems reject or bounce that is not a Mailman problem.

I like very much that the mail systems reject virus and worm mails. I
don't like that mailman extrapolates from that failure to assuming the
mailbox is broken and it should unsubscribe it. That's bogus.

Mailman should not take any such drastic action purely on the basis of a
bounce from a message with content it didn't control. It has no idea
*why* the message bounced and no idea whether it means future messages
will bounce or not.

--
greg

From Dale at Newfield.org  Thu Oct 23 10:38:22 2003
From: Dale at Newfield.org (Dale Newfield)
Date: Thu Oct 23 10:38:25 2003
Subject: [Mailman-Developers] Re: Bounce removal parameters default values
In-Reply-To: <87d6co2fyi.fsf@stark.dyndns.tv>
References: <871y32qzae.fsf@stark.dyndns.tv> <87eky6kwlo.fsf@stark.dyndns.tv>
 <87isnfh817.fsf@stark.dyndns.tv> <1064591505.30783.28.camel@anthem>
 <87d6dnh51h.fsf@stark.dyndns.tv> <87fzijv50a.fsf@athene.jamux.com>
 <87d6co2fyi.fsf@stark.dyndns.tv>
Message-ID:

On Thu, 23 Oct 2003, Greg Stark wrote:
> I like very much that the mail systems reject virus and worm mails.

That's silly. You should instead like very much that mail clients
weren't susceptible to such things and the delivery mechanism didn't
have to coddle the mail clients.

> Mailman should not take any such drastic action purely on the basis of a
> bounce

That's the whole point of bounce processing. A bounce signifies an
invalid email address.

If you don't want bounces to ever cause people to be removed from your
mailing lists, turn off bounce processing.

If you don't want real messages to get bounced, encourage people to use
mail clients that aren't so full of holes that the host mail system
needs to cause valid email addresses to bounce.
---
Dale Newfield

"They that can give up essential liberty to obtain a little safety
deserve neither liberty nor safety."
 - Benjamin Franklin, on the Statue of Liberty

From barry at python.org  Thu Oct 23 11:48:16 2003
From: barry at python.org (Barry Warsaw)
Date: Thu Oct 23 11:48:22 2003
Subject: [Mailman-Developers] Re: Bounce removal parameters default values
In-Reply-To: <87d6co2fyi.fsf@stark.dyndns.tv>
References: <871y32qzae.fsf@stark.dyndns.tv> <87eky6kwlo.fsf@stark.dyndns.tv>
 <87isnfh817.fsf@stark.dyndns.tv> <1064591505.30783.28.camel@anthem>
 <87d6dnh51h.fsf@stark.dyndns.tv> <87fzijv50a.fsf@athene.jamux.com>
 <87d6co2fyi.fsf@stark.dyndns.tv>
Message-ID: <1066924096.11634.144.camel@anthem>

On Thu, 2003-10-23 at 10:20, Greg Stark wrote:
> "John A. Martin" writes:
>
> > If any mail is rejected or bounced (ie, initially accepted for
> > delivery but later a DSN is returned indicating a delivery failure)
> > then that is a delivery failure. If you do not like what your
> > receiving mail systems reject or bounce that is not a Mailman problem.
>
> I like very much that the mail systems reject virus and worm mails. I don't
> like that mailman extrapolates from that failure to assuming the mailbox is
> broken and it should unsubscribe it. That's bogus.
>
> Mailman should not take any such drastic action purely on the basis of a
> bounce from a message with content it didn't control. It has no idea *why* the
> message bounced and no idea whether it means future messages will bounce or
> not.

I've been swamped, but I'll just quickly chime in that we've seen lots
of unintentional unsubs since moving python-list over to Mailman 2.1.3.
Unintentional means that the person's mailbox is still valid, and they
still want to be on the list, but they got disabled without
understanding why.

I consider it important to fix this for 2.1.4, although I haven't
decided how yet. One thing will be to include a bounce example with
re-enable notifications.
A second thing may be to send probes when the bounce threshold has been
reached, but I need to think more about the exact machinery for that and
whether that's appropriate for a patch release or not.

-Barry

-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: application/pgp-signature
Size: 307 bytes
Desc: This is a digitally signed message part
Url : http://mail.python.org/pipermail/mailman-developers/attachments/20031023/f5e4b122/attachment.bin

From gsstark at mit.edu  Thu Oct 23 13:41:49 2003
From: gsstark at mit.edu (Greg Stark)
Date: Thu Oct 23 13:42:01 2003
Subject: [Mailman-Developers] Re: Bounce removal parameters default values
In-Reply-To:
References: <871y32qzae.fsf@stark.dyndns.tv> <87eky6kwlo.fsf@stark.dyndns.tv>
 <87isnfh817.fsf@stark.dyndns.tv> <1064591505.30783.28.camel@anthem>
 <87d6dnh51h.fsf@stark.dyndns.tv> <87fzijv50a.fsf@athene.jamux.com>
 <87d6co2fyi.fsf@stark.dyndns.tv>
Message-ID: <871xt33l7m.fsf@stark.dyndns.tv>

Dale Newfield writes:

> On Thu, 23 Oct 2003, Greg Stark wrote:
> > I like very much that the mail systems reject virus and worm mails.
>
> That's silly. You should instead like very much that mail clients weren't
> susceptible to such things and the delivery mechanism didn't have to
> coddle the mail clients.
>
> > Mailman should not take any such drastic action purely on the basis of a
> > bounce
>
> That's the whole point of bounce processing. A bounce signifies an
> invalid email address.

No, bounces can mean various things. Anything from an overfull mailbox,
to a message that is too large or otherwise unacceptable. It could be a
temporary situation, or it could be because of the particular message.

In any case trusting a message provided from an outside source to serve
as a valid test violates security principles. What if i find a message
that causes postfix to core dump?
I can send it to the mailing list a few times in a row and cause every
subscriber of yours using postfix to be unsubscribed from your mailing
list.

> If you don't want bounces to ever cause people to be removed from your
> mailing lists, turn off bounce processing.

I'm not the list admin, I'm a poor hapless list subscriber. I get
unsubscribed from mailman mailing lists every few months due to this
behaviour.

I don't get unsubscribed from ezmlm lists because (as much as I dislike
qmail and ezmlm in general) this is one thing it gets right. If ezmlm
notices bounces of list messages it doesn't just unsubscribe you
summarily, it sends a message of its own with known content and format
and only unsubscribes you if that bounces. In fact it does a second
iteration of that, which is a good idea but doesn't really seem
necessary.

In fact if it weren't for ezmlm's handling of this I would never have
figured out why I kept getting dropped from mailman lists. I would have
always just assumed it as a bug with mailman.

> If you don't want real messages to get bounced

No real messages to me have ever been bounced to my knowledge.

> encourage people to use mail clients that aren't so full of holes that the
> host mail system needs to cause valid email addresses to bounce.

I would love an option to mailman to refuse subscriptions from a list of
blacklisted MUAs. I would recommend some lists exclude Outlook on
security concerns.

It wouldn't reduce the need for proper safe bounce handling. Trusting
bounces to messages of unknown content is simply unsafe.

--
greg

From jarrell at vt.edu  Fri Oct 24 14:41:37 2003
From: jarrell at vt.edu (Ron Jarrell)
Date: Fri Oct 24 14:43:26 2003
Subject: [Mailman-Developers] oh, sigh.. It's back.
In-Reply-To: <3235F80A-0449-11D8-8CCB-0003934516A8@plaidworks.com>
References: <3235F80A-0449-11D8-8CCB-0003934516A8@plaidworks.com>
Message-ID: <6.0.0.22.2.20031024144050.0312e788@lennier.cc.vt.edu>

Chuq, see the message I posted 10/8.
:-) Your gnu gettext utils are out of date; Barry's using a brand new
argument, the -U. I was running 0.10.35; after I upgraded to 0.12.1 I
could build again.

At 12:35 AM 10/22/2003, Chuq Von Rospach wrote:
>Hey, barry? remember this?
>015157.html>
>
>well, I just did a CVS update to take my system to 2.1.3, and...
>
>It's back.
>
>make[1]: Nothing to be done for `all'.
>/usr/bin/python ../build/bin/msgfmt.py -o cs/LC_MESSAGES/mailman.mo
>cs/LC_MESSAGES/mailman.po
>Merging new template file with existing translations
>msgmerge -U da/LC_MESSAGES/mailman.po mailman.pot
>msgmerge: invalid option -- U
>Try `msgmerge --help' for more information.
>make[1]: *** [da/LC_MESSAGES/mailman.po] Error 1
>make[1]: Nothing to be done for `all'.
>7.050u 4.120s 0:11.85 94.2% 0+0k 0+27io 0pf+0w
>
>this is after:
>
>   232  21:19   cvs -q up -d -P -A
>   239  21:32   make distclean
>   240  21:32   ./configure --with-mail-gid=daemon
>   241  21:33   make
>
>
>_______________________________________________
>Mailman-Developers mailing list
>Mailman-Developers@python.org
>http://mail.python.org/mailman/listinfo/mailman-developers

From john at momsview.com  Fri Oct 24 20:14:50 2003
From: john at momsview.com (John)
Date: Fri Oct 24 20:14:59 2003
Subject: [Mailman-Developers] Memory resource issue with new 2.1.3 BounceRunner
Message-ID: <098f01c39a8d$024a5d00$0201a8c0@daddy>

While testing 2.1.3 on a recent mailing I discovered that the
BounceRunner process had grown to 400 megabytes! This recent mailing
generated a very large number of bounces due to a problem I was having
with aol.com.

The new BounceRunner pre-processes all bounces by building an in-memory
dictionary with the bounce information. Unfortunately there are no
limits placed on the number of bounces that are added to this
dictionary. In this case, I received over 10,000 bounces which required
over 400 megabytes of RAM to hold.
Needless to say my machine slowed to a crawl and continued to add to the
structure since it couldn't empty the qfiles/bounces directory fast
enough.

I think we need a limit on the number of bounces added to this in-memory
structure regardless of the number of bounces in qfiles/bounces.

John Co
webmaster momsview.com

From john at momsview.com  Fri Oct 24 21:40:33 2003
From: john at momsview.com (John)
Date: Fri Oct 24 21:40:55 2003
Subject: [Mailman-Developers] ImportError: No module named korean revisted
Message-ID: <09d501c39a98$ff6fa840$0201a8c0@daddy>

This probably is a mailman-users issue...but Redhat calls the python
v2.2 executable python2 so you need the configure parameter like this

  make clean
  ./configure --with-python=/usr/bin/python2
  make install

From i.bapty at student.umist.ac.uk  Mon Oct 27 07:00:57 2003
From: i.bapty at student.umist.ac.uk (Iain Bapty)
Date: Mon Oct 27 07:00:45 2003
Subject: [Mailman-Developers] Requirements for a new archiver
Message-ID: <3F9D08F9.6090209@student.umist.ac.uk>

Hi,

For those of you that don't know I am currently working on an archive
component for Mailman as part of my degree. The interface to the archive
shall be based on the ideas in Ka-Ping Yee's paper on his Zest
prototype.

Over the past two weeks I have been looking at requirements and have the
following. These are in no specific order. Due to the time constraints
on my project (I am to only spend 200 hours in total on it, including
writing reports, presentations etc) there is a limit to the amount I can
do.

Functional Requirements

The archive component should

 1. store email discussions.
 2. integrate with Mailman.
 3. provide a web-based interface to those email-discussions.
 4. provide an interface that threads discussions by their content.
    (ZEST)
 5. provide an interface that threads discussions by e-mail replies.
 6. allow for full-text searching of the archives.
 7. allow for filtering by date, author, and/or topic.
 8. be MIME aware.
 9.
    allow archives to be set as public or private.
10. allow posts to be added, deleted, and modified through web
    interface.
11. allow archives to be locked to prevent modification.
12. allow postings to be emailed.
13. allow postings to be referenced externally.

Non-Functional Requirements

 1. Maintainable
 2. Secure
 3. Scalable

The minimum I am planning on doing is the first 5 functional
requirements restricted by the first 2 non-functional requirements.

There are two reasons I am posting this.

Is there anything obvious that I have missed?

Which of the functional requirements, 6 to 13, do you feel are the most
important? (As part of my report I have to analyse the requirements
captured)

Any feedback is very much appreciated. Thanks in advance.

Iain

From Dale at Newfield.org  Mon Oct 27 09:15:56 2003
From: Dale at Newfield.org (Dale Newfield)
Date: Mon Oct 27 09:16:19 2003
Subject: [Mailman-Developers] Requirements for a new archiver
In-Reply-To: <3F9D08F9.6090209@student.umist.ac.uk>
References: <3F9D08F9.6090209@student.umist.ac.uk>
Message-ID:

On Mon, 27 Oct 2003, Iain Bapty wrote:
> 6. allow for full-text searching of the archives.
> 7. allow for filtering by date, author, and/or topic.

> The minimum I am planning on doing is the first 5 functional
> requirements

> Which of the functional requirements, 6 to 13, do you feel are the most
> important? (As part of my report I have to analyse the requirements
> captured)

I find any archiver without at least 6 and likely 7 to be unusable, and
an incredible waste of the user's time.
-Dale

From gaclark at attglobal.net  Mon Oct 27 09:29:34 2003
From: gaclark at attglobal.net (Gregory Clark)
Date: Mon Oct 27 09:30:01 2003
Subject: [Mailman-Developers] Customize the Layout of Archives
Message-ID: <001301c39c96$c1500460$7708410c@us.oracle.com>

I have the following at my current web hosting account:

  MailMan Version 2.1.2
  Pipermail 0.09
  cPanel Version 7.4.2
  MySQL Version 4.0.15
  PHP Version 4.3.3

What do I need to do to control the look and feel of the Archives?
Specifically, I have private archival set on. I want to remove or
suppress the link to download the full raw archive. What do I need to
do?

Regards,
Greg

______________________________________________________________
Gregory A. Clark
Email: gaclark@attglobal.net
______________________________________________________________
The information contained in this communication is confidential and may
be legally privileged. It is intended solely for the use of the
individual or entity to whom it is addressed and others authorized to
receive it. If you are not the intended recipient, you are hereby
notified that any disclosure, copying, distribution or taking any action
in reliance of the contents of this information is strictly prohibited
and may be unlawful. Gregory A. Clark is neither liable for the
proper/complete transmission of the information contained in this
communication nor for any delay in its receipt.

From amk at amk.ca  Mon Oct 27 09:38:43 2003
From: amk at amk.ca (amk@amk.ca)
Date: Mon Oct 27 09:41:58 2003
Subject: [Mailman-Developers] Requirements for a new archiver
In-Reply-To: <3F9D08F9.6090209@student.umist.ac.uk>
References: <3F9D08F9.6090209@student.umist.ac.uk>
Message-ID: <20031027143843.GB28815@rogue.amk.ca>

On Mon, Oct 27, 2003 at 12:00:57PM +0000, Iain Bapty wrote:
> Any feedback is very much appreciated. Thanks in advance.

My requirement list is at http://www.amk.ca/ng-arch/ArchiverRequirements .
The ng-arch code is incomplete, but if it would be helpful (and you're
allowed to use it), let me know and I can send you a copy.

BTW, feel free to use either the ng-arch Wiki or mailing list for
purposes related to your project; both are pretty quiet, and your
project is certainly on-topic for both. If you use the Wiki, just don't
make extensive edits to any of my pages and create your own new pages,
in case I dust off the project.

--amk

From mcicogni at siosistemi.it  Mon Oct 27 13:00:49 2003
From: mcicogni at siosistemi.it (Mauro Cicognini)
Date: Mon Oct 27 13:03:37 2003
Subject: [Mailman-Developers] Re: Requirements for a new archiver
In-Reply-To:
References:
Message-ID: <3F9D5D51.1080501@siosistemi.it>

Iain wrote:

> 5. provide an interface that threads discussions by e-mail replies.

We might already have some Python code that does this (albeit in a
format quite different from Unix mailbox), based on message-IDs. Do you
think you may be interested in it?

> 8. be MIME aware.

I would strongly suggest that not being MIME-aware would strictly
confine your work to proof-of-concept status. I.e., it would scarcely be
useful in today's world. Even a simple HTML message would break it...
and we know there's plenty of those.

BR,
Mauro

From kmccann at bellanet.org  Mon Oct 27 15:06:30 2003
From: kmccann at bellanet.org (Kevin McCann)
Date: Mon Oct 27 15:06:56 2003
Subject: [Mailman-Developers] Requirements for a new archiver
In-Reply-To: <3F9D08F9.6090209@student.umist.ac.uk>
References: <3F9D08F9.6090209@student.umist.ac.uk>
Message-ID: <1067285190.16085.79.camel@localhost.localdomain>

Iain said:

> 1. store email discussions.

Iain,

To me, this is the single most important part. How do you intend to
store the messages?

Maybe others don't give a fig but I think that if archived messages were
to be stored in an easy-to-access database then life would be good. All
of the wonderful things that people want to do with message data would
be easy.
Which is why I'm looking at using the Mail::Box Perl package to either
read Pipermail mbox files or parse the messages from stdin via a dummy
subscriber and alias. Either way, get the message parts into a widely
implemented and simple-to-build-web-apps-with DB (my choice is MySQL). I
was thinking about using MHonarc to enhance the archive experience but
it doesn't work with MySQL directly so Mail::Box just might be what the
doctor ordered.

Maybe this direction is outside of what your scope is, but I'd still be
interested in how you intend to store messages.

Thanks,
Kevin

From barry at python.org  Mon Oct 27 15:12:29 2003
From: barry at python.org (Barry Warsaw)
Date: Mon Oct 27 15:12:36 2003
Subject: [Mailman-Developers] Requirements for a new archiver
In-Reply-To: <1067285190.16085.79.camel@localhost.localdomain>
References: <3F9D08F9.6090209@student.umist.ac.uk>
 <1067285190.16085.79.camel@localhost.localdomain>
Message-ID: <1067285548.1785.62.camel@anthem>

On Mon, 2003-10-27 at 15:06, Kevin McCann wrote:

> To me, this is the single most important part. How do you intend to
> store the messages?
>
> Maybe others don't give a fig but I think that if archived messages were
> to be stored in an easy-to-access database then life would be good.

I agree, although I don't know if I'd store everything in MySQL.

There are a couple of ways I could see slicing things. You could store
one message per file a la MH, with some elaboration to avoid inode
exhaustion. Or you could store everything in an mbox file with a file
offset index. Or perhaps store everything to an nntp server (Twisted
would make a nice platform for this ).

What would then be in the database would be records providing easy
lookup by message-id (at least) into the on-disk message store.

Also, I really want the next generation archiver to do everything
through cgi (or equivalent programmatic interface).
The ability to massage the messages on the way out to me outweighs the
benefits of vending messages directly from the file system.

-Barry

From kmccann at bellanet.org  Mon Oct 27 16:37:57 2003
From: kmccann at bellanet.org (Kevin McCann)
Date: Mon Oct 27 16:38:03 2003
Subject: [Mailman-Developers] Requirements for a new archiver
In-Reply-To: <1067285548.1785.62.camel@anthem>
References: <3F9D08F9.6090209@student.umist.ac.uk>
 <1067285190.16085.79.camel@localhost.localdomain>
 <1067285548.1785.62.camel@anthem>
Message-ID: <1067290677.16085.107.camel@localhost.localdomain>

On Mon, 2003-10-27 at 15:12, Barry Warsaw wrote:
> On Mon, 2003-10-27 at 15:06, Kevin McCann wrote:
>
> > To me, this is the single most important part. How do you intend to
> > store the messages?
> >
> > Maybe others don't give a fig but I think that if archived messages were
> > to be stored in an easy-to-access database then life would be good.
>
> I agree, although I don't know if I'd store everything in MySQL.

I'd love to have these database fields in a messages table at my
disposal:

  id (unique to system, not message-id)
  listname
  subject
  date
  from
  body
  message-id
  references
  mime_headers

This would make it very easy to build useful and flexible web apps. The
need is there. I can smell it. ;-)

Bottom line, the easier you make access to all of the little bits of a
message that are important in one way or another, the more widespread
development will be. And the faster we'll see really, really cool
mailing list-focused web apps that foster communication, collaboration
and community building, all for the betterment of mankind. :-)

- Kevin

>
> There are a couple of ways I could see slicing things. You could store
> one message per file a la MH, with some elaboration to avoid inode
> exhaustion. Or you could store everything in an mbox file with a file
> offset index. Or perhaps store everything to an nntp server (Twisted
> would make a nice platform for this ).
>
> What would then be in the database would be records providing easy
> lookup by message-id (at least) into the on-disk message store.
>
> Also, I really want the next generation archiver to do everything
> through cgi (or equivalent programmatic interface). The ability to
> massage the messages on the way out to me outweighs the benefits of
> vending messages directly from the file system.
>
> -Barry
>
>

From i.bapty at student.umist.ac.uk  Mon Oct 27 16:44:25 2003
From: i.bapty at student.umist.ac.uk (Iain Bapty)
Date: Mon Oct 27 16:44:19 2003
Subject: [Mailman-Developers] Requirements for a new archiver
In-Reply-To: <1067285548.1785.62.camel@anthem>
References: <3F9D08F9.6090209@student.umist.ac.uk>
 <1067285190.16085.79.camel@localhost.localdomain>
 <1067285548.1785.62.camel@anthem>
Message-ID: <3F9D91B9.2090109@student.umist.ac.uk>

Barry Warsaw wrote:

>On Mon, 2003-10-27 at 15:06, Kevin McCann wrote:
>
>>To me, this is the single most important part. How do you intend to
>>store the messages?
>>
>>
Undecided, I am only just starting the development stage now
(overlapping with the end of my requirements). This is a decision I will
have to make over the next two weeks and as I am relatively
inexperienced I shall be asking a lot of questions and doing lots of
research. I included it as a requirement, even though it is an obvious
one, so I can relate my design directly back to each requirement.

>>Maybe others don't give a fig but I think that if archived messages were
>>to be stored in an easy-to-access database then life would be good.
>>
>>
>
>I agree, although I don't know if I'd store everything in MySQL.
>
>
I have to explore as many of the options as time permits for my report.
Although I like the idea of being able to do an SQL style query based on
header information which would be stored as separate fields.

>There are a couple of ways I could see slicing things. You could store
>one message per file a la MH, with some elaboration to avoid inode
>exhaustion.
Or you could store everything in an mbox file with a file
>offset index. Or perhaps store everything to an nntp server (Twisted
>would make a nice platform for this ).
>
>
Twisted eh? I will have to look into that.

>Also, I really want the next generation archiver to do everything
>through cgi (or equivalent programmatic interface). The ability to
>massage the messages on the way out to me outweighs the benefits of
>vending messages directly from the file system.
>
This is where my ignorance shines, could you elaborate a bit on this
part please? By this, do you mean you want all queries to be setup and
executed by a user through the web interface? Why can't messages be
massaged from the file system?

Thanks
Iain

From barry at python.org  Mon Oct 27 16:54:39 2003
From: barry at python.org (Barry Warsaw)
Date: Mon Oct 27 16:54:50 2003
Subject: [Mailman-Developers] Requirements for a new archiver
In-Reply-To: <1067290677.16085.107.camel@localhost.localdomain>
References: <3F9D08F9.6090209@student.umist.ac.uk>
 <1067285190.16085.79.camel@localhost.localdomain>
 <1067285548.1785.62.camel@anthem>
 <1067290677.16085.107.camel@localhost.localdomain>
Message-ID: <1067291678.1785.78.camel@anthem>

On Mon, 2003-10-27 at 16:37, Kevin McCann wrote:

> I'd love to have these database fields in a messages table at my
> disposal:
>
> id (unique to system, not message-id)

How do we calculate this? It probably ought to be globally unique, or at
least locally unique to a Mailman installation. (Then again, what
happens if you move a list?) It probably also shouldn't have any usable
semantics -- i.e. be just an identifier. Maybe just a counter such as
"124.mailman-developers.python.org"

> listname
> subject
> date
> from
> body

This is the part I'm uncertain about. Is it better to store the body in
the table, or on disk, with an index pointer in the table?
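A minimal sketch of the "on disk, with an index pointer" option, for
illustration only. It assumes a plain mbox file in which body lines that
start with "From " are already escaped as ">From "; the names
build_offset_index and read_message are hypothetical and are not part of
Mailman or Pipermail. It is not the algorithm attributed to Andrew Koenig
in this message, which is not described here.

```python
def build_offset_index(mbox_path):
    """Map each Message-ID to the byte offset of its 'From ' separator.

    Assumption: every line starting with 'From ' is a message separator
    (body lines are escaped as '>From ').
    """
    index = {}
    offset = 0          # byte offset of the line currently being examined
    msg_start = None    # offset of the current message's separator line
    in_headers = False
    with open(mbox_path, "rb") as fp:
        for line in fp:
            if line.startswith(b"From "):
                msg_start = offset
                in_headers = True
            elif in_headers:
                if line in (b"\n", b"\r\n"):
                    # A blank line ends the header block.
                    in_headers = False
                elif line.lower().startswith(b"message-id:"):
                    value = line.split(b":", 1)[1].strip()
                    # Keep the first message seen for a given Message-ID.
                    index.setdefault(value.decode("ascii", "replace"),
                                     msg_start)
            offset += len(line)
    return index


def read_message(mbox_path, offset):
    """Return the raw bytes of the message whose separator is at offset."""
    chunks = []
    with open(mbox_path, "rb") as fp:
        fp.seek(offset)
        chunks.append(fp.readline())      # the 'From ' separator itself
        for line in fp:
            if line.startswith(b"From "):  # start of the next message
                break
            chunks.append(line)
    return b"".join(chunks)
```

With something like this, a database row would only need the Message-ID
(plus whatever headers are worth querying) and an integer offset, while
the message bodies stay in the mbox file. The index has to be rebuilt or
extended whenever the mbox file is rewritten.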
I was speaking with Andrew Koenig about something similar, and he said
he had a very fast algorithm for finding a message in an mbox file given
its message id.

> message-id

Which reminds me, I still want to revisit the "does Mailman have the
right to mess with the Message-ID" issue.

> references
> mime_headers

Why not all the headers?

-Barry

From barry at python.org  Mon Oct 27 17:03:38 2003
From: barry at python.org (Barry Warsaw)
Date: Mon Oct 27 17:03:45 2003
Subject: [Mailman-Developers] Requirements for a new archiver
In-Reply-To: <3F9D91B9.2090109@student.umist.ac.uk>
References: <3F9D08F9.6090209@student.umist.ac.uk>
 <1067285190.16085.79.camel@localhost.localdomain>
 <1067285548.1785.62.camel@anthem>
 <3F9D91B9.2090109@student.umist.ac.uk>
Message-ID: <1067292218.1785.88.camel@anthem>

On Mon, 2003-10-27 at 16:44, Iain Bapty wrote:

> Twisted eh? I will have to look into that.

Indeed. I'm using it in my Mailman3 experiments, and I think while
Twisted is a big package, it gives us a lot of bang for the buck.

> >Also, I really want the next generation archiver to do everything
> >through cgi (or equivalent programmatic interface). The ability to
> >massage the messages on the way out to me outweighs the benefits of
> >vending messages directly from the file system.
> >
> This is where my ignorance shines, could you elaborate a bit on this
> part please? By this, do you mean you want all queries to be setup and
> executed by a user through the web interface? Why can't messages be
> massaged from the file system?

In MM2 we made the conscious decision that public archives should be
vended from the file system. That's why when you read the archives of
this list through http://mail.python.org/pipermail/mailman-developers,
an Alias directive maps that directly to a file on the file system. We
were primarily concerned with the overhead of firing up a Python
interpreter, extra processes, etc. for every archive hit.
Note that private archives go through a cgi so they can enforce access
rules.

I think this was the right decision for the time. Chuq made some
convincing arguments that even public archive access should go through a
script. By generating the viewed archive message on the fly, from its
native source, we'd have all kinds of control over the presentation.
Such as: changing the address obfuscation rules on the fly, the ability
to retract or re-publish archive messages on the fly, more advanced
threading options, no artificial date divisions, the ability to change
the look and feel easily, etc.

With proper caching machinery and the use of more modern programmatic
fulfillment of web requests (e.g. mod_python, twisted, etc.), this
should be efficient enough.

-Barry

From PieterB at gewis.nl  Mon Oct 27 17:08:54 2003
From: PieterB at gewis.nl (PieterB)
Date: Mon Oct 27 17:09:00 2003
Subject: [Mailman-Developers] Requirements for a new archiver
In-Reply-To: <3F9D08F9.6090209@student.umist.ac.uk>; from i.bapty@student.umist.ac.uk on Mon, Oct 27, 2003 at 12:00:57PM +0000
References: <3F9D08F9.6090209@student.umist.ac.uk>
Message-ID: <20031027230854.A83988@gewis.win.tue.nl>

On Mon, Oct 27, 2003 at 12:00:57PM +0000, Iain Bapty wrote:
> Which of the functional requirements, 6 to 13, do you feel are the most
> important?

I think they are all quite basic. I think personally 10 and 12 are
'should-haves' in my opinion.

Also have a look at the "SMART Archiver" project,
http://sourceforge.net/projects/smartarchiver/

  A replacement for the standard GNU Mailman archiver that
  supports attachments, searching, date selection, message
  editing and more, requires a database such as postgressql
  (mysql support is coming in a future version)

> ...
> Is there anything obvious that I have missed?

Other requirements you might consider:
- db support for Zope, http://zope.org
- support for MIME attachments (PDF, Word, etc.)
- to be able to fix threading issues through the web (people starting a
  new subject by reply'ing on a previous post, fix threading for
  mailreaders that don't support proper "In-Reply-To" threading.
- Unix mbox output (based on the db). That would make it easy to
  upgrade, or to change to a different archiver.
- support for a 'view complete thread' (this would really be nice!)
- python based, because that would make it fit better with mailman and
  will make it easier to install
- easy to use api for adding new messages (e.g. use it as an archiver
  for wiki discussions, such as http://zwiki.org/GeneralDiscussion )
- be able to override message-view class (e.g. so that wiki's can add
  wiki linking or similar features to the messages)

Regards,

PieterB

--
If your next pot of chili tastes better, it probably is because of
something left out, rather than added.

From barry at python.org  Mon Oct 27 17:28:50 2003
From: barry at python.org (Barry Warsaw)
Date: Mon Oct 27 17:29:00 2003
Subject: [Mailman-Developers] Requirements for a new archiver
In-Reply-To: <20031027230854.A83988@gewis.win.tue.nl>
References: <3F9D08F9.6090209@student.umist.ac.uk>
 <20031027230854.A83988@gewis.win.tue.nl>
Message-ID: <1067293730.1785.96.camel@anthem>

On Mon, 2003-10-27 at 17:08, PieterB wrote:

> Also have a look at the "SMART Archiver" project,
> http://sourceforge.net/projects/smartarchiver/
>
> A replacement for the standard GNU Mailman archiver that
> supports attachments, searching, date selection, message
> editing and more, requires a database such as postgressql
> (mysql support is coming in a future version)

I didn't know about that one! FWIW, I think all this competition in
replacement archives is a good thing. What I really want though, is a
standard interface/API/protocol between Mailman and the archives.

Here's why: When Mailman decorates a message for copying to the list, I
want to be able to include a link to the archived message in the footer.
The problem is that there is little or no connection between the process doing the decoration and the process doing the archiving, and in fact the message may be posted to the list long before the archiver gets a crack at it. So I don't want to have to ask the archiver for that url. I want Mailman to be able to calculate it from something unique in the message, and have the archiver agree on the algorithm, so that it (or some other translation layer) can do the mapping back to the archived article. Or, Mailman should be able to calculate a unique id for the article and stuff that in a header for the archiver to index on. -Barry From PieterB at gewis.nl Mon Oct 27 17:46:54 2003 From: PieterB at gewis.nl (PieterB) Date: Mon Oct 27 17:47:03 2003 Subject: [Mailman-Developers] Requirements for a new archiver In-Reply-To: <1067293730.1785.96.camel@anthem>; from barry@python.org on Mon, Oct 27, 2003 at 05:28:50PM -0500 References: <3F9D08F9.6090209@student.umist.ac.uk> <20031027230854.A83988@gewis.win.tue.nl> <1067293730.1785.96.camel@anthem> Message-ID: <20031027234654.A85924@gewis.win.tue.nl> On Mon, Oct 27, 2003 at 05:28:50PM -0500, Barry Warsaw wrote: > > Also have a look at the "SMART Archiver" project, > > http://sourceforge.net/projects/smartarchiver/ > I didn't know about that one! It's a similar university project of the Eindhoven University of Technology. The project has just been finished and I assume all sources are/will be available. I saw the author upload the code to sf.net, and probably our host gewis.nl will host a demo environment in a couple of weeks. About coupling the archiver/mailinglist: > So I don't want to have to ask the archiver for that url. I want > Mailman to be able to calculate it from something unique in the message, > and have the archiver agree on the algorithm, so that it (or some other > translation layer) can do the mapping back to the archived article. 
Or, > Mailman should be able to calculate a unique id for the article and > stuff that in a header for the archiver to index on. Zwiki has implemented such functionality based on the time that the message is received/sent. E.g. a mailout for a webpost at the http://zwiki.org/GeneralDiscussion looks like this in the e-mail: (look at the generated signature, with a hyperlink to the message anchor) > There is a discussion on the mailman-developers list on the > requirements of an archiver: See: > http://news.gmane.org/gmane.mail.mailman.devel or my post at: > http://article.gmane.org/gmane.mail.mailman.devel/14954 > -- > forwarded from http://zwiki.org/GeneralDiscussion#msg20031027142214-0800@zwiki.org Off course, in this case the msgid, doesn't have to be shared between the archiver and mailinglist, because zwiki does both in one application. Regards, Pieter cc: mailman-developers lists, zwiki GeneralDiscussion -- When a broken appliance is demonstrated for the repairman, it will work perfectly. From barry at python.org Mon Oct 27 18:28:16 2003 From: barry at python.org (Barry Warsaw) Date: Mon Oct 27 18:28:20 2003 Subject: [Mailman-Developers] Requirements for a new archiver In-Reply-To: <20031027234654.A85924@gewis.win.tue.nl> References: <3F9D08F9.6090209@student.umist.ac.uk> <20031027230854.A83988@gewis.win.tue.nl> <1067293730.1785.96.camel@anthem> <20031027234654.A85924@gewis.win.tue.nl> Message-ID: <1067297296.1066.15.camel@anthem> On Mon, 2003-10-27 at 17:46, PieterB wrote: > It's a similar university project of the Eindhoven University of > Technology. The project has just been finished and I assume all > sources are/will be available. I saw the author upload the code to > sf.net, and probably our host gewis.nl will host a demo environment > in a couple of weeks. Cool! > About coupling the archiver/mailinglist: > > > So I don't want to have to ask the archiver for that url. 
I want > > Mailman to be able to calculate it from something unique in the message, > > and have the archiver agree on the algorithm, so that it (or some other > > translation layer) can do the mapping back to the archived article. Or, > > Mailman should be able to calculate a unique id for the article and > > stuff that in a header for the archiver to index on. > > Zwiki has implemented such functionality based on the time that the > message is received/sent. E.g. a mailout for a webpost at the > http://zwiki.org/GeneralDiscussion looks like this in the e-mail: > (look at the generated signature, with a hyperlink to the message > anchor) > > > There is a discussion on the mailman-developers list on the > > requirements of an archiver: See: > > http://news.gmane.org/gmane.mail.mailman.devel or my post at: > > http://article.gmane.org/gmane.mail.mailman.devel/14954 > > -- > > forwarded from http://zwiki.org/GeneralDiscussion#msg20031027142214-0800@zwiki.org > > Off course, in this case the msgid, doesn't have to be shared between > the archiver and mailinglist, because zwiki does both in one > application. That's not bad (probably better than the sha hexdigest I'm usually so fond of :), but yep we need to agree on a way to pass that information to the archiver. Mailman does add a unique header specifying the time of arrival, but I suggest a special X- header that Mailman can insert and the archiver can read. Anybody know of any prior art here? 
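A minimal sketch of the header idea, in Python since that is what Mailman is written in. It assumes the archive id is a SHA-1 hexdigest of the Message-ID. The header name X-Archive-Id, the base URL, and the URL layout are all hypothetical, not anything Mailman or an existing archiver defines. The point is only that both sides can compute the same id without asking each other:

```python
import hashlib

# Hypothetical names, chosen only for illustration.
ARCHIVE_BASE = "http://lists.example.org/archive"
ARCHIVE_HEADER = "X-Archive-Id"


def archive_id(msg):
    """Derive a stable id from the Message-ID of an email.message.Message."""
    msgid = (msg["Message-ID"] or "").strip()
    if not msgid:
        raise ValueError("cannot derive an archive id without a Message-ID")
    return hashlib.sha1(msgid.encode("utf-8")).hexdigest()


def stamp(msg):
    """List-manager side: store the id in a header for the archiver to read."""
    aid = archive_id(msg)
    del msg[ARCHIVE_HEADER]        # no error if the header is absent
    msg[ARCHIVE_HEADER] = aid
    return aid


def archive_url(listname, aid):
    """Either side: map an id to the (assumed) URL of the archived article."""
    return "%s/%s/%s.html" % (ARCHIVE_BASE, listname, aid)
```

With something like this, the footer link can be written at decoration time, before the archiver has seen the message, as long as the archiver files the article under the same id.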
-Barry From chuqui at plaidworks.com Mon Oct 27 18:33:15 2003 From: chuqui at plaidworks.com (Chuq Von Rospach) Date: Mon Oct 27 18:33:23 2003 Subject: [Mailman-Developers] Requirements for a new archiver In-Reply-To: <1067292218.1785.88.camel@anthem> References: <3F9D08F9.6090209@student.umist.ac.uk> <1067285190.16085.79.camel@localhost.localdomain> <1067285548.1785.62.camel@anthem> <3F9D91B9.2090109@student.umist.ac.uk> <1067292218.1785.88.camel@anthem> Message-ID: and I'm working on an update of that based on some new ideas I have. stay tuned. (but don't hold your breath, not these days...) FWIW, I vote for storing it in a database. By using MyISAM files and splitting on listname/time, you can build lots of smaller files and use merge tables to dynamically throw them together as needed, without building really bloody huge tables. a nice compromise, but you get all sorts of fun stuff that way, easy dynamic indexing, some usable search engine stuff, etc.... On Oct 27, 2003, at 2:03 PM, Barry Warsaw wrote: > Chuq made some convincing arguments that even public archive access > should go through a script. By generating the viewed archive message > on > the fly, from its native source, we'd have all kinds of control over > the > presentation. Such as: changing the address obfuscation rules on the > fly, the ability to retract or re-publish archive messages on the fly, > more advanced threading options, no artificial date divisions, the > ability to change the look and feel easily, etc. With proper caching > machinery and the use of more modern programmatic fulfillment of web > requests (e.g. mod_python, twisted, etc.), this should be efficient > enough. 
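One of the controls listed in the quoted paragraph, changing the address obfuscation rule at view time, can be sketched roughly as follows. This is an assumption-laden illustration: the function names and the simplified address pattern are made up here, and a real archiver would need a more careful matcher.

```python
import re

# Simplified address pattern; deliberately not a full RFC 2822 matcher.
ADDRESS = re.compile(r"([A-Za-z0-9._+-]+)@([A-Za-z0-9.-]+\.[A-Za-z]{2,})")


def obfuscate_at(text):
    """Rewrite user@host as 'user at host'."""
    return ADDRESS.sub(r"\1 at \2", text)


def obfuscate_hide(text):
    """Remove addresses entirely."""
    return ADDRESS.sub("[address hidden]", text)


def render(body, rule=obfuscate_at):
    """Apply the currently configured rule when a message is requested."""
    return rule(body)
```

Because the rule is applied on the way out, switching from one rule to the other takes effect for every archived message without regenerating anything on disk.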
From i.bapty at student.umist.ac.uk Mon Oct 27 18:38:09 2003 From: i.bapty at student.umist.ac.uk (Iain Bapty) Date: Mon Oct 27 18:37:51 2003 Subject: [Mailman-Developers] Requirements for a new archiver In-Reply-To: <20031027230854.A83988@gewis.win.tue.nl> References: <3F9D08F9.6090209@student.umist.ac.uk> <20031027230854.A83988@gewis.win.tue.nl> Message-ID: <3F9DAC61.1050805@student.umist.ac.uk> PieterB wrote: >I think they are all quite basic. I think personally 10 and 12 are >'should-haves' in my opinion. > > I may have underestimated the time it will take me to implement them, but I am quite inexperienced in projects of this nature. The closest I have done is some ASP.NET and VB.NET front ends to MS SQL databases as part of my summer job. If I get to the stage where I can implement more than those requirements and I have the time, then I may. >Also have a look at the "SMART Archiver" project, >http://sourceforge.net/projects/smartarchiver/ > > A replacement for the standard GNU Mailman archiver that > supports attachments, searching, date selection, message > editing and more, requires a database such as postgressql > (mysql support is coming in a future version) > > I will evaluate this as part of my candidate re-use components analysis in my report. I was hoping that I would be the first to actually put together a new archiver for Mailman, oh well. >Other requirements you might consider: > >- db support for Zope, http://zope.org > > I may do this depending on my design decisions. >- Unix mbox output (based on the db). That would make it easy to >upgrade, or to change to a different archiver. > > If I achieve my non-functional requirements this should be fairly straightforward to implement. >- support for a 'view complete thread' (this would really be nice!) > > I'm not sure I understand what you mean by this.
The type of interface I am aiming for can be seen in Ka-Ping Yee's Zest prototype at http://www.zesty.ca/zest >- python based, because that would make it fit better with mailman >and will make it easier to install > > Definately. >- easy to use api for adding new messages (e.g. use it as an archiver >for wiki discussions, such as http://zwiki.org/GeneralDiscussion ) > > From what Barry Warsaw has told me, Mailman supports external archivers that provide a command-line client. Not exactly an API, but if I choose to use this then it would be fairly straightforward to adapt the archiver for other uses. Thanks a lot for your feedback and the SMART Archiver link. Iain From tkikuchi at is.kochi-u.ac.jp Mon Oct 27 18:42:24 2003 From: tkikuchi at is.kochi-u.ac.jp (Tokio Kikuchi) Date: Mon Oct 27 18:42:56 2003 Subject: [Mailman-Developers] Requirements for a new archiver In-Reply-To: <3F9D08F9.6090209@student.umist.ac.uk> References: <3F9D08F9.6090209@student.umist.ac.uk> Message-ID: <3F9DAD60.1020005@is.kochi-u.ac.jp> Hi, I should add > 8. be MIME aware. 8'. be I18N. Cheers, -- Tokio Kikuchi, tkikuchi@ is.kochi-u.ac.jp http://weather.is.kochi-u.ac.jp/ From barry at python.org Mon Oct 27 18:47:05 2003 From: barry at python.org (Barry Warsaw) Date: Mon Oct 27 18:47:09 2003 Subject: [Mailman-Developers] Requirements for a new archiver In-Reply-To: References: <3F9D08F9.6090209@student.umist.ac.uk> <1067285190.16085.79.camel@localhost.localdomain> <1067285548.1785.62.camel@anthem> <3F9D91B9.2090109@student.umist.ac.uk> <1067292218.1785.88.camel@anthem> Message-ID: <1067298425.1066.29.camel@anthem> On Mon, 2003-10-27 at 18:33, Chuq Von Rospach wrote: > and I'm working on an update of that based on some new ideas I have. > stay tuned. (but don't hold your breath, not these days...) I can imagine, what with G5's, Windows iTunes and Panther. :) > FWIW, I vote for storing it in a database. 
By using MyISAM files and > splitting on listname/time, you can build lots of smaller files and use > merge tables to dynamically throw them together as needed, without > building really bloody huge tables. a nice compromise, but you get all > sorts of fun stuff that way, easy dynamic indexing, some usable search > engine stuff, etc... MyISAM tables aren't transactional. Would we care? Probably not for this application, but for my Mailman 3 experiments, I'm storing list and user data in transactional BerkeleyDB tables because I definitely think we want that extra safety. -Barry From chuqui at plaidworks.com Mon Oct 27 18:54:37 2003 From: chuqui at plaidworks.com (Chuq Von Rospach) Date: Mon Oct 27 18:54:42 2003 Subject: [Mailman-Developers] Requirements for a new archiver In-Reply-To: <1067298425.1066.29.camel@anthem> References: <3F9D08F9.6090209@student.umist.ac.uk> <1067285190.16085.79.camel@localhost.localdomain> <1067285548.1785.62.camel@anthem> <3F9D91B9.2090109@student.umist.ac.uk> <1067292218.1785.88.camel@anthem> <1067298425.1066.29.camel@anthem> Message-ID: On Oct 27, 2003, at 3:47 PM, Barry Warsaw wrote: > MyISAM tables aren't transactional. Would we care? Probably not for > this application, but for my Mailman 3 experiments, I'm storing list > and > user data in transactional BerkeleyDB tables because I definitely think > we want that extra safety. > very unlikely for archives. And with mySQL 4, you can use one of the newer formats with row locking and transactions. they do intermingle nicely. From dgc at uchicago.edu Mon Oct 27 19:02:47 2003 From: dgc at uchicago.edu (David Champion) Date: Mon Oct 27 19:03:11 2003 Subject: [Mailman-Developers] Re: Requirements for a new archiver In-Reply-To: <1067285548.1785.62.camel@anthem> References: <3F9D08F9.6090209@student.umist.ac.uk> <1067285190.16085.79.camel@localhost.localdomain> <1067285548.1785.62.camel@anthem> Message-ID: <20031028000247.GD5392@dust.uchicago.edu> > exhaustion. 
Or you could store everything in an mbox file with a file > offset index. Or perhaps store everything to an nntp server (Twisted > would make a nice platform for this ). > ... > Also, I really want the next generation archiver to do everything > through cgi (or equivalent programmatic interface). The ability to > massage the messages on the way out to me outweighs the benefits of > vending messages directly from the file system. Well, since you bring this up.... I've been giving this some thought over the last few weeks, since this latest fit of discussions about archivers cropped up. I've written up some code to address the problem to my satisfaction, along with a quick draft manifesto to explain myself. It's too long to inline here, but I put a copy on the web: http://home.uchicago.edu/~dgc/sw/mmimap/ Meanwhile, to cut to the chase: I decided IMAP is the way to handle this, and I've implemented what I need to provide it for both public and private lists. There are scripts to extract authentication material from Mailman, and an IMAP proxy daemon that performs authentication and sets up an environment to hand off to UW-IMAP. I've tested on our production server with a restricted set of users. No complaints, and all the testers approve of the approach. Our server needs an upgrade before it's powerful enough to do IMAP for 2000 lists (67,000 subscribers), but it's tentatively the way we plan to go. We probably won't enable HTML archival after the upgrade. We already have a webmail product in place, but if we didn't we could just plug that in on the list server to provide the HTTP access. I realize that IMAP isn't ideal for all sites or lists, but I think it should work well for our purposes, where lists are mostly institutional, and not so public that they need to be Googled. I'm hoping to get these materials better integrated and documented soon, maybe once I'm back from LISA. 
But in case anyone is interested in working with them, I've put them up on the web, linked from the above URL. If this were to be a standard solution rather than a local hack, it would probably need some refactoring for other IMAP daemons, for newer MM authenticators, etc. I'm sure I haven't done the best as can be done, and I'd certainly rather see IMAP access to archives be a standard component of (or interface to) list server software, but it's a pleasing start. -- -D. dgc@uchicago.edu University of Chicago > NSIT > VDN > ENSS > ENSA > You are here . . . . . . . always line up dots From barry at python.org Mon Oct 27 19:16:35 2003 From: barry at python.org (Barry Warsaw) Date: Mon Oct 27 19:16:40 2003 Subject: [Mailman-Developers] Re: Requirements for a new archiver In-Reply-To: <20031028000247.GD5392@dust.uchicago.edu> References: <3F9D08F9.6090209@student.umist.ac.uk> <1067285190.16085.79.camel@localhost.localdomain> <1067285548.1785.62.camel@anthem> <20031028000247.GD5392@dust.uchicago.edu> Message-ID: <1067300194.1066.39.camel@anthem> On Mon, 2003-10-27 at 19:02, David Champion wrote: > I'm hoping to get these materials better integrated and documented soon, > maybe once I'm back from LISA. But in case anyone is interested in > working with them, I've put them up on the web, linked from the above > URL. If this were to be a standard solution rather than a local hack, > it would probably need some refactoring for other IMAP daemons, for > newer MM authenticators, etc. I'm sure I haven't done the best as can be > done, and I'd certainly rather see IMAP access to archives be a standard > component of (or interface to) list server software, but it's a pleasing > start. One of the reasons why I'm so interested in Twisted for MM3 is so we can provide both IMAP and NNTP access to the message store, almost for free. Which does point to an alternative direction -- maybe we don't need any direct connection to an html archive. 
Maybe the archiver should just be a separate process that reads messages from the NNTP interface a MM3 might export. Just blue-skying here. -Barry From kmccann at bellanet.org Mon Oct 27 21:10:09 2003 From: kmccann at bellanet.org (Kevin McCann) Date: Mon Oct 27 20:54:28 2003 Subject: [Mailman-Developers] Requirements for a new archiver References: <3F9D08F9.6090209@student.umist.ac.uk><1067285190.16085.79.camel@localhost.localdomain><1067285548.1785.62.camel@anthem><3F9D91B9.2090109@student.umist.ac.uk><1067292218.1785.88.camel@anthem><1067298425.1066.29.camel@anthem> Message-ID: <00e901c39cf8$9cc34050$0501a8c0@kevinduron> Chuq said: > very unlikely for archives. And with mySQL 4, you can use one of the > newer formats with row locking and transactions. they do intermingle > nicely. Yes. MySQL can handle transactions just fine. For more info: http://www.mysql.com/doc/en/ANSI_diff_Transactions.html - Kevin From claw at kanga.nu Tue Oct 28 00:38:23 2003 From: claw at kanga.nu (J C Lawrence) Date: Tue Oct 28 00:38:30 2003 Subject: [Mailman-Developers] Requirements for a new archiver In-Reply-To: Message from Barry Warsaw of "Mon, 27 Oct 2003 15:12:29 EST." <1067285548.1785.62.camel@anthem> References: <3F9D08F9.6090209@student.umist.ac.uk> <1067285190.16085.79.camel@localhost.localdomain> <1067285548.1785.62.camel@anthem> Message-ID: <13752.1067319503@kanga.nu> On Mon, 27 Oct 2003 15:12:29 -0500 Barry Warsaw wrote: > On Mon, 2003-10-27 at 15:06, Kevin McCann wrote: > There are a couple of ways I could see slicing things. You could > store one message per file a la MH, with some elaboration to avoid > inode exhaustion. Or you could store everything in an mbox file with > a file offset index. Or perhaps store everything to an nntp server > (Twisted would make a nice platform for this ). It would take little to bolt the NNTP supports from Twisted into Mailman and then grab something like MeoWWW for the archive presentation. Its quite useful as-is. 
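For illustration, the mbox-plus-offset-index option quoted above might look roughly like this. It is a sketch under simplifying assumptions: the function names are made up, the index is a plain dict rather than database records, and any line starting with "From " after a blank line (or at the start of the file) is treated as a message separator.

```python
import email


def build_offset_index(path):
    """Map Message-ID -> (start, end) byte offsets inside an mbox file."""
    starts = []
    offset = 0
    prev_blank = True                      # start of file counts as a boundary
    with open(path, "rb") as fp:
        for line in fp:
            if prev_blank and line.startswith(b"From "):
                starts.append(offset)
            prev_blank = line in (b"\n", b"\r\n")
            offset += len(line)
    size = offset

    index = {}
    with open(path, "rb") as fp:
        for start, end in zip(starts, starts[1:] + [size]):
            fp.seek(start)
            raw = fp.read(end - start)
            # Drop the "From " separator line before parsing the headers.
            msg = email.message_from_bytes(raw.partition(b"\n")[2])
            msgid = msg["Message-ID"]
            if msgid:
                index[msgid.strip()] = (start, end)
    return index


def fetch(path, index, msgid):
    """Read one message back using only its recorded offsets."""
    start, end = index[msgid]
    with open(path, "rb") as fp:
        fp.seek(start)
        raw = fp.read(end - start)
    return email.message_from_bytes(raw.partition(b"\n")[2])
```

The database would then only need to hold the Message-ID and the two offsets per message, while the message text itself stays in the mbox file.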
> What would then be in the database would be records providing easy > lookup by message-id (at least) into the on-disk message store. Bingo, plus the option of opening the NNTP interface to external users as another list reading/posting method. Now Mailman can become an all-in-one lightweight archiving news and mailing list system for precious little expense. > Also, I really want the next generation archiver to do everything > through cgi (or equivalent programmatic interface). The ability to > massage the messages on the way out to me outweighs the benefits of > vending messages directly from the file system. Agreed. -- J C Lawrence ---------(*) Satan, oscillate my metallic sonatas. claw@kanga.nu He lived as a devil, eh? http://www.kanga.nu/~claw/ Evil is a name of a foeman, as I live. From claw at kanga.nu Tue Oct 28 00:41:54 2003 From: claw at kanga.nu (J C Lawrence) Date: Tue Oct 28 00:41:58 2003 Subject: [Mailman-Developers] Requirements for a new archiver In-Reply-To: Message from Barry Warsaw of "Mon, 27 Oct 2003 17:28:50 EST." <1067293730.1785.96.camel@anthem> References: <3F9D08F9.6090209@student.umist.ac.uk> <20031027230854.A83988@gewis.win.tue.nl> <1067293730.1785.96.camel@anthem> Message-ID: <17914.1067319714@kanga.nu> On Mon, 27 Oct 2003 17:28:50 -0500 Barry Warsaw wrote: > On Mon, 2003-10-27 at 17:08, PieterB wrote: > When Mailman decorates a message for copying to the list, I want to be > able to include a link to the archived message in the footer. The > problem is that there is little or no connection between the process > doing the decoration and the process doing the archiving, and in fact > the message may be posted to the list long before the archiver gets a > crack at it. If the URL is predictably based on the Message-ID this is not a problem. > So I don't want to have to ask the archiver for that url. 
I want > Mailman to be able to calculate it from something unique in the > message, and have the archiver agree on the algorithm, so that it (or > some other translation layer) can do the mapping back to the archived > article. Or, Mailman should be able to calculate a unique id for the > article and stuff that in a header for the archiver to index on. Quite, this is how/why NNTP uses Message-IDs as unique indexing qualifiers. -- J C Lawrence ---------(*) Satan, oscillate my metallic sonatas. claw@kanga.nu He lived as a devil, eh? http://www.kanga.nu/~claw/ Evil is a name of a foeman, as I live. From claw at kanga.nu Tue Oct 28 00:44:43 2003 From: claw at kanga.nu (J C Lawrence) Date: Tue Oct 28 00:44:47 2003 Subject: [Mailman-Developers] Re: Requirements for a new archiver In-Reply-To: Message from Barry Warsaw of "Mon, 27 Oct 2003 19:16:35 EST." <1067300194.1066.39.camel@anthem> References: <3F9D08F9.6090209@student.umist.ac.uk> <1067285190.16085.79.camel@localhost.localdomain> <1067285548.1785.62.camel@anthem> <20031028000247.GD5392@dust.uchicago.edu> <1067300194.1066.39.camel@anthem> Message-ID: <21426.1067319883@kanga.nu> On Mon, 27 Oct 2003 19:16:35 -0500 Barry Warsaw wrote: > Which does point to an alternative direction -- maybe we don't need > any direct connection to an html archive. Maybe the archiver should > just be a separate process that reads messages from the NNTP interface > a MM3 might export. Just blue-skying here. +1 -- J C Lawrence ---------(*) Satan, oscillate my metallic sonatas. claw@kanga.nu He lived as a devil, eh? http://www.kanga.nu/~claw/ Evil is a name of a foeman, as I live. From claw at kanga.nu Tue Oct 28 00:53:26 2003 From: claw at kanga.nu (J C Lawrence) Date: Tue Oct 28 00:53:30 2003 Subject: [Mailman-Developers] Re: Requirements for a new archiver In-Reply-To: Message from J C Lawrence of "Tue, 28 Oct 2003 00:44:43 EST."
<21426.1067319883@kanga.nu> References: <3F9D08F9.6090209@student.umist.ac.uk> <1067285190.16085.79.camel@localhost.localdomain> <1067285548.1785.62.camel@anthem> <20031028000247.GD5392@dust.uchicago.edu> <1067300194.1066.39.camel@anthem> <21426.1067319883@kanga.nu> Message-ID: <30982.1067320406@kanga.nu> On Tue, 28 Oct 2003 00:44:43 -0500 J C Lawrence wrote: > On Mon, 27 Oct 2003 19:16:35 -0500 Barry Warsaw > wrote: >> Which does point to an alternative direction -- maybe we don't need >> any direct connection to an html archive. Maybe the archiver should >> just be a separate process that reads messages from the NNTP >> interface a MM3 might export. Just blue-skying here. > +1 More simply, I've come to want a clean abstraction in the work cases: storage, indexing, presentation. The three are different, largely unrelated, and I see little reason (none in my case) to even have them done by related or similar tools. As such the questions come out to: What's an efficient way to store messages? What's an efficient way to retrieve messages by a primary index? What's an efficient way to build other indexes to that primary key? How do I want to present the data? For me that came out to inn2, Message-IDs, We:Search, and a mix of a MeoWWW derivative (HTTP) and NNTP straight to inn2. The critical point is that each bit of that amalgamation is independent and can be trivially swapped out for something else that suits a local use case better. -- J C Lawrence ---------(*) Satan, oscillate my metallic sonatas. claw@kanga.nu He lived as a devil, eh? http://www.kanga.nu/~claw/ Evil is a name of a foeman, as I live. From i.bapty at student.umist.ac.uk Tue Oct 28 04:06:34 2003 From: i.bapty at student.umist.ac.uk (Iain Bapty) Date: Tue Oct 28 04:07:16 2003 Subject: [Mailman-Developers] Archiver Message-ID: <3F9E319A.3020203@student.umist.ac.uk> Hi, As you will probably know by my few posts. I had been intending on producing an archiver for my project.
Unfortunately, yesterday I learned about a project (SMART Archiver) that pretty much does everything I was intending to do, which makes me pretty redundant. I'm quite desperate, Barry suggested that I ask here if anyone can suggest any other work I could do for my project? My project has to be a substantial piece of individual work that actually produces some kind of software artifact. It should be produced independently, I shouldn't be reliant on anyone else in order to complete the project. It also has to be challenging, in order to get good marks I have to solve problems, it cannot just be something like bug fixing an existing program. I am meant to spend 200 hours on my project, including report writing and presentation writing. Thanks for any advice. Iain From PieterB at gewis.nl Tue Oct 28 04:42:23 2003 From: PieterB at gewis.nl (PieterB) Date: Tue Oct 28 04:42:27 2003 Subject: [Mailman-Developers] Archiver In-Reply-To: <3F9E319A.3020203@student.umist.ac.uk>; from i.bapty@student.umist.ac.uk on Tue, Oct 28, 2003 at 09:06:34AM +0000 References: <3F9E319A.3020203@student.umist.ac.uk> Message-ID: <20031028104223.A7600@gewis.win.tue.nl> On Tue, Oct 28, 2003 at 09:06:34AM +0000, Iain Bapty wrote: > As you will probably know by my few posts. I had been intending on > producing an archiver for my project. Unfortunately, yesterday I learned > about a project (SMART Archiver) that pretty much does everything I was > intending to do, which makes me pretty redundant. Sorry, to shatter your plans. I wouldn't call you or a new archiver project redundant. I think it was very good of you to ask the community for input on your ideas. > I'm quite desperate, Barry suggested that I ask here if anyone can > suggest any other work I could do for my project? > > My project has to be a substantial piece of individual work that > actually produces some kind of software artifact. 
It should be produced > independently, I shouldn't be reliant on anyone else in order to > complete the project. It also has to be challenging, in order to get > good marks I have to solve problems, it cannot just be something like > bug fixing an existing program. I am meant to spend 200 hours on my > project, including report writing and presentation writing. Well, I think you underestimated the work of creating an archiver you described (especially if you would have implemented every wish of the mailman community ;). I don't know how long the SmartArchiver people worked on the project, but I think it was done with more people and more time. Things you might consider: - Extension to SmartArchiver/mailman2/mailman3. I think it should be possible to define a project which you can do on your own, and contribute to other projects. E.g. creating a good persistance layer for an archiver, for example using Ape: http://hathaway.freezope.org/Software/Ape - making a good spam handling mechanism for mailman (including the ability to approve messages through the web, be able to monitor the spam, etc) and being able to report spam-messages to Pyzor/Razor/DCC. There are a lot of parts available, but I think good integration of those parts is lacking at this moment. I really think you could solve a problem for quite some people with this. See http://article.gmane.org/gmane.mail.mailman.devel/14910/ for my ideas. - ask the same question on the zope-dev or zope3-dev mailinglist. There aren't a lot applications for Zope3, and there may be great ideas for such small projects. > Thanks for any advice. 
You're welcome, Good luck, Pieter -- http://zwiki.org/PieterB From gary.frederick at jsoft.com Tue Oct 28 05:19:57 2003 From: gary.frederick at jsoft.com (Gary Frederick) Date: Tue Oct 28 05:20:06 2003 Subject: [Mailman-Developers] Requirements for a new archiver In-Reply-To: <1067293730.1785.96.camel@anthem> References: <3F9D08F9.6090209@student.umist.ac.uk> <20031027230854.A83988@gewis.win.tue.nl> <1067293730.1785.96.camel@anthem> Message-ID: <3F9E42CD.1070908@jsoft.com> Barry Warsaw wrote: trim > > FWIW, I think all this competition in replacement archives is a good > thing. What I really want though, is a standard interface/API/protocol > between Mailman and the archives. Here's why: trim Yes!!! An API lets us have choices. What does it take to get the standard interface/API/protocol? Gary From mch at cix.compulink.co.uk Tue Oct 28 08:09:00 2003 From: mch at cix.compulink.co.uk (Mike Holderness) Date: Tue Oct 28 08:09:55 2003 Subject: [Mailman-Developers] == No Subject == Message-ID: On Mon, 27 Oct 2003 09:15:56 -0500 (EST) Dale Newfield wrote: > Subject: Re: [Mailman-Developers] Requirements for a new archiver > > On Mon, 27 Oct 2003, Iain Bapty wrote: > > 6. allow for full-text searching of the archives. > > 7. allow for filtering by date, author, and/or topic. > > I find any archiver without at least 6 and likely 7 to be unusable, and > an incredible waste of the user's time. But don't let that put you off doing what you can inside the 200-hour limit :-) A thread-only archive is better than nothing, at least for a small list... That said, I have my must-have for an archiver: > > 9. allow archives to be set as public or private. and my want-a-lot: > > 3. provide a web-based interface to those email-discussions. This will be based on HTML templates with variables in the same style as MailMan? Please?
Mike From amk at amk.ca Tue Oct 28 08:31:48 2003 From: amk at amk.ca (amk@amk.ca) Date: Tue Oct 28 08:31:55 2003 Subject: [Mailman-Developers] == No Subject == In-Reply-To: References: Message-ID: <20031028133148.GA1229@rogue.amk.ca> On Tue, Oct 28, 2003 at 01:09:00PM +0000, Mike Holderness wrote: > and my want-a-lot: > > > 3. provide a web-based interface to those email-discussions. This seems to require a serious amount of work, enough that Mailman stops being just a mailing list manager and starts looking like a generic message storage system that supports multiple interfaces (mail, web, NNTP). That's a fine change of direction for Mailman 3.0, should Barry want to pursue it, but it turns everything on its head. Currently Mailman expends a lot of effort on managing users and once a message is archived, Mailman never really deals with it again. In the new model, messages are just as important as users. Iain, a generic message store might provide a good project idea for you. It would be able to store messages, keeping track of author/subject/date/etc. and handling threading. It would be high-capacity, able to store lots of messages and access them quickly. There would be some defined API for adding messages and for searching them; input could come from mail messages, web forms, or NNTP postings, but you'd only do proof-of-concept implementations, and maybe not all three of them. Archived messages can be viewed via NNTP and a Web interface; again, you might only produce a proof-of-concept of these. 
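To make the shape of such a store concrete, here is a small in-memory sketch of the add/search/thread API described above. Everything in it is an assumption for illustration: the class and method names are invented, nothing is persisted, and threading relies only on the first Message-ID found in In-Reply-To (some mail readers append extra text after the id, as seen in headers on this list).

```python
import re
from email import message_from_string
from email.utils import parseaddr

_MSGID = re.compile(r"<[^<>]+>")


def _first_id(value):
    """Return the first <...> token in a header value, or None."""
    match = _MSGID.search(value or "")
    return match.group(0) if match else None


class MessageStore:
    """Hypothetical store: keyed by Message-ID, threaded by In-Reply-To."""

    def __init__(self):
        self._by_id = {}
        self._replies = {}     # parent Message-ID (or None) -> child ids

    def add(self, text):
        msg = message_from_string(text)
        msgid = _first_id(msg["Message-ID"])
        if msgid is None:
            raise ValueError("message has no Message-ID")
        if msgid in self._by_id:           # adding twice is a no-op
            return msgid
        self._by_id[msgid] = msg
        parent = _first_id(msg["In-Reply-To"])
        self._replies.setdefault(parent, []).append(msgid)
        return msgid

    def get(self, msgid):
        return self._by_id[msgid]

    def search(self, author=None, subject=None):
        """Case-insensitive substring match on sender address and subject."""
        hits = []
        for msgid, msg in self._by_id.items():
            addr = parseaddr(msg["From"] or "")[1].lower()
            subj = (msg["Subject"] or "").lower()
            if author and author.lower() not in addr:
                continue
            if subject and subject.lower() not in subj:
                continue
            hits.append(msgid)
        return hits

    def thread(self, root, depth=0):
        """Depth-first list of (depth, Message-ID) starting at root."""
        out = [(depth, root)]
        for child in self._replies.get(root, []):
            out.extend(self.thread(child, depth + 1))
        return out
```

Mail, web form, and NNTP front ends would each reduce to a call to add(), and a web or NNTP view would be built on get(), search(), and thread().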
--amk From amk at amk.ca Tue Oct 28 08:48:13 2003 From: amk at amk.ca (amk@amk.ca) Date: Tue Oct 28 08:48:18 2003 Subject: [Mailman-Developers] Re: Requirements for a new archiver In-Reply-To: <30982.1067320406@kanga.nu> References: <3F9D08F9.6090209@student.umist.ac.uk> <1067285190.16085.79.camel@localhost.localdomain> <1067285548.1785.62.camel@anthem> <20031028000247.GD5392@dust.uchicago.edu> <1067300194.1066.39.camel@anthem> <21426.1067319883@kanga.nu> <30982.1067320406@kanga.nu> Message-ID: <20031028134813.GB1229@rogue.amk.ca> On Tue, Oct 28, 2003 at 12:53:26AM -0500, J C Lawrence wrote: > For me that came out to inn2, Message-IDs, We:Search, and a mix of a > MeoWWW derivative (HTTP) and NNTP straight to inn2. The critical point It's certainly possible to assemble a set of packages that provide a featureful search, but does this raise the bar for installing Mailman too much? SMARTarchiver would require that you have PostgreSQL; is adding that dependency OK? Barry, what's your opinion? --amk From kmccann at bellanet.org Tue Oct 28 09:45:21 2003 From: kmccann at bellanet.org (Kevin McCann) Date: Tue Oct 28 09:46:40 2003 Subject: [Mailman-Developers] Re: Requirements for a new archiver In-Reply-To: <20031028134813.GB1229@rogue.amk.ca> References: <3F9D08F9.6090209@student.umist.ac.uk> <1067285190.16085.79.camel@localhost.localdomain> <1067285548.1785.62.camel@anthem> <20031028000247.GD5392@dust.uchicago.edu> <1067300194.1066.39.camel@anthem> <21426.1067319883@kanga.nu> <30982.1067320406@kanga.nu> <20031028134813.GB1229@rogue.amk.ca> Message-ID: <1067352320.16085.151.camel@localhost.localdomain> On Tue, 2003-10-28 at 08:48, amk@amk.ca wrote: > It's certainly possible to assemble a set of packages that provide a > featureful search, but does this raise the bar for installing Mailman too > much? SMARTarchiver would require that you have PostgreSQL; is adding that > dependency OK? Barry, what's your opinion? 
And the SmartArchiver developer is planning to support MySQL, too. When this happens you'd only need to have one of the two databases. - Kevin From juanen at metropoli2000.com Tue Oct 28 09:56:16 2003 From: juanen at metropoli2000.com (Juan Enrique Gómez) Date: Tue Oct 28 09:56:56 2003 Subject: [Mailman-Developers] LOCK files craziness Message-ID: <1067352975.13105.6.camel@amspoke> Hi! I have noticed that when someone uses the admin HTML interface, the Python script creates a lock file. Sometimes, because the database for some lists is very large, Apache kills the script on a timeout, and the lock file stays in the locks directory until I delete it manually. People then reload the page, generating additional locks, and the list stops working altogether. Is there any way I can fix this? Or could the Mailman daemon check for stale lock files and delete them when the owning process is no longer running? Thanks in advance. -- -------------------------------------------------------------------------- |Juan Enrique Gomez Perez |Metropoli2000 Networks, S.L. | Phone: +34 914250023 Fax: +34 914250136 | email: juan.enrique.gomez@metropoli2000.com -------------------------------------------------------------------------- PGP Fingerprint: 6B39 3A2B A17B 1E8E CFFD FC14 678E 0A22 BD80 C486 Public Key: http://pgp.mit.edu:11371/pks/lookup?op=get&search=0xBD80C486 If this message does not carry a correct signature, please report it to juanen@metropoli2000.com. To get the public key please use the above URL. Thanks for your help. -------------------------------------------------------------------------- -------------- next part -------------- A non-text attachment was scrubbed...
Name: not available Type: application/pgp-signature Size: 189 bytes Desc: Esta parte del mensaje =?ISO-8859-1?Q?est=E1?= firmada digitalmente Url : http://mail.python.org/pipermail/mailman-developers/attachments/20031028/c9efa01e/attachment-0001.bin From amk at amk.ca Tue Oct 28 10:10:00 2003 From: amk at amk.ca (amk@amk.ca) Date: Tue Oct 28 10:10:04 2003 Subject: [Mailman-Developers] Re: Requirements for a new archiver In-Reply-To: <1067352320.16085.151.camel@localhost.localdomain> References: <3F9D08F9.6090209@student.umist.ac.uk> <1067285190.16085.79.camel@localhost.localdomain> <1067285548.1785.62.camel@anthem> <20031028000247.GD5392@dust.uchicago.edu> <1067300194.1066.39.camel@anthem> <21426.1067319883@kanga.nu> <30982.1067320406@kanga.nu> <20031028134813.GB1229@rogue.amk.ca> <1067352320.16085.151.camel@localhost.localdomain> Message-ID: <20031028151000.GA1664@rogue.amk.ca> On Tue, Oct 28, 2003 at 09:45:21AM -0500, Kevin McCann wrote: > And the SmartArchiver developer is planning to support MySQL, too. When > this happens you'd only need to have one of the two databases. Yeah, but installing MySQL isn't trivial either. --amk From claw at kanga.nu Tue Oct 28 10:24:45 2003 From: claw at kanga.nu (J C Lawrence) Date: Tue Oct 28 10:24:57 2003 Subject: [Mailman-Developers] Archiver In-Reply-To: Message from Iain Bapty of "Tue, 28 Oct 2003 09:06:34 GMT." <3F9E319A.3020203@student.umist.ac.uk> References: <3F9E319A.3020203@student.umist.ac.uk> Message-ID: <14305.1067354685@kanga.nu> On Tue, 28 Oct 2003 09:06:34 +0000 Iain Bapty wrote: > I'm quite desperate, Barry suggested that I ask here if anyone can > suggest any other work I could do for my project? A project that is near and dear to my heart, and which I have on my TODO list for that rainy day is to extend TMDA into being a sample implementation of Yakov Shafranovich's consent system as defined here: http://www.ietf.org/internet-drafts/draft-irtf-asrg-cri-00.txt It would be an excellent thing... 
-- J C Lawrence ---------(*) Satan, oscillate my metallic sonatas. claw@kanga.nu He lived as a devil, eh? http://www.kanga.nu/~claw/ Evil is a name of a foeman, as I live. From claw at kanga.nu Tue Oct 28 10:18:33 2003 From: claw at kanga.nu (J C Lawrence) Date: Tue Oct 28 10:29:20 2003 Subject: [Mailman-Developers] Re: Requirements for a new archiver In-Reply-To: Message from amk@amk.ca of "Tue, 28 Oct 2003 08:48:13 EST." <20031028134813.GB1229@rogue.amk.ca> References: <3F9D08F9.6090209@student.umist.ac.uk> <1067285190.16085.79.camel@localhost.localdomain> <1067285548.1785.62.camel@anthem> <20031028000247.GD5392@dust.uchicago.edu> <1067300194.1066.39.camel@anthem> <21426.1067319883@kanga.nu> <30982.1067320406@kanga.nu> <20031028134813.GB1229@rogue.amk.ca> Message-ID: <7319.1067354313@kanga.nu> On Tue, 28 Oct 2003 08:48:13 -0500 amk wrote: > On Tue, Oct 28, 2003 at 12:53:26AM -0500, J C Lawrence wrote: > SMARTarchiver would require that you have PostgreSQL; is adding that > dependency OK? Barry, what's your opinion? I wouldn't consider it an acceptable dependency. This is where things like We:Search and the like come in (albeit We:Search is optimised for far larger message stores than Mailman typically handles). You get a simple tool which does something simple, providing the basics, and which can be easily extracted and replaced plugin-style should that be desired. Mailman is growing up. It is rapidly becoming a message processing system or framework, not just a mailing list manager. -- J C Lawrence ---------(*) Satan, oscillate my metallic sonatas. claw@kanga.nu He lived as a devil, eh? http://www.kanga.nu/~claw/ Evil is a name of a foeman, as I live. From claw at kanga.nu Tue Oct 28 11:41:58 2003 From: claw at kanga.nu (J C Lawrence) Date: Tue Oct 28 11:42:08 2003 Subject: [Mailman-Developers] please, help with mm bugs (fwd) Message-ID: <7427.1067359318@kanga.nu> An embedded message was scrubbed... From: "Pablo Chamorro C." 
Subject: please, help with mm bugs Date: Tue, 28 Oct 2003 11:18:28 -0500 (COT) Size: 6286 Url: http://mail.python.org/pipermail/mailman-developers/attachments/20031028/2a91169a/attachment.mht From barry at python.org Tue Oct 28 12:42:21 2003 From: barry at python.org (Barry Warsaw) Date: Tue Oct 28 12:43:08 2003 Subject: [Mailman-Developers] Requirements for a new archiver In-Reply-To: <13752.1067319503@kanga.nu> References: <3F9D08F9.6090209@student.umist.ac.uk> <1067285190.16085.79.camel@localhost.localdomain> <1067285548.1785.62.camel@anthem> <13752.1067319503@kanga.nu> Message-ID: <1067362940.1235.34.camel@geddy> On Tue, 2003-10-28 at 00:38, J C Lawrence wrote: > It would take little to bolt the NNTP support from Twisted into Mailman > and then grab something like MeoWWW for the archive presentation. It's > quite useful as-is. My google-fu is low today. I'm not finding any usable links to MeoWWW. I found some reference to it on newsreaders.com, but both links were dead. > Bingo, plus the option of opening the NNTP interface to external users > as another list reading/posting method. Now Mailman can become an > all-in-one lightweight archiving news and mailing list system for > precious little expense. I think you're reading my mind. :) -Barry From barry at python.org Tue Oct 28 12:49:51 2003 From: barry at python.org (Barry Warsaw) Date: Tue Oct 28 12:50:38 2003 Subject: [Mailman-Developers] Requirements for a new archiver In-Reply-To: <17914.1067319714@kanga.nu> References: <3F9D08F9.6090209@student.umist.ac.uk> <20031027230854.A83988@gewis.win.tue.nl> <1067293730.1785.96.camel@anthem> <17914.1067319714@kanga.nu> Message-ID: <1067363390.1235.40.camel@geddy> On Tue, 2003-10-28 at 00:41, J C Lawrence wrote: > If the URL is predictably based on the Message-ID this is not a problem. > Quite, this is how/why NNTP uses Message-IDs as unique indexing > qualifiers. Yep, and we'd have to do the same thing (ensure that Message-IDs are unique).
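To make the "URL predictably based on the Message-ID" idea concrete, here is a small sketch. The base URL, the function name archive_url, and the choice of SHA-1 plus base32 are all assumptions made for illustration; nothing in this thread says Mailman does it this way.

```python
import base64
import hashlib

# Hypothetical archive root; a real site would configure its own.
BASE_URL = "http://lists.example.org/archive"


def archive_url(list_name: str, message_id: str) -> str:
    """Derive a stable archive URL from a Message-ID alone.

    The ID is normalised by dropping surrounding whitespace and angle
    brackets, hashed with SHA-1, and base32-encoded so the result is
    safe to use as a URL path component.
    """
    normalized = message_id.strip().strip("<>")
    digest = hashlib.sha1(normalized.encode("utf-8")).digest()
    token = base64.b32encode(digest).decode("ascii")
    return f"{BASE_URL}/{list_name}/{token}"
```

Because the URL depends only on the list name and the Message-ID, it can be computed before the message reaches the archive, for example to place it in the outgoing copy. That only works if Message-IDs really are unique within the list.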
Note that we sometimes get shit from people who complain about Mailman's NNTP posting code modifying Message-IDs to adhere to the stricter NNTP requirements. But Mailman can't rely on the good graces of remote mail tools to ensure globally unique Message-IDs, so it has to check and munge if it gets a dup (or, it's within its rights to treat a dup /as/ a dup, e.g. discarding it). -Barry From claw at kanga.nu Tue Oct 28 13:13:17 2003 From: claw at kanga.nu (J C Lawrence) Date: Tue Oct 28 13:13:24 2003 Subject: [Mailman-Developers] Requirements for a new archiver In-Reply-To: Message from Barry Warsaw of "Tue, 28 Oct 2003 12:42:21 EST." <1067362940.1235.34.camel@geddy> References: <3F9D08F9.6090209@student.umist.ac.uk> <1067285190.16085.79.camel@localhost.localdomain> <1067285548.1785.62.camel@anthem> <13752.1067319503@kanga.nu> <1067362940.1235.34.camel@geddy> Message-ID: <17724.1067364797@kanga.nu> On Tue, 28 Oct 2003 12:42:21 -0500 Barry Warsaw wrote: > On Tue, 2003-10-28 at 00:38, J C Lawrence wrote: >> It would take little to bolt the NNTP support from Twisted into >> Mailman and then grab something like MeoWWW for the archive >> presentation. It's quite useful as-is. > My google-fu is low today. I'm not finding any usable links to > MeoWWW. I found some reference to it on newsreaders.com, but both > links were dead. Yeah, I remember it being hard to find, but it is out there. The following holds a copy of what I have: ftp://ftp.kanga.nu/pub/users/claw/odd/meowww.tar.gz > I think you're reading my mind. :) Heck, I may yet cash in that beer I owe you. Eurogames is coming up in early November, and I might just be persuaded to head down for the Sunday... http://euroquest.gamesclubofmd.org/ -- J C Lawrence ---------(*) Satan, oscillate my metallic sonatas. claw@kanga.nu He lived as a devil, eh? http://www.kanga.nu/~claw/ Evil is a name of a foeman, as I live.
From claw at kanga.nu Tue Oct 28 13:30:23 2003 From: claw at kanga.nu (J C Lawrence) Date: Tue Oct 28 13:30:37 2003 Subject: [Mailman-Developers] Requirements for a new archiver In-Reply-To: Message from Barry Warsaw of "Tue, 28 Oct 2003 12:49:51 EST." <1067363390.1235.40.camel@geddy> References: <3F9D08F9.6090209@student.umist.ac.uk> <20031027230854.A83988@gewis.win.tue.nl> <1067293730.1785.96.camel@anthem> <17914.1067319714@kanga.nu> <1067363390.1235.40.camel@geddy> Message-ID: <4447.1067365823@kanga.nu> On Tue, 28 Oct 2003 12:49:51 -0500 Barry Warsaw wrote: > On Tue, 2003-10-28 at 00:41, J C Lawrence wrote: >> If the URL is predictably based on the Message-ID this is not a >> problem. >> Quite, this is how/why NNTP uses Message-IDs as unique indexing >> qualifiers. > Yep, and we'd have to do the same thing (ensure that Message-IDs are > unique). Yup. Of course this heads directly into that beautiful debate of whether MLMs should rewrite Message IDs. Summarising briefly: If we rewrite all IDs we'll piss off the people who use the ID to do dupe detection/deletion for courtesy copies. If we don't do some rewriting, some messages won't make it through NNTP and some other people will be pissed off. Two contrasting approaches: 1) We guarantee uniqueness of all Message IDs. The only way to do this is to rewrite all IDs. This will piss off some people. 2) We best-effort guarantee uniqueness by only guaranteeing uniqueness within the last N messages to the list. This could be done by rewriting all IDs, in which case we might as well guarantee total uniqueness, or it could be done by keeping a DB of the last N (cf CDBD) and either discarding or rewriting detected collisions. This of course means that some messages will be discarded by NNTP and we won't know about it. Some may be willing to accept those risks. Premise: Mailman just doesn't have enough configuration options.
My temptation would be to add the following configuration options to Mailman: Keep track of the Message IDs we've seen for the last N messages? (0 disables, value sets N) How to handle messages with IDs we've seen before? (discard/rewrite radio button) Replace the Message IDs for all the messages we rebroadcast with our own unique values? (checkbox) > Note that we sometimes get shit from people who complain about > Mailman's NNTP posting code modifying Message-IDs to adhere to the > stricter NNTP requirements. Aye. It is very easy to criticise. It is a little less easy to define, implement and maintain solutions which can't be criticised. > But Mailman can't rely on the good graces of remote mail tools to > ensure globally unique Message-IDs, so it has to check and munge if it > gets a dup (or, it's within its rights to treat a dup /as/ a dup, > e.g. discarding it). Yeah, thus the above logic. -- J C Lawrence ---------(*) Satan, oscillate my metallic sonatas. claw@kanga.nu He lived as a devil, eh? http://www.kanga.nu/~claw/ Evil is a name of a foeman, as I live. From jam at jamux.com Tue Oct 28 14:43:22 2003 From: jam at jamux.com (John A. Martin) Date: Tue Oct 28 14:47:11 2003 Subject: [Mailman-Developers] Requirements for a new archiver In-Reply-To: <4447.1067365823@kanga.nu> (J. C. Lawrence's message of "Tue, 28 Oct 2003 13:30:23 -0500") References: <3F9D08F9.6090209@student.umist.ac.uk> <20031027230854.A83988@gewis.win.tue.nl> <1067293730.1785.96.camel@anthem> <17914.1067319714@kanga.nu> <1067363390.1235.40.camel@geddy> <4447.1067365823@kanga.nu> Message-ID: <87ptgh16x1.fsf@athene.jamux.com> >>>>> "claw" == J C Lawrence >>>>> "Re: [Mailman-Developers] Requirements for a new archiver " >>>>> Tue, 28 Oct 2003 13:30:23 -0500 claw> 1) We guarantee uniqueness of all Message IDs. The only way claw> to do this is to rewrite all IDs. This will piss off some claw> people. Less perhaps if you use Resent-Message-ID, a la RFC 2822 Section 3.6.6, retaining the original.
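The three options proposed in this thread (track the last N Message IDs, discard or rewrite on a repeat, optionally rewrite everything) can be sketched roughly as below. MessageIdTracker and its parameter names are made up for illustration, and the domain "lists.example.org" is a placeholder; this is not Mailman code. Only email.utils.make_msgid is a real standard-library call.

```python
from collections import OrderedDict
from email.utils import make_msgid
from typing import Optional


class MessageIdTracker:
    """Hypothetical last-N Message-ID tracker with a discard/rewrite policy."""

    def __init__(self, window: int, on_duplicate: str = "discard",
                 rewrite_all: bool = False,
                 domain: str = "lists.example.org") -> None:
        if on_duplicate not in ("discard", "rewrite"):
            raise ValueError("on_duplicate must be 'discard' or 'rewrite'")
        self.window = window              # 0 disables tracking
        self.on_duplicate = on_duplicate
        self.rewrite_all = rewrite_all
        self.domain = domain
        self._seen: "OrderedDict[str, bool]" = OrderedDict()

    def process(self, message_id: str) -> Optional[str]:
        """Return the ID to rebroadcast with, or None to discard the message."""
        if self.rewrite_all:
            # Total uniqueness: every message gets a fresh ID of our own.
            return make_msgid(domain=self.domain)
        if self.window == 0:
            return message_id
        if message_id in self._seen:
            if self.on_duplicate == "discard":
                return None
            message_id = make_msgid(domain=self.domain)
        self._seen[message_id] = True
        # Forget the oldest IDs once more than N are remembered.
        while len(self._seen) > self.window:
            self._seen.popitem(last=False)
        return message_id
```

With the rewrite policy, the original ID could additionally be kept in a separate header, along the lines of the Resent-Message-ID suggestion; the sketch leaves header handling out. Note the best-effort nature: an ID that fell out of the window is accepted again.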
jam -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 154 bytes Desc: not available Url : http://mail.python.org/pipermail/mailman-developers/attachments/20031028/5249dd58/attachment.bin From dgc at uchicago.edu Tue Oct 28 15:30:36 2003 From: dgc at uchicago.edu (David Champion) Date: Tue Oct 28 15:31:11 2003 Subject: [Mailman-Developers] Re: Requirements for a new archiver In-Reply-To: <1067300194.1066.39.camel@anthem> References: <3F9D08F9.6090209@student.umist.ac.uk> <1067285190.16085.79.camel@localhost.localdomain> <1067285548.1785.62.camel@anthem> <20031028000247.GD5392@dust.uchicago.edu> <1067300194.1066.39.camel@anthem> Message-ID: <20031028203036.GH5392@dust.uchicago.edu> * On 2003.10.27, in <1067300194.1066.39.camel@anthem>, * "Barry Warsaw" wrote: > > Which does point to an alternative direction -- maybe we don't need any > direct connection to an html archive. Maybe the archiver should just be > a separate process that reads messages from the NNTP interface a MM3 > might export. Just blue-skying here. That's pretty much the ideological basis for what I have done. We have message-delivery protocols, and tools that know about messages; why keep trying to reinvent them over HTTP? My ideal list manager would export IMAP and/or NNTP interfaces, or would have a channel for providing messages and authentication to something else that exposes IMAP or NNTP (which is the route I took). Nobody needs web access: what they need is access via a web browser. With browsers that understand NNTP and IMAP prevalent, and with a wide selection of web-mail and web-news gateways for the cases where that doesn't work, this is sufficient. I favor IMAP over NNTP for this: 1. it appeals more to the way regular people think about lists: it's their mail, only it's on a server. Most people aren't much aware of or concerned with the similarities between news/NNTP and mail/IMAP. 2. 
many people have IMAP software. Fewer have or understand how to use NNTP software. 3. my server has mostly private lists, and I'm unsatisfied with the state of NNTP authentication compared to IMAP authentication. I want this primarily for archives that people need to authenticate to, not lists whose archives should be exposed to the public. But integrating with both is even better. -- -D. dgc@uchicago.edu University of Chicago > NSIT > VDN > ENSS > ENSA > You are here . . . . . . . always line up dots From jam at jamux.com Tue Oct 28 15:35:45 2003 From: jam at jamux.com (John A. Martin) Date: Tue Oct 28 15:36:02 2003 Subject: [Mailman-Developers] Requirements for a new archiver In-Reply-To: <3F9D08F9.6090209@student.umist.ac.uk> (Iain Bapty's message of "Mon, 27 Oct 2003 12:00:57 +0000") References: <3F9D08F9.6090209@student.umist.ac.uk> Message-ID: <87fzhd14ny.fsf@athene.jamux.com> >>>>> "Iain" == Iain Bapty >>>>> "[Mailman-Developers] Requirements for a new archiver" >>>>> Mon, 27 Oct 2003 12:00:57 +0000 Iain> Functional Requirements The archive component should Iain> 1. store email discussions. Iain> 2. integrate with Mailman. Iain> 3. provide a web-based interface to those email-discussions. Iain> 4. provide an interface that threads discussions by their Iain> content. (ZEST) Iain> 5. provide an interface that threads discussions by e-mail Iain> replies. Iain> 6. allow for full-text searching of the archives. Iain> 7. allow for filtering by date, author, and/or topic. Iain> 8. be MIME aware. Iain> 9. allow archives to be set as public or private. Iain> 10. allow posts to be added, deleted, and modified through Iain> web interface. Iain> 11. allow archives to be locked to prevent modification. Iain> 12. allow postings to be emailed. Iain> 13. allow postings to be referenced externally. Iain> There are two reasons I am posting this. Iain> Is there anything obvious that I have missed?
I hope 13 means that specific (list, range) messages can be retrieved from the archive by mail like Smartlist. 5 might want to include or allow choice to thread by subject or references as well like Gnus. Most importantly, 11, locking by site admin overriding virtual domain admin overriding list owner must IMHO be a prerequisite to allowing anybody to rewrite history (item 10). Iain> Which of the functional requirements, 6 to 13, do you feel Iain> are the most important? (As part of my report I have to Iain> analyse the requirements captured) 13 (like Smartlist), 9, 6, 7, ..., (11 before 10) HTH jam -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 154 bytes Desc: not available Url : http://mail.python.org/pipermail/mailman-developers/attachments/20031028/5e8beebd/attachment.bin From chuqui at plaidworks.com Tue Oct 28 15:48:46 2003 From: chuqui at plaidworks.com (Chuq Von Rospach) Date: Tue Oct 28 15:48:52 2003 Subject: [Mailman-Developers] Re: Requirements for a new archiver In-Reply-To: <20031028203036.GH5392@dust.uchicago.edu> References: <3F9D08F9.6090209@student.umist.ac.uk> <1067285190.16085.79.camel@localhost.localdomain> <1067285548.1785.62.camel@anthem> <20031028000247.GD5392@dust.uchicago.edu> <1067300194.1066.39.camel@anthem> <20031028203036.GH5392@dust.uchicago.edu> Message-ID: <20485453-0988-11D8-A02B-0003934516A8@plaidworks.com> On Oct 28, 2003, at 12:30 PM, David Champion wrote: > message-delivery protocols, and tools that know about messages; why > keep > trying to reinvent them over HTTP? because once you leave the niche of dealing with your fellow geeks, that's what users are going to want. browser access. NNTP is simply a non-issue any more, and iMap is fine, but they know how to go to a URL, don't assume they can reconfigure their mailer. 
Not saying don't do this, but if you write geek tools for geeks, you'll lose the rest of your audience, the non-technical users. > Nobody needs web access: what they need is > access via a web browser. With browsers that understand NNTP and IMAP > prevalent, and with a wide selection of web-mail and web-news gateways > for the cases where that doesn't work, this is sufficient. is it? it seems to me to (frankly) be a real hack with bad navigation, at least the stuff I've seen. I'd be happy to be proven wrong. > 2. many people have IMAP software. and in many cases, it's set up by someone else, and they have no clue how to tweak it on their own, or interest. And for intermittent or one-time access to an archive? won't bother. And how does it get into google so they know to look at it in the first place? I'm not really thrilled with this avenue. sorry. From kmccann at bellanet.org Tue Oct 28 16:20:29 2003 From: kmccann at bellanet.org (Kevin McCann) Date: Tue Oct 28 16:22:21 2003 Subject: [Mailman-Developers] Re: Requirements for a new archiver In-Reply-To: <20031028203036.GH5392@dust.uchicago.edu> References: <3F9D08F9.6090209@student.umist.ac.uk> <1067285190.16085.79.camel@localhost.localdomain> <1067285548.1785.62.camel@anthem> <20031028000247.GD5392@dust.uchicago.edu> <1067300194.1066.39.camel@anthem> <20031028203036.GH5392@dust.uchicago.edu> Message-ID: <1067376029.16085.252.camel@localhost.localdomain> > That's pretty much the ideological basis for what I have done. We have > message-delivery protocols, and tools that know about messages; why keep > trying to reinvent them over HTTP? There is a huge demand for web applications that use mailing list data. Mailing list archives in easily accessible databases will lead to killer community-building apps that *build* on the mailing list archives but offer other resources. NNTP access is fine, go ahead. And IMAP all you want.
But I really hope that the Mailman development community does not dismiss the *very strong desire* for flexible web scripting access to the goods. As far as I'm concerned, this is the only thing that's really holding Mailman back from being the tour de force product that it could be. I feel like I'm beating a dead horse, and I apologize if I'm being a pain-in-the-ass with this, but I think it's important. - Kevin From claw at kanga.nu Tue Oct 28 16:33:04 2003 From: claw at kanga.nu (J C Lawrence) Date: Tue Oct 28 16:39:49 2003 Subject: [Mailman-Developers] Re: Requirements for a new archiver In-Reply-To: Message from David Champion of "Tue, 28 Oct 2003 14:30:36 CST." <20031028203036.GH5392@dust.uchicago.edu> References: <3F9D08F9.6090209@student.umist.ac.uk> <1067285190.16085.79.camel@localhost.localdomain> <1067285548.1785.62.camel@anthem> <20031028000247.GD5392@dust.uchicago.edu> <1067300194.1066.39.camel@anthem> <20031028203036.GH5392@dust.uchicago.edu> Message-ID: <20097.1067376784@kanga.nu> On Tue, 28 Oct 2003 14:30:36 -0600 David Champion wrote: > With browsers that understand NNTP and IMAP prevalent, and with a wide > selection of web-mail and web-news gateways for the cases where that > doesn't work, this is sufficient. A minor problem with this is that news: URLs can't specify a server. However there is no URL standard for IMAP folders. > I favor IMAP over NNTP for this: ... > But integrating with both is even better. Setting up either, or in fact any other form of store (eg SQL) is trivial, and can be as simple as a procmail recipe or other filter hung off a process pipe. There's nothing unique or special about netnews servers or IMAP or POP or SQL, or whatever in this regard. What is special boils down to two points: 1) What is the primary retrieval key for the messages in the archive, and can Mailman know what the key for a given message is before submitting it to the store?
2) How can the messages in the store be otherwise indexed (eg full text search). #1 is the kicker. #2 is easily abstracted into any method or tool you want. IMAP is a message store with a single store-specific primary retrieval key. There is no standard method for knowing what that key is prior to inserting the message. NNTP-backed systems are also a message store, except that they support two primary retrieval keys, one per-store specific (message number) and one per-message specific (Message-ID). The Message-ID is of course known prior to insertion of a message into the store. Both these described characteristics are constant across all RFC-conforming netnews and IMAP systems. Some IMAP systems support in-band searching (cf Cyrus). This can't be relied on. There are however dozens if not scores of indexing systems which will index netnews spools, mail folders, or even HTML representations of new news spools, mail folders, etc. There's no reason to not leverage that wealth of capability. Search isn't and arguably shouldn't be Mailman's space. We can do something here, but it is not a core market or skill for the product. If Mailman had two things we'd seem to be 90% of the way there: 1) A default message store which also had a very trivial search capability. 2) The ability to pass arguments to a process stating the unique primary key of the message Mailman just submitted to the store (default or otherwise) so that the "search engine" could then index it. Twisted can provide a simple default message store based on netnews. Those interested can trivially use the current gating support to use other netnews systems ala inn2, CNews, etc instead should they wish. or nothing at all. MeoWWW provides a reasonably well featured NNTP-based newsgroup browser with full posting-via-web support. As a GPL pythonic CGI it is relatively trivial to incorporate it in Mailman. Search is a bit more of a mess. 
I'm not aware of any pythonic trivially integrated search tools ripe for the plucking. I like We:Search as it is fast, braindeadly simple, and has almost no dependencies. It is however also optimised for extremely large mail stores, which isn't Mailman's target, but that also doesn't hurt it. -- J C Lawrence ---------(*) Satan, oscillate my metallic sonatas. claw@kanga.nu He lived as a devil, eh? http://www.kanga.nu/~claw/ Evil is a name of a foeman, as I live. From claw at kanga.nu Tue Oct 28 16:46:02 2003 From: claw at kanga.nu (J C Lawrence) Date: Tue Oct 28 16:46:14 2003 Subject: [Mailman-Developers] Re: Requirements for a new archiver In-Reply-To: Message from Chuq Von Rospach of "Tue, 28 Oct 2003 12:48:46 PST." <20485453-0988-11D8-A02B-0003934516A8@plaidworks.com> References: <3F9D08F9.6090209@student.umist.ac.uk> <1067285190.16085.79.camel@localhost.localdomain> <1067285548.1785.62.camel@anthem> <20031028000247.GD5392@dust.uchicago.edu> <1067300194.1066.39.camel@anthem> <20031028203036.GH5392@dust.uchicago.edu> <20485453-0988-11D8-A02B-0003934516A8@plaidworks.com> Message-ID: <2726.1067377562@kanga.nu> On Tue, 28 Oct 2003 12:48:46 -0800 Chuq Von Rospach wrote: > On Oct 28, 2003, at 12:30 PM, David Champion wrote: >> message-delivery protocols, and tools that know about messages; why >> keep trying to reinvent them over HTTP? > because once you leave the niche of dealing with your fellow geeks, > that's what users are going to want. browser access. NNTP is simply a > non-issue any more, and iMap is fine, but they know how to go to a > URL, don't assume they can reconfigure their mailer. Quite. > Not saying don't do this, but if you write geek tools for geeks, > you'll lose the rest of your audience, the non-technical users. This is where the Twisted+MeoWWW approach seems attractive. If we go for Twisted+MeoWWW we get a flexible message store with a CGI-based web front end which also allows posting via web. 
The fact that it is netnews based can be, and should be, utterly transparent to the casual user. There's no need in fact for the NNTP interface to be exposed to arbitrary connections. However, should someone wish to set up news access to their lists or archives, that's as trivial as telling Mailman to open the port. If they should wish to use their already configured inn2 or whatever, that's as trivial as telling Mailman to deliver to inn2 instead of the Twisted Netnews server. But in the average case they don't have to, and don't have to care. Mailman just uses the netnews base of Twisted as a message store with a known-in-advance primary retrieval key. >> Nobody needs web access: what they need is access via a web >> browser. With browsers that understand NNTP and IMAP prevalent, and >> with a wide selection of web-mail and web-news gateways for the cases >> where that doesn't work, this is sufficient. > is it? it seems to me to (frankly) be a real hack with bad navigation, > at least the stuff I've seen. I'd be happy to be proven wrong. That's my interpretation as well. It serves some very narrow cases well, but not the general case. > And for intermittent or one-time access to an archive? won't > bother. And how does it get into google so they know to look at it in > the first place? That's where I like the MeoWWW approach. It's just another CGI, so it auto-installs as part of the Mailman CGIs without requiring specific SysAdm configuration. As it renders to HTML/HTTP Google will index it happily. If the list/group is configured to allow posting, then MeoWWW will happily provide a web-based way for arbitrary users coming into your archives via (say) Google search to post and respond to items in a list's archives. Of course standard poster/spam controls would/could be applied. While I see your concern for spending a whole lot of time investing and specialising in a netnews core for Mailman, I think the fear is displaced.
Mailman already bidirectionally gates to netnews. The changes required would be comparatively small for a significant gain in feature-set for the average non-geek case (better web archives, posting from web archives, more flexible message store, ability to have archive URL placed in broadcast messages, etc). -- J C Lawrence ---------(*) Satan, oscillate my metallic sonatas. claw@kanga.nu He lived as a devil, eh? http://www.kanga.nu/~claw/ Evil is a name of a foeman, as I live. From claw at kanga.nu Tue Oct 28 17:06:45 2003 From: claw at kanga.nu (J C Lawrence) Date: Tue Oct 28 17:06:52 2003 Subject: [Mailman-Developers] Re: Requirements for a new archiver In-Reply-To: Message from J C Lawrence of "Tue, 28 Oct 2003 16:46:02 EST." <2726.1067377562@kanga.nu> References: <3F9D08F9.6090209@student.umist.ac.uk> <1067285190.16085.79.camel@localhost.localdomain> <1067285548.1785.62.camel@anthem> <20031028000247.GD5392@dust.uchicago.edu> <1067300194.1066.39.camel@anthem> <20031028203036.GH5392@dust.uchicago.edu> <20485453-0988-11D8-A02B-0003934516A8@plaidworks.com> <2726.1067377562@kanga.nu> Message-ID: <27346.1067378805@kanga.nu> On Tue, 28 Oct 2003 16:46:02 -0500 J C Lawrence wrote: > While I see your concern for spending a whole lot of time investing > and specialising in a netnews core for Mailman, I think the fear is > displaced. Mailman already bidirectionally gates to netnews. The > changes required would be comparatively small for a significant gain > in feature-set for the average non-geek case (better web archives, > posting from web archives, more flexible message store, ability to > have archive URL placed in broadcast messages, etc). I wrote this badly. I'm looking at this in a rather abstract light, and haven't really stated that fact. From here this is all a question of abstraction models. 
Currently Mailman doesn't have a clean or well defined abstraction model among the list processor, the archives, the message store, the web presentation layers, or search engine support. Gaining a clean and well defined abstraction layer that does the Right Things would do a whole bunch of useful things, like allow for far easier and cleaner plugin-style approaches to any of the core components: message store, search support, web archives, and message access. Right now those four are a somewhat incestuous mess and pulling them apart into a clean model is a bitch for an external integrator. Harking back to the earlier process queues model for Mailman 3.0, the same abstraction and division gains happen again. We can implement (say) Twisted as a trivial message store. Really what we're doing however is stating that Mailman can use any store you want as long as messages can be sent to it via process pipe, SMTP, or NNTP. We can implement a CGI-based web interface to that NNTP store. Really what we're saying however is that Mailman will provide a web-based interface for archive browsing and message posting to its own default message store, or any NNTP-based message store you implement. We can implement a plugin layer for external search engines as we now know (control) the unique primary retrieval key for messages in the default store (and thus the default URL etc). Or, if you wish, what we're really saying is that Mailman's archives can be integrated with any external search engine that can be handed the retrieval key and can index the message appropriately (via web scrape, custom store access, NNTP, whatever). Sure, we can grow up and feature-fluff the search side later. I'm a little less concerned with that at this point. I'm more concerned with gaining a clean abstraction model among the store/retrieve/present components of the archives. Sure, we could also decompose the messages into SQL structures.
Having done this, I'll briefly note that it is not a trivial task on several scores (MIME, data partitioning, etc). This isn't to say that it's horrible, merely not trivial. Picking up Twisted's store is about as close as you can get to trivial. You don't get all the MIME decomposition etc that you might get with an SQL store, but having adopted the abstraction model it is then much simpler for someone to write an SQL-based message store that Mailman can talk to cleanly. In an OpenSource world smaller, shorter feedback loops based on simpler iterative steps are almost always the better choice. -- J C Lawrence ---------(*) Satan, oscillate my metallic sonatas. claw@kanga.nu He lived as a devil, eh? http://www.kanga.nu/~claw/ Evil is a name of a foeman, as I live. From dgc at uchicago.edu Tue Oct 28 17:30:21 2003 From: dgc at uchicago.edu (David Champion) Date: Tue Oct 28 17:30:54 2003 Subject: [Mailman-Developers] Re: Requirements for a new archiver In-Reply-To: <20485453-0988-11D8-A02B-0003934516A8@plaidworks.com> References: <3F9D08F9.6090209@student.umist.ac.uk> <1067285190.16085.79.camel@localhost.localdomain> <1067285548.1785.62.camel@anthem> <20031028000247.GD5392@dust.uchicago.edu> <1067300194.1066.39.camel@anthem> <20031028203036.GH5392@dust.uchicago.edu> <20485453-0988-11D8-A02B-0003934516A8@plaidworks.com> Message-ID: <20031028223021.GI5392@dust.uchicago.edu> * On 2003.10.28, in <20485453-0988-11D8-A02B-0003934516A8@plaidworks.com>, * "Chuq Von Rospach" wrote: > > because once you leave the niche of dealing with your fellow geeks, > that's what users are going to want. browser access. NNTP is simply a It's the "non-geeks" I'm trying to help: I support 25,000 of them, and I really don't worry much about the "geeks". They know how to do for themselves. > non-issue any more, and iMap is fine, but they know how to go to a URL, > don't assume they can reconfigure their mailer.
Where I work -- and I know that it's not like this everywhere, but I have to assume we're not the only place like this -- we configure users' mailers for them initially. (So we can configure in access to our list server(s).) We have a telephone support line that regularly works people through mailer issues. Here, reconfiguring a mailer is not a hard problem, compared to getting usable HTML archives in a supportable server configuration. > Not saying don't do this, but if you write geek tools for geeks, you'll > lose the rest of your audience, the non-technical users. Agreed, but I don't think I'm proposing "geek tools". I'm trying to establish a shared pathway for getting into a message archive that lets geeks use their tools, and non-geeks use theirs, equally. > >Nobody needs web access: what they need is > >access via a web browser. With browsers that understand NNTP and IMAP > >prevalent, and with a wide selection of web-mail and web-news gateways > >for the cases where that doesn't work, this is sufficient. > > is it? it seems to me to (frankly) be a real hack with bad navigation, > at least the stuff I've seen. I'd be happy to be proven wrong. What seems to have bad navigation? I'm not sure what component you mean. I would say that webmail programs generally are awful, but I know that 40% of my users love using them. I think it's also relevant that every web-based list archive I've ever used is atrocious for navigation; their only selling points seem to be ease of referral and indexing. (And yes, I agree that these are important elements.) But granted, this is an overzealous assessment. I should say: IMAP and NNTP access are sufficient for certain environments of which I believe mine is an example. > And for intermittent or one-time access to an archive? won't bother. > And how does it get into google so they know to look at it in the first > place? Again, I'm not talking (for the most part) about public-access lists. 
I'm talking about private communities consisting mostly of people within a common real-world context. Perhaps I should have made that more clear. This happens a lot: I seriously doubt that most mailing lists, even most Mailman mailing lists, are public. > I'm not really thrilled with this avenue. sorry. Don't be sorry. I want to google certain lists as much as the next person, and I know that this model doesn't work as well as HTML archives in that respect (though I will note as a sidebar that Google happily indexes NNTP servers). I'm not trying to kill the web archive before its time, and nothing I've proposed obviates having one. All I've described is a parallel mode of access that I believe is more appropriate and more useful in some settings. We're already plugging external archivers into Mailman now, and nothing in this suggestion prevents us from continuing to do that. The only potential change, I would say, is that in one design, archivers would pull from NNTP, IMAP, or a message store, rather than actively being fed articles. I don't particularly advocate that, lacking a better understanding of the internals of the list server. I take no issue with leaving in a means of delivering messages to archivers, I just would like to see it become one of several message access channels, preferably all with some shared interface to the core processor. -- -D. dgc@uchicago.edu University of Chicago > NSIT > VDN > ENSS > ENSA > You are here . . . . . . . 
always line up dots From brad.knowles at skynet.be Wed Oct 29 10:13:38 2003 From: brad.knowles at skynet.be (Brad Knowles) Date: Wed Oct 29 10:49:40 2003 Subject: [Mailman-Developers] Requirements for a new archiver In-Reply-To: <1067285190.16085.79.camel@localhost.localdomain> References: <3F9D08F9.6090209@student.umist.ac.uk> <1067285190.16085.79.camel@localhost.localdomain> Message-ID: At 3:06 PM -0500 2003/10/27, Kevin McCann wrote: > I was thinking about using MHonarc to enhance the archive experience but > it doesn't work with MySQL directly so Mail::Box just might be what the > doctor ordered. No database handles "BLOB" (Binary Large OBject) storage well. Even high-end databases have problems in this area. IMO, this is a bad idea. Better would be to use a mailbox format that handles simultaneous multiple access reasonably well. You can use c-client and mbx format, or MH format, or something else reasonably decent. -- Brad Knowles, "They that can give up essential liberty to obtain a little temporary safety deserve neither liberty nor safety." -Benjamin Franklin, Historical Review of Pennsylvania. 
GCS/IT d+(-) s:+(++)>: a C++(+++)$ UMBSHI++++$ P+>++ L+ !E-(---) W+++(--) N+ !w--- O- M++ V PS++(+++) PE- Y+(++) PGP>+++ t+(+++) 5++(+++) X++(+++) R+(+++) tv+(+++) b+(++++) DI+(++++) D+(++) G+(++++) e++>++++ h--- r---(+++)* z(+++) From brad.knowles at skynet.be Wed Oct 29 10:31:53 2003 From: brad.knowles at skynet.be (Brad Knowles) Date: Wed Oct 29 10:49:47 2003 Subject: [Mailman-Developers] Re: Requirements for a new archiver In-Reply-To: <30982.1067320406@kanga.nu> References: <3F9D08F9.6090209@student.umist.ac.uk> <1067285190.16085.79.camel@localhost.localdomain> <1067285548.1785.62.camel@anthem> <20031028000247.GD5392@dust.uchicago.edu> <1067300194.1066.39.camel@anthem> <21426.1067319883@kanga.nu> <30982.1067320406@kanga.nu> Message-ID: At 12:53 AM -0500 2003/10/28, J C Lawrence wrote: > For me that came out to inn2, Message-IDs, We:Search, and a mix of a > MeoWWW derivative (HTTP) and NNTP straight to inn2. The critical point > is that each bit of that amalgamation is independent and can be > trivially swapped out for something else that suits a local use case > better. One of the hallmarks of mailman is that it provides almost all the features anyone would be likely to want, out-of-the-box. Maybe they're not the 100% ideal solution, but they're usually more than sufficient. You get a decent archive solution provided "for free", as part of the base package. If you throw that out and go with a modular approach based on any number of external third-party add-in tools, then you might as well go back to Majordomo. -- Brad Knowles, "They that can give up essential liberty to obtain a little temporary safety deserve neither liberty nor safety." -Benjamin Franklin, Historical Review of Pennsylvania. 
GCS/IT d+(-) s:+(++)>: a C++(+++)$ UMBSHI++++$ P+>++ L+ !E-(---) W+++(--) N+ !w--- O- M++ V PS++(+++) PE- Y+(++) PGP>+++ t+(+++) 5++(+++) X++(+++) R+(+++) tv+(+++) b+(++++) DI+(++++) D+(++) G+(++++) e++>++++ h--- r---(+++)* z(+++) From brad.knowles at skynet.be Wed Oct 29 10:29:09 2003 From: brad.knowles at skynet.be (Brad Knowles) Date: Wed Oct 29 10:49:51 2003 Subject: [Mailman-Developers] Requirements for a new archiver In-Reply-To: <17914.1067319714@kanga.nu> References: <3F9D08F9.6090209@student.umist.ac.uk> <20031027230854.A83988@gewis.win.tue.nl> <1067293730.1785.96.camel@anthem> <17914.1067319714@kanga.nu> Message-ID: At 12:41 AM -0500 2003/10/28, J C Lawrence wrote: > Quite, this is how/why NNTP uses Message-IDs are unique indexing > qualifiers. Problem is that client-assigned message-ids are not guaranteed unique. Too many people are using RFC 1918 private addressing space, and if the machine doesn't know its own name, then it stuffs in just the IP address for that portion. Everything else could quite feasibly collide, and you'd wind up with multiple non-unique message-ids. You need a guaranteed unique id to be used as a primary index field. -- Brad Knowles, "They that can give up essential liberty to obtain a little temporary safety deserve neither liberty nor safety." -Benjamin Franklin, Historical Review of Pennsylvania.
GCS/IT d+(-) s:+(++)>: a C++(+++)$ UMBSHI++++$ P+>++ L+ !E-(---) W+++(--) N+ !w--- O- M++ V PS++(+++) PE- Y+(++) PGP>+++ t+(+++) 5++(+++) X++(+++) R+(+++) tv+(+++) b+(++++) DI+(++++) D+(++) G+(++++) e++>++++ h--- r---(+++)* z(+++) From brad.knowles at skynet.be Wed Oct 29 10:16:02 2003 From: brad.knowles at skynet.be (Brad Knowles) Date: Wed Oct 29 10:49:54 2003 Subject: [Mailman-Developers] Requirements for a new archiver In-Reply-To: <1067285548.1785.62.camel@anthem> References: <3F9D08F9.6090209@student.umist.ac.uk> <1067285190.16085.79.camel@localhost.localdomain> <1067285548.1785.62.camel@anthem> Message-ID: At 3:12 PM -0500 2003/10/27, Barry Warsaw wrote: > What would then be in the database would be records providing easy > lookup by message-id (at least) into the on-disk message store. Putting meta-data into the database would work. Then use that index information to actually access the files. I recommended the same in my invited talk at . Of course, if you're going to use a USENET interface, you should use Diablo as the back-end. ;-) -- Brad Knowles, "They that can give up essential liberty to obtain a little temporary safety deserve neither liberty nor safety." -Benjamin Franklin, Historical Review of Pennsylvania. 
GCS/IT d+(-) s:+(++)>: a C++(+++)$ UMBSHI++++$ P+>++ L+ !E-(---) W+++(--) N+ !w--- O- M++ V PS++(+++) PE- Y+(++) PGP>+++ t+(+++) 5++(+++) X++(+++) R+(+++) tv+(+++) b+(++++) DI+(++++) D+(++) G+(++++) e++>++++ h--- r---(+++)* z(+++) From brad.knowles at skynet.be Wed Oct 29 10:49:14 2003 From: brad.knowles at skynet.be (Brad Knowles) Date: Wed Oct 29 10:49:57 2003 Subject: [Mailman-Developers] Re: Requirements for a new archiver In-Reply-To: <20031028203036.GH5392@dust.uchicago.edu> References: <3F9D08F9.6090209@student.umist.ac.uk> <1067285190.16085.79.camel@localhost.localdomain> <1067285548.1785.62.camel@anthem> <20031028000247.GD5392@dust.uchicago.edu> <1067300194.1066.39.camel@anthem> <20031028203036.GH5392@dust.uchicago.edu> Message-ID: At 2:30 PM -0600 2003/10/28, David Champion wrote: > Nobody needs web access: what they need is > access via a web browser. With browsers that understand NNTP and IMAP > prevalent, and with a wide selection of web-mail and web-news gateways > for the cases where that doesn't work, this is sufficient. You can't assume a homogenous client mix, one where a single program does everything. There are way more phone users than there are computer users, and the number of mobile phones in a growing number of countries exceeds the number of fixed lines. Mobile access to the web will be the next killer app. However, most of those phones might have some sort of a browser, but e-mail support would be from a separate program, and USENET news clients would be non-existent. You cannot assume a homogenous client mix. Moreover, you can't assume broad support for less common protocols like IMAP or NNTP. > 2. many people have IMAP software. Fewer have or understand how to use > NNTP software. Many more people have access to some sort of web browser than they do some sort of IMAP client. If you're going to do lowest-common-denominator, then IMAP loses. 
-- Brad Knowles, "They that can give up essential liberty to obtain a little temporary safety deserve neither liberty nor safety." -Benjamin Franklin, Historical Review of Pennsylvania. GCS/IT d+(-) s:+(++)>: a C++(+++)$ UMBSHI++++$ P+>++ L+ !E-(---) W+++(--) N+ !w--- O- M++ V PS++(+++) PE- Y+(++) PGP>+++ t+(+++) 5++(+++) X++(+++) R+(+++) tv+(+++) b+(++++) DI+(++++) D+(++) G+(++++) e++>++++ h--- r---(+++)* z(+++) From claw at kanga.nu Wed Oct 29 11:48:59 2003 From: claw at kanga.nu (J C Lawrence) Date: Wed Oct 29 11:49:07 2003 Subject: [Mailman-Developers] Requirements for a new archiver In-Reply-To: Message from Brad Knowles of "Wed, 29 Oct 2003 16:29:09 +0100." References: <3F9D08F9.6090209@student.umist.ac.uk> <20031027230854.A83988@gewis.win.tue.nl> <1067293730.1785.96.camel@anthem> <17914.1067319714@kanga.nu> Message-ID: <25767.1067446139@kanga.nu> On Wed, 29 Oct 2003 16:29:09 +0100 Brad Knowles wrote: > At 12:41 AM -0500 2003/10/28, J C Lawrence wrote: >> Quite, this is how/why NNTP uses Message-IDs are unique indexing >> qualifiers. > Problem is that client-assigned message-ids are not guaranteed > unique. Right, and that was the point. If we do nothing to Message IDs we don't change external behaviour. If we use a netnews backing store for the archives and we don't dick with the message IDs we run the risk of some messages never reaching the archives. If we use a netnews backing store and dick with message IDs we can offer various levels of guarantee that messages reach the archives, and of pissing off users because we messed with the Message IDs. As always, you get to pick. > Everything else could quite feasibly collide, and you'd wind up with > multiple non-unique message-ids. In which case the many people currently using ID-based dupe collapsing (eg default Exchange config) will lose messages, and the archives will lose messages....OR...we offer some level of guarantee (see yesterday's discussion) with the matching trade-offs. 
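A rough sketch of the trade-off described above, for anyone who wants to poke at it. This is only an illustration, not anything Mailman or a netnews store actually does: the Archive class, the on_collision setting, and the archive.invalid domain are all made up for the example. With "discard" the Message-ID is left alone and a colliding message never reaches the archive; with "rewrite" every message is kept, at the cost of changing the Message-ID of the later arrival.

```python
import uuid
from email.message import Message
from typing import Dict, Optional


class Archive:
    """Toy in-memory archive keyed by Message-ID (hypothetical, for illustration)."""

    def __init__(self, on_collision: str = "discard") -> None:
        if on_collision not in ("discard", "rewrite"):
            raise ValueError("on_collision must be 'discard' or 'rewrite'")
        self.on_collision = on_collision
        self.by_id: Dict[str, Message] = {}

    def add(self, msg: Message) -> Optional[str]:
        """Store msg; return the key used, or None if the message was dropped."""
        raw = msg["Message-ID"]
        if raw is None:
            raise ValueError("message has no Message-ID header")
        key = str(raw).strip()
        if key in self.by_id:
            if self.on_collision == "discard":
                # Message-ID untouched, but this message is lost to the archive.
                return None
            # Keep the message, but under a freshly generated Message-ID.
            key = "<%s@archive.invalid>" % uuid.uuid4().hex
            msg.replace_header("Message-ID", key)
        self.by_id[key] = msg
        return key
```

Which of the two behaviours is acceptable is exactly the deployment choice being argued about; the sketch only makes the two outcomes concrete.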
> You need a guaranteed unique id to be used as a primary index field. "Need" is a strong word. It's very deployment and use-case sensitive. There are a large number of cases where I'm content to rest on the assurance that the Message IDs arriving at my lists will always be unique. There are also a large number of cases where I'm not willing to make that assessment, as well as a large number of cases where I'm willing to simply discard any duplicated Message ID messages at the archiver level. Similarly, there are cases where re-writing the Message IDs in any form is significantly troubling, and cases where it's not. "Need"? No. It is a deployment choice with easily understood ramifications. -- J C Lawrence ---------(*) Satan, oscillate my metallic sonatas. claw@kanga.nu He lived as a devil, eh? http://www.kanga.nu/~claw/ Evil is a name of a foeman, as I live. From claw at kanga.nu Wed Oct 29 11:51:12 2003 From: claw at kanga.nu (J C Lawrence) Date: Wed Oct 29 11:51:16 2003 Subject: [Mailman-Developers] Re: Requirements for a new archiver In-Reply-To: Message from Brad Knowles of "Wed, 29 Oct 2003 16:31:53 +0100." References: <3F9D08F9.6090209@student.umist.ac.uk> <1067285190.16085.79.camel@localhost.localdomain> <1067285548.1785.62.camel@anthem> <20031028000247.GD5392@dust.uchicago.edu> <1067300194.1066.39.camel@anthem> <21426.1067319883@kanga.nu> <30982.1067320406@kanga.nu> Message-ID: <28446.1067446272@kanga.nu> On Wed, 29 Oct 2003 16:31:53 +0100 Brad Knowles wrote: > At 12:53 AM -0500 2003/10/28, J C Lawrence wrote: >> For me that came out to inn2, Message-IDs, We:Search, and a mix of a >> MeoWWW derivative (HTTP) and NNTP straight to inn2. The critical >> point is that each bit of that amalgamation is independent and can be >> trivially swapped out for something else that suits a local use case >> better. > One of the hallmarks of mailman is that it provides almost all the > features anyone would be likely to want, out-of-the-box.
Maybe > they're not the 100% ideal solution, but they're usually more than > sufficient. You get a decent archive solution provided "for free", as > part of the base package. You're making my point for me again. By picking up the abstraction model I've suggested and tucking Twisted and MeoWWW under Mailman we do get the 80% solution bundled with Mailman. But we also get an easy abstraction model where any one of the pieces can be trivially replaced by something else, like the inn2 I mention. > If you throw that out and go with a modular approach based on any > number of external third-party add-in tools, then you might as well go > back to Majordomo. Which hasn't been proposed, even slightly. -- J C Lawrence ---------(*) Satan, oscillate my metallic sonatas. claw@kanga.nu He lived as a devil, eh? http://www.kanga.nu/~claw/ Evil is a name of a foeman, as I live. From claw at kanga.nu Wed Oct 29 12:45:10 2003 From: claw at kanga.nu (J C Lawrence) Date: Wed Oct 29 12:45:47 2003 Subject: [Mailman-Developers] Re: Requirements for a new archiver In-Reply-To: Message from Brad Knowles of "Wed, 29 Oct 2003 16:49:14 +0100." References: <3F9D08F9.6090209@student.umist.ac.uk> <1067285190.16085.79.camel@localhost.localdomain> <1067285548.1785.62.camel@anthem> <20031028000247.GD5392@dust.uchicago.edu> <1067300194.1066.39.camel@anthem> <20031028203036.GH5392@dust.uchicago.edu> Message-ID: <25394.1067449510@kanga.nu> On Wed, 29 Oct 2003 16:49:14 +0100 Brad Knowles wrote: > At 2:30 PM -0600 2003/10/28, David Champion wrote: > You cannot assume a homogenous client mix. Moreover, you can't assume > broad support for less common protocols like IMAP or NNTP. Apparently assumable/desirable protocols include: SMTP, HTTP, XML/RPC (on HTTP), SOAP (on HTTP or SMTP). Mailman currently supports the first two, with the caveat that it has no SMTP retrieval supports and no a priori primary key for HTTP.
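To make the "a priori primary key for HTTP" caveat concrete, here is a minimal sketch of one way such a key could be derived, assuming the Message-ID is the input. The function names, the base URL layout, and the choice of SHA-1 plus base32 are illustrative assumptions only; nothing in this thread says Mailman does it this way. The point is just that the list processor can compute the archive URL without asking the store first.

```python
import base64
import hashlib


def archive_key(message_id: str) -> str:
    """Derive a URL-safe retrieval key from a Message-ID (illustrative scheme)."""
    # Drop surrounding whitespace and angle brackets so "<a@b>" and "a@b" agree.
    bare = message_id.strip().lstrip("<").rstrip(">")
    digest = hashlib.sha1(bare.encode("utf-8")).digest()
    # A 20-byte digest encodes to exactly 32 base32 characters, no padding.
    return base64.b32encode(digest).decode("ascii")


def archive_url(base_url: str, message_id: str) -> str:
    """Build the archive URL for a message before the store has seen it."""
    return "%s/%s" % (base_url.rstrip("/"), archive_key(message_id))
```

Note that a key derived this way inherits whatever collisions the Message-IDs themselves have: two messages with the same Message-ID map to the same URL.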
Moving to a store which supports a Message ID primary key doesn't change the protocol list (tho it may extend it). However moving to such a store allows a simple extension to allow key-based retrieval via SMTP and HTTP, which are the primary protocols. Extending that down the road to XML/RPC and SOAP on any transport wouldn't be difficult, especially in the Python world (I'm beating on SOAPpy's MIME supports as I type this). What form the backing store takes is orthogonal to this aspect of the discussion. The key features are a priori key definition and key-based retrieval. Get those two and the rest become relatively trivial. The exact form of the backing store is irrelevant. Nobody cares. Protocol access and protocol behaviour (API) are the important bits. To date three backing stores have been proposed: Twisted's NNTP implementation, IMAP, and SQL. All could work. All could match the above discussion perfectly. Implementing each would require significantly different levels of effort. I'd posit that Twisted is the cheapest/easiest route simply due to it being pythonic and small. -- J C Lawrence ---------(*) Satan, oscillate my metallic sonatas. claw@kanga.nu He lived as a devil, eh? http://www.kanga.nu/~claw/ Evil is a name of a foeman, as I live. From brad.knowles at skynet.be Wed Oct 29 12:41:20 2003 From: brad.knowles at skynet.be (Brad Knowles) Date: Wed Oct 29 13:11:48 2003 Subject: [Mailman-Developers] Requirements for a new archiver In-Reply-To: <25767.1067446139@kanga.nu> References: <3F9D08F9.6090209@student.umist.ac.uk> <20031027230854.A83988@gewis.win.tue.nl> <1067293730.1785.96.camel@anthem> <17914.1067319714@kanga.nu> <25767.1067446139@kanga.nu> Message-ID: At 11:48 AM -0500 2003/10/29, J C Lawrence wrote: >> You need a guaranteed unique id to be used as a primary index field. > > "Need" is a strong word. Its very deployment and use-case sensitive. In the case of a database, it is a hard requirement. A primary index field must be guaranteed unique. 
There is absolutely no way around this issue. > "Need"? No. It is a deployment choice with easily understood > ramifications. Perhaps for the application, but this is a totally different ballgame when it comes to a database. Google for "primary index field", and hopefully you will understand. -- Brad Knowles, "They that can give up essential liberty to obtain a little temporary safety deserve neither liberty nor safety." -Benjamin Franklin, Historical Review of Pennsylvania. GCS/IT d+(-) s:+(++)>: a C++(+++)$ UMBSHI++++$ P+>++ L+ !E-(---) W+++(--) N+ !w--- O- M++ V PS++(+++) PE- Y+(++) PGP>+++ t+(+++) 5++(+++) X++(+++) R+(+++) tv+(+++) b+(++++) DI+(++++) D+(++) G+(++++) e++>++++ h--- r---(+++)* z(+++) From kmccann at bellanet.org Wed Oct 29 13:20:20 2003 From: kmccann at bellanet.org (Kevin McCann) Date: Wed Oct 29 13:21:05 2003 Subject: [Mailman-Developers] Requirements for a new archiver In-Reply-To: References: <3F9D08F9.6090209@student.umist.ac.uk> <1067285190.16085.79.camel@localhost.localdomain> Message-ID: <1067451620.11393.49.camel@localhost.localdomain> On Wed, 2003-10-29 at 10:13, Brad Knowles wrote: > At 3:06 PM -0500 2003/10/27, Kevin McCann wrote: > > > I was thinking about using MHonarc to enhance the archive experience but > > it doesn't work with MySQL directly so Mail::Box just might be what the > > doctor ordered. > > No database handles "BLOB" (Binary Large OBject) storage well. > Even high-end databases have problems in this area. IMO, this is a > bad idea. Agreed. I was thinking more along the lines of storing the message body as is, which, yes, might sometimes be base-64 encoded. Content headers, boundary string, etc. could also be stored so as to make decoding (by a web app) a cinch. You could go further and create attachment files and point to it in an url or file field. But keep the message intact, as it was received. 
That way if you want to get into after-the-fact message delivery (manual resend, or maybe a member missed a message and wants it in his/her inbox), it's not a chore. The Messages_ table that Lyris uses in its database is a good starting point if one wants to do the same kind of thing. I can dig up the specs if there is interest. - Kevin From jam at jamux.com Wed Oct 29 13:28:56 2003 From: jam at jamux.com (John A. Martin) Date: Wed Oct 29 13:29:06 2003 Subject: [Mailman-Developers] Requirements for a new archiver In-Reply-To: (Brad Knowles's message of "Wed, 29 Oct 2003 16:13:38 +0100") References: <3F9D08F9.6090209@student.umist.ac.uk> <1067285190.16085.79.camel@localhost.localdomain> Message-ID: <87d6cfdhjr.fsf@athene.jamux.com> Brad Knowles writes: > At 3:06 PM -0500 2003/10/27, Kevin McCann wrote: > >> I was thinking about using MHonarc to enhance the archive experience but >> it doesn't work with MySQL directly so Mail::Box just might be what the >> doctor ordered. > > No database handles "BLOB" (Binary Large OBject) storage > well. Even high-end databases have problems in this area. > IMO, this is a bad idea. > > Better would be to use a mailbox format that handles > simultaneous multiple access reasonably well. You can use > c-client and mbx format, or MH format, or something else > reasonably decent. Hmm... Maildirs. With just a bit of minor trickery the unique filename created to receive a message as it arrives at Mailman might be put into the saved rfc822 header (much like MTAs place a queue id), or into the message trailer if you must, and perhaps could be preserved in the filename as the message is moved/copied from one directory to another and thereby providing a unique index that can be included in the message Mailman puts on the wire. jam -------------- next part -------------- A non-text attachment was scrubbed... 
Name: not available Type: application/pgp-signature Size: 154 bytes Desc: not available Url : http://mail.python.org/pipermail/mailman-developers/attachments/20031029/78b5cf21/attachment.bin From chuqui at plaidworks.com Wed Oct 29 13:30:35 2003 From: chuqui at plaidworks.com (Chuq Von Rospach) Date: Wed Oct 29 13:32:15 2003 Subject: [Mailman-Developers] Requirements for a new archiver In-Reply-To: References: <3F9D08F9.6090209@student.umist.ac.uk> <20031027230854.A83988@gewis.win.tue.nl> <1067293730.1785.96.camel@anthem> <17914.1067319714@kanga.nu> <25767.1067446139@kanga.nu> Message-ID: On Oct 29, 2003, at 9:41 AM, Brad Knowles wrote: > In the case of a database, it is a hard requirement. A primary index > field must be guaranteed unique. There is absolutely no way around > this issue. which is why it many times makes sense to generate your own. Consider, say, identifying all messages with an MD5 hash of the message.... then use that for all of your link generating and access work. From claw at kanga.nu Wed Oct 29 13:30:49 2003 From: claw at kanga.nu (J C Lawrence) Date: Wed Oct 29 13:32:17 2003 Subject: [Mailman-Developers] Requirements for a new archiver In-Reply-To: Message from Brad Knowles of "Wed, 29 Oct 2003 18:41:20 +0100." References: <3F9D08F9.6090209@student.umist.ac.uk> <20031027230854.A83988@gewis.win.tue.nl> <1067293730.1785.96.camel@anthem> <17914.1067319714@kanga.nu> <25767.1067446139@kanga.nu> Message-ID: <13832.1067452249@kanga.nu> On Wed, 29 Oct 2003 18:41:20 +0100 Brad Knowles wrote: > At 11:48 AM -0500 2003/10/29, J C Lawrence wrote: >>> You need a guaranteed unique id to be used as a primary index field. >> "Need" is a strong word. Its very deployment and use-case sensitive. > In the case of a database, it is a hard requirement. A primary index > field must be guaranteed unique. There is absolutely no way around > this issue. Right, and I'm not arguing that. My point is two fold: 1) Using Message ID as a primary key is attractive. 
2) Message IDs are not guaranteed globally unique, but the collision rate can be manageable/acceptable in a large number of deployment cases. We don't have to guarantee key uniqueness for all messages BEFORE they are submitted to the message store. The unique property can be assumed from external sources (with all that implies) should the deployment case want that. There are tradeoffs here, and it is not clear to me that there is an instant and obvious global solution. >> "Need"? No. It is a deployment choice with easily understood >> ramifications. > Perhaps for the application, but this is a totally different ballgame > when it comes to a database. Google for "primary index field", and > hopefully you will understand. I'm neither an idiot nor a neophyte in this game. Yes, a database needs a primary unique key. That's not in debate. The questions are: Do we know the key before submission to the store? (If we don't the store operation shouldn't be asynchronous) Is the risk of discarded messages due to key collisions acceptable? (Some deployment cases consider such losses acceptable, others can guarantee uniqueness without Mailman's involvement) Rotely assuming that Mailman must guarantee key uniqueness before we hit the message store is not a given, it's a choice. Let's at least be on the same page. -- J C Lawrence ---------(*) Satan, oscillate my metallic sonatas. claw@kanga.nu He lived as a devil, eh? http://www.kanga.nu/~claw/ Evil is a name of a foeman, as I live. From brad.knowles at skynet.be Wed Oct 29 13:45:53 2003 From: brad.knowles at skynet.be (Brad Knowles) Date: Wed Oct 29 14:27:50 2003 Subject: [Mailman-Developers] Requirements for a new archiver In-Reply-To: <87d6cfdhjr.fsf@athene.jamux.com> References: <3F9D08F9.6090209@student.umist.ac.uk> <1067285190.16085.79.camel@localhost.localdomain> <87d6cfdhjr.fsf@athene.jamux.com> Message-ID: At 1:28 PM -0500 2003/10/29, John A. Martin wrote: > Hmm... Maildirs. Not. From : .
mh This is supported for compatibility with the past. This is the format used by the old mh program. mh is very inefficient; the entire directory must be read and each file stat()'d, and in order to determine the size of a message, the entire file must be read and newline conversion performed. mh is deficient in that it does not support any permanent flags or keywords; and has no means to store UIDs (because the mh "compress" command renames all the files, that's why). [ ... deletia ... ] The Maildir format used by qmail has all of the performance disadvantages of mh noted above, with the additional problem that the files are renamed in order to change their status so you end up having to rescan the directory frequently to locate the current names (particularly in a shared mailbox scenario). It doesn't scale, and it represents a support nightmare; [ ... deletia ... ] So what does this all mean? A database (such as used by Exchange) is really a much better approach if you want to move away from flat files. mx and especially Cyrus take a tentative step in that direction; mx failed mostly because it didn't go anywhere near far enough. Cyrus goes much further, and scores remarkable benefits from doing so. However, a well-designed pure database without the overhead of separate files would do even better. Of course, we all know about the database problems of Exchange, and how Exchange admins have to frequently shut everything down and clean their databases, how often they crash, how often they completely trash all e-mail for all their users, etc.... I submit that the reason for this is the combination of crappy Microsoft-style programming and the fact that no database handles BLOBs well. Even top-notch programmers have real problems with these kinds of implementations -- I am intimately familiar with the database implementation methods used in the AOL mail system, and suffice it to say that this is a really, really hairy nightmare that you do *NOT* want.
That said, storing meta-data in a real database and then using external filesystem techniques for actually accessing the data, should give you the best of both worlds -- the speed of access of the database, and the reliability and well-understood access and backup mechanisms of filesystems. -- Brad Knowles, "They that can give up essential liberty to obtain a little temporary safety deserve neither liberty nor safety." -Benjamin Franklin, Historical Review of Pennsylvania. GCS/IT d+(-) s:+(++)>: a C++(+++)$ UMBSHI++++$ P+>++ L+ !E-(---) W+++(--) N+ !w--- O- M++ V PS++(+++) PE- Y+(++) PGP>+++ t+(+++) 5++(+++) X++(+++) R+(+++) tv+(+++) b+(++++) DI+(++++) D+(++) G+(++++) e++>++++ h--- r---(+++)* z(+++) From brad.knowles at skynet.be Wed Oct 29 14:11:49 2003 From: brad.knowles at skynet.be (Brad Knowles) Date: Wed Oct 29 14:27:55 2003 Subject: [Mailman-Developers] Requirements for a new archiver In-Reply-To: <13832.1067452249@kanga.nu> References: <3F9D08F9.6090209@student.umist.ac.uk> <20031027230854.A83988@gewis.win.tue.nl> <1067293730.1785.96.camel@anthem> <17914.1067319714@kanga.nu> <25767.1067446139@kanga.nu> <13832.1067452249@kanga.nu> Message-ID: At 1:30 PM -0500 2003/10/29, J C Lawrence wrote: > Right, and I'm not arguing that. My point is two fold: > > 1) Using Message ID as a primary key is attractive. Agreed. > 2) Message IDs are not guaranteed globally unique, but the collision > rate can be manageable/acceptable in a large number of deployment > cases. Outside of a database, this may be something you can decide whether or not to live with. Within the confines of a database, this simply is not possible. The ANSI SQL specification has some hard requirements for a primary index key: 1. It cannot ever be null. 2. It must always be guaranteed unique. I'm sure there are other requirements. But these two are a good start. > We don't have to guarantee key uniqueness for all messages BEFORE they > are submitted to the message store. 
All other keys could potentially be non-unique, or null, but not the primary index key. This is why many applications have the database assign the primary index key itself on insertion into the table, so that all the necessary requirements can be met. > I'm neither an idiot or a neophyte in this game. Yes, a database needs > a primary unique key. Then you must realize that we could not possibly use message-id as the primary index key, unless this is a field that we generate ourselves in such a way that all the necessary requirements are met. > Rotely assuming that Mailman must guarantee key uniqueness before we hit > the message store is not a given, its a choice. The message-id is not necessarily the primary index key. See above. With regards to a primary index key, there simply is no choice. The message-id could continue to be one of the many secondary index keys, which is a totally different issue. > Let's at least be on the same page. Agreed. -- Brad Knowles, "They that can give up essential liberty to obtain a little temporary safety deserve neither liberty nor safety." -Benjamin Franklin, Historical Review of Pennsylvania. 
GCS/IT d+(-) s:+(++)>: a C++(+++)$ UMBSHI++++$ P+>++ L+ !E-(---) W+++(--) N+ !w--- O- M++ V PS++(+++) PE- Y+(++) PGP>+++ t+(+++) 5++(+++) X++(+++) R+(+++) tv+(+++) b+(++++) DI+(++++) D+(++) G+(++++) e++>++++ h--- r---(+++)* z(+++) From chuqui at plaidworks.com Wed Oct 29 14:38:33 2003 From: chuqui at plaidworks.com (Chuq Von Rospach) Date: Wed Oct 29 14:38:43 2003 Subject: [Mailman-Developers] Requirements for a new archiver In-Reply-To: References: <3F9D08F9.6090209@student.umist.ac.uk> <1067285190.16085.79.camel@localhost.localdomain> <87d6cfdhjr.fsf@athene.jamux.com> Message-ID: <7B342705-0A47-11D8-A2E1-0003934516A8@plaidworks.com> On Oct 29, 2003, at 10:45 AM, Brad Knowles wrote: > That said, storing meta-data in a real database and then using > external filesystem techniques for actually accessing the data, should > give you the best of both worlds -- the speed of access of the > database, and the reliability and well-understood access and backup > mechanisms of filesystems. > Hint: look at what INN did when they implmented cycbufs. Effectively, you create 1-N files, or create files as needed. Each file is N bytes long, pre-allocated on file creation. When you store messages, they're written into the file sequentially (or any other way you want. If you want to get into best fit allocations and turn this into a malloc() style heap, be my guest). Metadata to access the info is then a filename, and an lseek() pointer into the file, and # of bytes to read, plus your normal identifying info. It's fast, it's efficient use of file pointers, it avoids the worst aspects of the unix file system, and I'm amazed nobody ever thinks to use it for other purposes (or that it took that long for usenet people to discover it, I suggested a simpler variant of it back in the 80s and was told inodes are our friends...) you can even do expiration/purge/etc if you want, by moving stuff around and changing the pointers. 
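A minimal sketch of the scheme described above, assuming a single pre-allocated file and purely sequential writes. The names `DataBox`, `Pointer` and `fetch` are hypothetical and are not taken from INN or Mailman; there is no wrap-around, expiry or crash recovery here, and a real store would also have to persist the next free offset.

```python
from typing import NamedTuple


class Pointer(NamedTuple):
    """Metadata needed to find one message again."""
    filename: str
    offset: int   # where to seek to
    length: int   # number of bytes to read


class DataBox:
    """One pre-allocated file that messages are written into sequentially."""

    def __init__(self, filename: str, size: int) -> None:
        self.filename = filename
        self.size = size
        self.next_free = 0  # assumption: kept in memory only, for brevity
        # Pre-allocate the whole file on creation.
        with open(filename, "wb") as f:
            f.truncate(size)

    def store(self, message: bytes) -> Pointer:
        """Write the message at the next free offset and return its pointer."""
        if self.next_free + len(message) > self.size:
            raise OSError("data box full; the caller should create another one")
        with open(self.filename, "r+b") as f:
            f.seek(self.next_free)
            f.write(message)
        ptr = Pointer(self.filename, self.next_free, len(message))
        self.next_free += len(message)
        return ptr


def fetch(ptr: Pointer) -> bytes:
    """Read a message back with one seek and one read."""
    with open(ptr.filename, "rb") as f:
        f.seek(ptr.offset)
        return f.read(ptr.length)
```

The `Pointer` tuple is what would go into the metadata index alongside the normal identifying information; moving a message for expiry or compaction then only means rewriting its pointer.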
I've even thought of using it as the backing store for a picture library. With a nice relational database and a series of these "data boxes", I think you have store data in the best and fastest possible way... From spacey-mailman at lenin.nu Wed Oct 29 14:54:20 2003 From: spacey-mailman at lenin.nu (Peter C. Norton) Date: Wed Oct 29 14:54:25 2003 Subject: [Mailman-Developers] Requirements for a new archiver In-Reply-To: References: <3F9D08F9.6090209@student.umist.ac.uk> <1067285190.16085.79.camel@localhost.localdomain> <87d6cfdhjr.fsf@athene.jamux.com> Message-ID: <20031029195420.GI24088@lenin.nu> On Wed, Oct 29, 2003 at 07:45:53PM +0100, Brad Knowles wrote: > At 1:28 PM -0500 2003/10/29, John A. Martin wrote: > > > Hmm... Maildirs. > > Not. > > From > : [deletia] I don't know why a reasonable person would cite documentation pertaining to UW-IMAP, a server that has been a standards, security and performance bummer. Why not cite http://www.courier-mta.org/mbox-vs-maildir/? Painting "just about" every filesystem in existence with the same brush, and assuming that every filesystem works pretty much in the same way, is very misleading. Many contemporary high performance filesystem are designed explicitly for parallel access. For example, consider the SGI XFS filesystem: The free space and inodes within each AG are managed independently and in parallel so multiple processes can allocate free space throughout the file system simultaneously.[2] It took me about 6 months to write the first revision of the maildir-based Courier-IMAP server. The absence of maildir support in the UW-IMAP server is the reason I wrote it. Many people have found that it needed less memory, and was faster than UW-IMAP. Many people observed that upgrading to Courier-IMAP lowered their overall system load, and increased performance. 
Large mail clusters with a network-based fault tolerant, scalable, architecture frequently have problem deploying mbox-based mailboxes, due to many documented problems with file locking (file locking is required for mbox-based mailboxes) with network-based filesystems.[3] As referenced in [3], maildirs have no issues with NFS (the most common type of a network-based filesystem) since maildirs do not use locking. After looking around for some time, I did not find any independent benchmarks that directly measured the relative performance of mboxes and maildirs. Therefore I decided to run some actual benchmarks myself. I defined the test conditions according to UW-IMAP server's documentation. I created a test environment that stacked the deck in favor of mboxes. This was done in accordance with the claimed shortcomings of maildirs as stated in UW-IMAP server's documentation, in order to accurately measure the magnitude of the claimed problems. and at the end: The final conclusion is that -- except in some specific instances -- using maildirs will be just as fast -- and in sometimes much faster -- than mbox files, while placing less of a load on the rest of the mail system. The claims in the UW-IMAP server's documentation regarding maildir performance can be supported only in certain, specific, very narrowly-defined conditions. There is no simple answer on which mail storage format is better. A lot depends on many variables that vary widely in different situations. Besides the raw benchmarks shown above, other factors include the mail server software being used, what kind of storage is being used, and the available network bandwidth. The final answer depends on all of the above. [flame-bait deleted] > A database (such as used by Exchange) is really a much better > approach if you want to move away from flat files. mx and especially > Cyrus take a tenative step in that direction; mx failed mostly because > it didn't go anywhere near far enough. 
Cyrus goes much further, and > scores remarkable benefits from doing so. > > However, a well-designed pure database without the overhead of > separate files would do even better. It always confounds me that people will go for database voodoo and deride filesystems when a filesystem is a highly specialised database in and of itself. Putting things that are in a filesystem into a database offers the power and flexibility of querying, but certainly should not be done for the sake of speed (assuming the filesystem-based implementation meets whatever other requirements are present). > Of course, we all know about the database problems of Exchange, > and how Exchange admins have to frequently shut everything down and > clean their databases, how often they crash, how often they > completely trash all e-mail for all their users, etc.... Which is a good lesson about databases: because of their flexibility, they cannot be QA'd to cope with all of their uses without being put into production and losing data and being subsequently fixed. Filesystems, which have a more narrowly-defined scope, tend to suffer this less. That's why database logs that live on filesystems are used for data recovery when a database eats itself. > I submit that the reason for this is the combination of crappy > Microsoft-style programming and the fact that no database handles > BLOBs well. Even top-notch programmers have real problems with these > kinds of implementations -- I am intimately familiar with the > database implementation methods used in the AOL mail system, and > suffice it to say that this is a really, really hairy nightmare that > you do *NOT* want. Databases aren't meant to be storage for abstract binary data. They're meant to be a searchable index of data of types they understand. Assuming I had a clean slate to start a database project for a mail store, personally I'd much rather prototype it in something like PostgreSQL where I could add data types to deal with email.
I could then make header types, text types, mime types classes, etc. Then I could test to see if it was a good idea to implement it. > That said, storing meta-data in a real database and then using > external filesystem techniques for actually accessing the data, > should give you the best of both worlds -- the speed of access of the > database, and the reliability and well-understood access and backup > mechanisms of filesystems. I think using a standard sql database for doing mail operations is asking for trouble. Standard databases don't know how to parse rfc822/2822 headers and that means that you've got to either write a whole lot of stored procedures in a clunky query language (or java!?!?!) and then maintain it, or you've got to do it all in the imap/pop3/whatever server which means a whole lot of yammering traffic between the database and the I/P/W server all the time, which == slow. -Peter -- The 5 year plan: In five years we'll make up another plan. Or just re-use this one. From brad.knowles at skynet.be Wed Oct 29 15:15:13 2003 From: brad.knowles at skynet.be (Brad Knowles) Date: Wed Oct 29 15:27:19 2003 Subject: [Mailman-Developers] Requirements for a new archiver In-Reply-To: <7B342705-0A47-11D8-A2E1-0003934516A8@plaidworks.com> References: <3F9D08F9.6090209@student.umist.ac.uk> <1067285190.16085.79.camel@localhost.localdomain> <87d6cfdhjr.fsf@athene.jamux.com> <7B342705-0A47-11D8-A2E1-0003934516A8@plaidworks.com> Message-ID: At 11:38 AM -0800 2003/10/29, Chuq Von Rospach wrote: > Hint: look at what INN did when they implmented cycbufs. I did. See . -- Brad Knowles, "They that can give up essential liberty to obtain a little temporary safety deserve neither liberty nor safety." -Benjamin Franklin, Historical Review of Pennsylvania. 
GCS/IT d+(-) s:+(++)>: a C++(+++)$ UMBSHI++++$ P+>++ L+ !E-(---) W+++(--) N+ !w--- O- M++ V PS++(+++) PE- Y+(++) PGP>+++ t+(+++) 5++(+++) X++(+++) R+(+++) tv+(+++) b+(++++) DI+(++++) D+(++) G+(++++) e++>++++ h--- r---(+++)* z(+++) From brad.knowles at skynet.be Wed Oct 29 15:25:53 2003 From: brad.knowles at skynet.be (Brad Knowles) Date: Wed Oct 29 15:27:25 2003 Subject: [Mailman-Developers] Requirements for a new archiver In-Reply-To: <20031029195420.GI24088@lenin.nu> References: <3F9D08F9.6090209@student.umist.ac.uk> <1067285190.16085.79.camel@localhost.localdomain> <87d6cfdhjr.fsf@athene.jamux.com> <20031029195420.GI24088@lenin.nu> Message-ID: At 11:54 AM -0800 2003/10/29, Peter C. Norton wrote: > It always confounds me that people will go for database voodoo and > deride filesystems when a filesystem is a highly specialised database > in and of itself. I am aware of that. I was aware of that when I first gave my invited talk entitled "Design and Implementation of Highly Scalable E-mail Systems", which you can find at . Note that Eric Allman (author of the original Ingres database, among many other things) and Kirk McKusick (author of the Berkeley Fast File System) were in the audience. I did not embarrass myself. > Databases aren't meant to be storage for abstract binary data. > They're meant to be a searchable index of data of types they > understand. Correct. And despite all claims to the contrary from the vendors, no database properly "understands" binary large objects, nor do they give you another datatype they do actually understand that would be suitable for the storage of e-mail message bodies. > Assuming I had a clean slate to start a database project for a mail > store, personally I'd much rather prototype it in something like > postgresql where I could add data types to deal with email. I could > then make header types, text types, mime types classes, etc. Then I > could test to see if it was a good idea to implement it. 
IMO, that would be an exercise in futility. We've been down this road a million times before. We don't need to go down it again to know that the result is not likely to be successful, especially when we have alternatives that are proven to work well -- we store the message meta-data in the database, and then the message bodies in an separate message store akin to INN timecaf/timehash "heaps" (see ). > I think using a standard sql database for doing mail operations is > asking for trouble. Standard databases don't know how to parse > rfc822/2822 headers and that means that you've got to either write a > whole lot of stored procedures in a clunky query language (or > java!?!?!) and then maintain it, or you've got to do it all in the > imap/pop3/whatever server which means a whole lot of yammering traffic > between the database and the I/P/W server all the time, which == slow. You don't ask the database to understand or parse RFC2822 headers or messages. That's up to your application. You just store data using the formats known to the database, and the message bodies according to the methods above. -- Brad Knowles, "They that can give up essential liberty to obtain a little temporary safety deserve neither liberty nor safety." -Benjamin Franklin, Historical Review of Pennsylvania. GCS/IT d+(-) s:+(++)>: a C++(+++)$ UMBSHI++++$ P+>++ L+ !E-(---) W+++(--) N+ !w--- O- M++ V PS++(+++) PE- Y+(++) PGP>+++ t+(+++) 5++(+++) X++(+++) R+(+++) tv+(+++) b+(++++) DI+(++++) D+(++) G+(++++) e++>++++ h--- r---(+++)* z(+++) From spacey-mailman at lenin.nu Wed Oct 29 15:37:07 2003 From: spacey-mailman at lenin.nu (Peter C. 
Norton) Date: Wed Oct 29 15:37:10 2003 Subject: [Mailman-Developers] Requirements for a new archiver In-Reply-To: References: <3F9D08F9.6090209@student.umist.ac.uk> <1067285190.16085.79.camel@localhost.localdomain> <87d6cfdhjr.fsf@athene.jamux.com> <20031029195420.GI24088@lenin.nu> Message-ID: <20031029203707.GK24088@lenin.nu> On Wed, Oct 29, 2003 at 09:25:53PM +0100, Brad Knowles wrote: > > Assuming I had a clean slate to start a database project for a mail > > store, personally I'd much rather prototype it in something like > > postgresql where I could add data types to deal with email. I could > > then make header types, text types, mime types classes, etc. Then I > > could test to see if it was a good idea to implement it. > > IMO, that would be an exercise in futility. We've been down this > road a million times before. We don't need to go down it again to > know that the result is not likely to be successful, especially when > we have alternatives that are proven to work well -- we store the > message meta-data in the database, and then the message bodies in an > separate message store akin to INN timecaf/timehash "heaps" (see > ). It seems like you're only partially agreeing/disagreeing with me (optimist/pessamist). Disagreeing: you're saying that using datatypes in the database which are appropriate to the kind of data being stored (mail messages) is an excercise in futility. But, agreeing: that storing these in a database in another way is OK. I don't get why you'd just want to store these as text when you have databases that can be made more suitable to the problem. > > I think using a standard sql database for doing mail operations is > > asking for trouble. Standard databases don't know how to parse > > rfc822/2822 headers and that means that you've got to either write a > > whole lot of stored procedures in a clunky query language (or > > java!?!?!) 
and then maintain it, or you've got to do it all in the > > imap/pop3/whatever server which means a whole lot of yammering traffic > > between the database and the I/P/W server all the time, which == slow. > > You don't ask the database to understand or parse RFC2822 headers > or messages. That's up to your application. You just store data > using the formats known to the database, and the message bodies > according to the methods above. So all the parsing happens in the database client side. Which is slow. -Peter -- The 5 year plan: In five years we'll make up another plan. Or just re-use this one. From claw at kanga.nu Wed Oct 29 15:41:13 2003 From: claw at kanga.nu (J C Lawrence) Date: Wed Oct 29 15:41:18 2003 Subject: [Mailman-Developers] Requirements for a new archiver In-Reply-To: Message from Chuq Von Rospach of "Wed, 29 Oct 2003 11:38:33 PST." <7B342705-0A47-11D8-A2E1-0003934516A8@plaidworks.com> References: <3F9D08F9.6090209@student.umist.ac.uk> <1067285190.16085.79.camel@localhost.localdomain> <87d6cfdhjr.fsf@athene.jamux.com> <7B342705-0A47-11D8-A2E1-0003934516A8@plaidworks.com> Message-ID: <3147.1067460073@kanga.nu> On Wed, 29 Oct 2003 11:38:33 -0800 Chuq Von Rospach wrote: > Hint: look at what INN did when they implmented cycbufs. Aye, its a cute system. > Effectively, you create 1-N files, or create files as needed. Each > file is N bytes long, pre-allocated on file creation. When you store > messages, they're written into the file sequentially (or any other way > you want. If you want to get into best fit allocations and turn this > into a malloc() style heap, be my guest). > Metadata to access the info is then a filename, and an lseek() pointer > into the file, and # of bytes to read, plus your normal identifying > info. 
It's fast, it's efficient use of file pointers, it avoids the > worst aspects of the unix file system, and I'm amazed nobody ever > thinks to use it for other purposes (or that it took that long for > usenet people to discover it, I suggested a simpler variant of it back > in the 80s and was told inodes are our friends...) Small caveat: Some modern fileystems make operating on the one-file-per-message stores extremely efficient. Admittedly they aren't in wide cross-platform deployment, but the filesystems and file op behaviour of today and yesteryear are not quite the same. > I've even thought of using it as the backing store for a picture > library. With a nice relational database and a series of these "data > boxes", I think you have store data in the best and fastest possible > way... Some years back I talked to Mike Belshe (used to be at Remarq) about their storage techniques (I caught him shortly after Critical Path bought Remarq). Keying off other LISA papers they segmented their storage space by object size, customising and configuring each segment to suit (things like RAID strip size, number of spindles, FS tuning parameters, etc). He asserted that the rewards were very significant. However, these are very large archive problems and are a bit outside of Mailman's home turf. -- J C Lawrence ---------(*) Satan, oscillate my metallic sonatas. claw@kanga.nu He lived as a devil, eh? http://www.kanga.nu/~claw/ Evil is a name of a foeman, as I live. From davidb at chelsea.net Wed Oct 29 16:05:43 2003 From: davidb at chelsea.net (David Birnbaum) Date: Wed Oct 29 16:09:07 2003 Subject: [Mailman-Developers] Requirements for a new archiver In-Reply-To: <3147.1067460073@kanga.nu> References: <3F9D08F9.6090209@student.umist.ac.uk> <1067285190.16085.79.camel@localhost.localdomain> <87d6cfdhjr.fsf@athene.jamux.com> <7B342705-0A47-11D8-A2E1-0003934516A8@plaidworks.com> <3147.1067460073@kanga.nu> Message-ID: Howdy, A few comments from the peanut gallery: 1. 
grep, perl, awk, vim, emacs, cp, mv, tar, etc. don't work too well on SQL databases, but are nice for admins who want quick and dirty searching, move mailman to a new machine, or need to poke around. 2. third-party add-ons make it that much harder to install. If I have to set up a Mysql or Postgres database to use Mailman, it's a step that will put off people who don't already have it going. Cheers, David. ----- On Wed, 29 Oct 2003, J C Lawrence wrote: > > On Wed, 29 Oct 2003 11:38:33 -0800 > Chuq Von Rospach wrote: > > > Hint: look at what INN did when they implmented cycbufs. > > Aye, its a cute system. > > > Effectively, you create 1-N files, or create files as needed. Each > > file is N bytes long, pre-allocated on file creation. When you store > > messages, they're written into the file sequentially (or any other way > > you want. If you want to get into best fit allocations and turn this > > into a malloc() style heap, be my guest). > > > Metadata to access the info is then a filename, and an lseek() pointer > > into the file, and # of bytes to read, plus your normal identifying > > info. It's fast, it's efficient use of file pointers, it avoids the > > worst aspects of the unix file system, and I'm amazed nobody ever > > thinks to use it for other purposes (or that it took that long for > > usenet people to discover it, I suggested a simpler variant of it back > > in the 80s and was told inodes are our friends...) > > Small caveat: Some modern fileystems make operating on the > one-file-per-message stores extremely efficient. Admittedly they aren't > in wide cross-platform deployment, but the filesystems and file op > behaviour of today and yesteryear are not quite the same. > > > I've even thought of using it as the backing store for a picture > > library. With a nice relational database and a series of these "data > > boxes", I think you have store data in the best and fastest possible > > way... 
> > Some years back I talked to Mike Belshe (used to be at Remarq) about > their storage techniques (I caught him shortly after Critical Path > bought Remarq). Keying off other LISA papers they segmented their > storage space by object size, customising and configuring each segment > to suit (things like RAID strip size, number of spindles, FS tuning > parameters, etc). He asserted that the rewards were very significant. > > However, these are very large archive problems and are a bit outside of > Mailman's home turf. > > -- > J C Lawrence > ---------(*) Satan, oscillate my metallic sonatas. > claw@kanga.nu He lived as a devil, eh? > http://www.kanga.nu/~claw/ Evil is a name of a foeman, as I live. > > _______________________________________________ > Mailman-Developers mailing list > Mailman-Developers@python.org > http://mail.python.org/mailman/listinfo/mailman-developers > > From chuqui at plaidworks.com Wed Oct 29 16:11:01 2003 From: chuqui at plaidworks.com (Chuq Von Rospach) Date: Wed Oct 29 16:11:09 2003 Subject: [Mailman-Developers] Requirements for a new archiver In-Reply-To: References: <3F9D08F9.6090209@student.umist.ac.uk> <1067285190.16085.79.camel@localhost.localdomain> <87d6cfdhjr.fsf@athene.jamux.com> <7B342705-0A47-11D8-A2E1-0003934516A8@plaidworks.com> <3147.1067460073@kanga.nu> Message-ID: <664305B3-0A54-11D8-BABB-0003934516A8@plaidworks.com> On Oct 29, 2003, at 1:05 PM, David Birnbaum wrote: > 1. grep, perl, awk, vim, emacs, cp, mv, tar, etc. don't work too well > on > SQL databases, but are nice for admins who want quick and dirty > searching, move mailman to a new machine, or need to poke around. and given how many admins never see anything but the web site, that's a nice thing, but far from an important one. > 2. third-party add-ons make it that much harder to install. If I > have to > set up a Mysql or Postgres database to use Mailman, it's a step > that > will put off people who don't already have it going. 
> actually, if you do it right, it's much easier -- because when you build in those tools, you build in standardized interfaces that third party add-ons can access, instead of the current case, which are code hacks that break every time Barry burps at the CVS server... From brad.knowles at skynet.be Wed Oct 29 16:14:52 2003 From: brad.knowles at skynet.be (Brad Knowles) Date: Wed Oct 29 16:16:40 2003 Subject: [Mailman-Developers] Requirements for a new archiver In-Reply-To: <20031029203707.GK24088@lenin.nu> References: <3F9D08F9.6090209@student.umist.ac.uk> <1067285190.16085.79.camel@localhost.localdomain> <87d6cfdhjr.fsf@athene.jamux.com> <20031029195420.GI24088@lenin.nu> <20031029203707.GK24088@lenin.nu> Message-ID: At 12:37 PM -0800 2003/10/29, Peter C. Norton wrote: > It seems like you're only partially agreeing/disagreeing with me > (optimist/pessamist). Disagreeing: you're saying that using datatypes > in the database which are appropriate to the kind of data being stored > (mail messages) is an excercise in futility. Not quite. I believe that there are no databases in existence which have data types that are actually appropriate for the storage of message bodies. > But, agreeing: that > storing these in a database in another way is OK. Not quite. Store meta-data, yes. The entire message, no. Store things like who the message is from, who the message is addressed to, the date, the message-id as it was found in the headers, etc.... Basically, store just about everything in the message headers that a client would be likely to ask about. That's all well and good. But when it comes to storing the message body itself, it should be stored in wire format (i.e., precisely as it came in), in the filesystem. Then pointers to the location in the filesystem should be put into the database. 
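A minimal sketch of the split described above, assuming SQLite for the metadata and one file per message for the wire-format copy. The table layout and the names `open_index` and `store_message` are hypothetical; nothing here is Mailman's actual archiver code. The database assigns the primary key itself, and the Message-ID is kept only as an ordinary indexed column, so it may repeat or be missing.

```python
import hashlib
import os
import sqlite3
from email import message_from_bytes

SCHEMA = """
CREATE TABLE IF NOT EXISTS messages (
    id         INTEGER PRIMARY KEY,  -- assigned by the database on insert
    message_id TEXT,                 -- as found in the headers; may repeat or be NULL
    sender     TEXT,
    recipient  TEXT,
    date       TEXT,
    path       TEXT NOT NULL         -- where the wire-format message lives
);
CREATE INDEX IF NOT EXISTS idx_messages_message_id ON messages (message_id);
"""


def open_index(db_path):
    """Open (or create) the metadata index."""
    conn = sqlite3.connect(db_path)
    conn.executescript(SCHEMA)
    return conn


def _header(msg, name):
    """Return a header as plain text, or None if it is absent."""
    value = msg[name]
    return None if value is None else str(value)


def store_message(conn, store_dir, raw):
    """Write the message exactly as received, then index its headers."""
    os.makedirs(store_dir, exist_ok=True)
    # Assumption: name files by content hash; identical messages share a file.
    path = os.path.join(store_dir, hashlib.sha256(raw).hexdigest() + ".eml")
    with open(path, "wb") as f:
        f.write(raw)
    # Header parsing happens in the application, not in the database.
    msg = message_from_bytes(raw)
    cur = conn.execute(
        "INSERT INTO messages (message_id, sender, recipient, date, path) "
        "VALUES (?, ?, ?, ?, ?)",
        (_header(msg, "Message-ID"), _header(msg, "From"),
         _header(msg, "To"), _header(msg, "Date"), path),
    )
    conn.commit()
    return cur.lastrowid
```

A lookup then queries the index and follows `path` to the file; the database never holds the message body itself.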
One key factor here is that all of the information in the database should be able to be re-created from the message bodies alone, if there should happen to be a catastrophic system crash. The sole purpose of the database is to speed up access to the messages and the message content -- indeed, to speed it up enough so that randomly accessing most any piece of information about any message from any sender to any recipient in any mailbox should become something feasible to contemplate. The sole purpose of the database is to make the difficult and slow (on the large scale) quick and easy, and to make the things that would be totally impossible (on any reasonable scale) at least something that can now be considered. > I don't get why > you'd just want to store these as text when you have databases that > can be made more suitable to the problem. I don't believe that there are any databases in existence that "... can be made more suitable to the problem." > So all the parsing happens in the database client side. Which is slow. Yup. I don't see any way around that. -- Brad Knowles, "They that can give up essential liberty to obtain a little temporary safety deserve neither liberty nor safety." -Benjamin Franklin, Historical Review of Pennsylvania. GCS/IT d+(-) s:+(++)>: a C++(+++)$ UMBSHI++++$ P+>++ L+ !E-(---) W+++(--) N+ !w--- O- M++ V PS++(+++) PE- Y+(++) PGP>+++ t+(+++) 5++(+++) X++(+++) R+(+++) tv+(+++) b+(++++) DI+(++++) D+(++) G+(++++) e++>++++ h--- r---(+++)* z(+++) From claw at kanga.nu Wed Oct 29 16:54:56 2003 From: claw at kanga.nu (J C Lawrence) Date: Wed Oct 29 16:55:07 2003 Subject: [Mailman-Developers] Requirements for a new archiver In-Reply-To: Message from Chuq Von Rospach of "Wed, 29 Oct 2003 13:11:01 PST." 
<664305B3-0A54-11D8-BABB-0003934516A8@plaidworks.com> References: <3F9D08F9.6090209@student.umist.ac.uk> <1067285190.16085.79.camel@localhost.localdomain> <87d6cfdhjr.fsf@athene.jamux.com> <7B342705-0A47-11D8-A2E1-0003934516A8@plaidworks.com> <3147.1067460073@kanga.nu> <664305B3-0A54-11D8-BABB-0003934516A8@plaidworks.com> Message-ID: <24751.1067464496@kanga.nu> On Wed, 29 Oct 2003 13:11:01 -0800 Chuq Von Rospach wrote: > On Oct 29, 2003, at 1:05 PM, David Birnbaum wrote: >> 2. third-party add-ons make it that much harder to install. If I >> have to set up a Mysql or Postgres database to use Mailman, it's a >> step that will put off people who don't already have it going. > actually, if you do it right, it's much easier -- because when you > build in those tools, you build in standardized interfaces that third > party add-ons can access, instead of the current case, which are code > hacks that break every time Barry burps at the CVS server... Aye, picking the right interface abstractions is key. There's also a disjoint between the novice SysAdm case who loves the fact of Mailman's all-in-one service, and the more meaty chap who integrates what he needs to. Much of Mailman's appeal at the low end is its all-in-one simple-to-install nature. (Well, ignoring thee GID FAQ...) Mailman v2.1 has a plugin layer for the membership roster. Its not a fully mature interface, but there are LDAP and SQL adaptors in the wild. At some point those adaptors will move into the Mailman core. If we move the archiving components (storage, presentation, index) behind plugin interfaces as well there's a reasonable opportunity for similar third parties to build adaptor layers which then also move into the Mailman core. Oh yeah, and just to keep Nigel Metheringham hopping: Mailman just doesn't have enough configuration options. -- J C Lawrence ---------(*) Satan, oscillate my metallic sonatas. claw@kanga.nu He lived as a devil, eh? 
http://www.kanga.nu/~claw/ Evil is a name of a foeman, as I live. From spacey-mailman at lenin.nu Wed Oct 29 16:59:06 2003 From: spacey-mailman at lenin.nu (Peter C. Norton) Date: Wed Oct 29 16:59:10 2003 Subject: [Mailman-Developers] Requirements for a new archiver In-Reply-To: References: <3F9D08F9.6090209@student.umist.ac.uk> <1067285190.16085.79.camel@localhost.localdomain> <87d6cfdhjr.fsf@athene.jamux.com> <20031029195420.GI24088@lenin.nu> <20031029203707.GK24088@lenin.nu> Message-ID: <20031029215906.GM24088@lenin.nu> On Wed, Oct 29, 2003 at 10:14:52PM +0100, Brad Knowles wrote: > > I don't believe that there are any databases in existence that > "... can be made more suitable to the problem." > In theory you can add data types to PostgreSQL. Not that I've done it myself, but it's been done. -Peter -- The 5 year plan: In five years we'll make up another plan. Or just re-use this one. From claw at kanga.nu Wed Oct 29 17:10:40 2003 From: claw at kanga.nu (J C Lawrence) Date: Wed Oct 29 17:11:28 2003 Subject: [Mailman-Developers] Requirements for a new archiver In-Reply-To: Message from "Peter C. Norton" of "Wed, 29 Oct 2003 13:59:06 PST." <20031029215906.GM24088@lenin.nu> References: <3F9D08F9.6090209@student.umist.ac.uk> <1067285190.16085.79.camel@localhost.localdomain> <87d6cfdhjr.fsf@athene.jamux.com> <20031029195420.GI24088@lenin.nu> <20031029203707.GK24088@lenin.nu> <20031029215906.GM24088@lenin.nu> Message-ID: <11079.1067465440@kanga.nu> On Wed, 29 Oct 2003 13:59:06 -0800 Peter C Norton wrote: > On Wed, Oct 29, 2003 at 10:14:52PM +0100, Brad Knowles wrote: >> I don't believe that there are any databases in existence that >> "... can be made more suitable to the problem." > In theory you can add data types to PostgreSQL. Not that I've done it > myself, but it's been done. True, but that doesn't answer the question of whether an RDBMS is a good storage tool for messages.
I spent a couple months of spare time last year building an archiving system I liked atop PostgreSQL using fully decomposed SQL structures for all the message bits. It was not a pretty exercise, and the results were worse. Brad makes excellent points in his comments on poor BLOB support, the value of DBs for meta-data, and disaster recovery ease. -- J C Lawrence ---------(*) Satan, oscillate my metallic sonatas. claw@kanga.nu He lived as a devil, eh? http://www.kanga.nu/~claw/ Evil is a name of a foeman, as I live. From claw at kanga.nu Wed Oct 29 17:33:19 2003 From: claw at kanga.nu (J C Lawrence) Date: Wed Oct 29 17:33:30 2003 Subject: [Mailman-Developers] Requirements for a new archiver In-Reply-To: Message from Brad Knowles of "Wed, 29 Oct 2003 20:11:49 +0100." References: <3F9D08F9.6090209@student.umist.ac.uk> <20031027230854.A83988@gewis.win.tue.nl> <1067293730.1785.96.camel@anthem> <17914.1067319714@kanga.nu> <25767.1067446139@kanga.nu> <13832.1067452249@kanga.nu> Message-ID: <5479.1067466799@kanga.nu> On Wed, 29 Oct 2003 20:11:49 +0100 Brad Knowles wrote: > At 1:30 PM -0500 2003/10/29, J C Lawrence wrote: >> 2) Message IDs are not guaranteed globally unique, but the collision >> rate can be manageable/acceptable in a large number of deployment >> cases. > Outside of a database, this may be something you can decide whether or > not to live with. Within the confines of a database, this simply is > not possible. Of course, and that's the point. We are in violent agreement. > The ANSI SQL specification has some hard requirements for a primary > index key: I know, but that's not what I'm asserting. I'll also ignore the DB types which don't require primary keys of any form, as that's essentially what we have now and we're assuming an indexed store instead. >> We don't have to guarantee key uniqueness for all messages BEFORE >> they are submitted to the message store. > All other keys could potentially be non-unique, or null, but not > the primary index key.
Ahh, I think I see the disjoint. We're using "key" in two contexts without distinguishing between them: 1) The property of a message which identifies that message with a high probability of uniqueness. This can be a Message ID, MD5SUM, whatever, but it is not guaranteed unique, it merely is unique most of the time for large definitions of "most". 2) The primary key as used in an indexed DB or other store which is guaranteed unique for all cases. Between the two there's a conflict. One requires perfect uniqueness. The other delivers merely a good Best Effort. The assertion is that we don't always have to solve that mismatch. We can elect to live with the collisions. > This is why many applications have the database assign the primary > index key itself on insertion into the table, so that all the > necessary requirements can be met. Sure, except that doing that in our case requires that storage be a synchronous operation (otherwise we don't know the key at rewrite/delivery time). That would be a significant change from the current model and rather unfriendly to a wide range of deployment cases. Keeping the storage procedure asynchronous with an a-priori key (for whatever guarantee of uniqueness) makes for a more interesting system. >> I'm neither an idiot or a neophyte in this game. Yes, a database >> needs a primary unique key. > Then you must realize that we could not possibly use message-id as the > primary index key, unless this is a field that we generate ourselves > in such a way that all the necessary requirements are met. No, I don't realise that because it is false. We can use Message IDs as the primary key right now, today. In fact, I am, right now, this minute, today. You are assuming that every message submitted to the store must be accepted by the store. That is an assumption that hasn't been defined as a requirement and which some evidence suggests isn't a hard requirement. A very small percentage of the messages I submit to my store don't make it.
They have duplicate Message IDs. They run through Mailman just fine. They never reach my list archives. I know, expect, accept this. The primary key has to be unique for every message IN THE STORE. Accepted. That does not dictate that the primary key for every message SUBMITTED to the store has to be unique (note that key assignment is occurring before collision check), or that the store has to ACCEPT every message which is submitted to it. Guaranteeing perfect uniqueness of the keys prior to submission to the store is fragile and expensive. It is tempting to do some form of very good approximation (cf. Chuq's MD5SUM). Without perfect synchrony with the store's keys, if we calculate keys prior to insertion, or merely accept the keys that are given us in the form of Message IDs, we're going to get occasional collisions. The question is how to handle messages whose a priori assigned keys collide with keys already in the store. We can handle the collision case in several ways: 1) Ignore it and discard messages bearing colliding keys. 2) Best Effort attempt to guarantee uniqueness within a window, with collisions outside the window discarded. 3) Fully guarantee uniqueness. The first is easy. The second is fairly easy. The third isn't trivial. In all three cases the population of key values in the store remains unique. It's just that the population of keys submitted to the store may or may not be unique. Lossage at the insertion layer can be acceptable. >> Let's at least be on the same page. > Agreed. Cool. -- J C Lawrence ---------(*) Satan, oscillate my metallic sonatas. claw@kanga.nu He lived as a devil, eh? http://www.kanga.nu/~claw/ Evil is a name of a foeman, as I live. From spacey-mailman at lenin.nu Wed Oct 29 17:28:58 2003 From: spacey-mailman at lenin.nu (Peter C.
Norton) Date: Wed Oct 29 18:52:25 2003 Subject: [Mailman-Developers] Requirements for a new archiver In-Reply-To: <11079.1067465440@kanga.nu> References: <1067285190.16085.79.camel@localhost.localdomain> <87d6cfdhjr.fsf@athene.jamux.com> <20031029195420.GI24088@lenin.nu> <20031029203707.GK24088@lenin.nu> <20031029215906.GM24088@lenin.nu> <11079.1067465440@kanga.nu> Message-ID: <20031029222858.GN24088@lenin.nu> On Wed, Oct 29, 2003 at 05:10:40PM -0500, J C Lawrence wrote: > > True, but that doesn't answer the question of whether an RDBMS is a good > storage tool for messages. I spent a couple months of spare time last > year building an archiving system I liked atop PostgresQL using fully > decomposed SQL structures for all the message bits. It was not a pretty > exercise, and the results were worse. Brad makes excellent points in > his comments on poor BLOB support, the value if DBs for meta-data, and > disaster recovery ease. I may not have made it clear, but I'm focusing on the metadata. Once you've parsed rfc822/2822, then it may become easier to have things in the database that can manipulate those types. I.e. to do be able to do simple searches for a property of given arbitrary headers (w/o having to have a database schema that consists of a few known headers and "others" which you then have to treat as a blob or as text). -Peter -- The 5 year plan: In five years we'll make up another plan. Or just re-use this one. From chuqui at plaidworks.com Wed Oct 29 19:12:50 2003 From: chuqui at plaidworks.com (Chuq Von Rospach) Date: Wed Oct 29 19:12:56 2003 Subject: [Mailman-Developers] Requirements for a new archiver In-Reply-To: <20031029222858.GN24088@lenin.nu> References: <1067285190.16085.79.camel@localhost.localdomain> <87d6cfdhjr.fsf@athene.jamux.com> <20031029195420.GI24088@lenin.nu> <20031029203707.GK24088@lenin.nu> <20031029215906.GM24088@lenin.nu> <11079.1067465440@kanga.nu> <20031029222858.GN24088@lenin.nu> Message-ID: On Oct 29, 2003, at 2:28 PM, Peter C. 
Norton wrote: > I may not have made it clear, but I'm focusing on the metadata. Once > you've parsed rfc822/2822, then it may become easier to have things in > the database that can manipulate those types. I.e. to do be able to > do simple searches for a property of given arbitrary headers (w/o > having to have a database schema that consists of a few known headers > and "others" which you then have to treat as a blob or as text). my only real worry is that from what I've seen, 99.99% of the time, the user is going to want content searches. header stuff is fine, but of really low priority in the scheme of things (necessary to put useful things together, meaningless if you can't content/context search in fulltext). that's why I'm leaning, blob issues or no, towards full-text storage in MySQL 4. Because if you can't easily chop up the message body content and find the messages you want to deal with, elegant storage of the headers is irrelevant... I think you need that, too. But until you get a reasonable context search for the message body, designing the rest is silly. And it seems to me there are few better methods than dumping the text into MySQL and letting it do the work. Compromises, tradeoffs and etc notwithstanding... From chuqui at plaidworks.com Wed Oct 29 19:40:53 2003 From: chuqui at plaidworks.com (Chuq Von Rospach) Date: Wed Oct 29 19:40:59 2003 Subject: [Mailman-Developers] Requirements for a new archiver In-Reply-To: References: <1067285190.16085.79.camel@localhost.localdomain> <87d6cfdhjr.fsf@athene.jamux.com> <20031029195420.GI24088@lenin.nu> <20031029203707.GK24088@lenin.nu> <20031029215906.GM24088@lenin.nu> <11079.1067465440@kanga.nu> <20031029222858.GN24088@lenin.nu> Message-ID: by the way, this statement is in conflict with my previous statement of "use cycbufs". I'm fully aware of that conflict, too. resolving it will be one of the big challenges.
On Oct 29, 2003, at 4:12 PM, Chuq Von Rospach wrote: > that's why I'm leaning, blob issues or no, towards full-text storage > in MySQL 4. Because if you can't easily chop up the message body > content and find the messages you want to deal with, elegant storage > of the headers is irrelevant... From claw at kanga.nu Wed Oct 29 21:16:20 2003 From: claw at kanga.nu (J C Lawrence) Date: Wed Oct 29 21:16:27 2003 Subject: [Mailman-Developers] Requirements for a new archiver In-Reply-To: Message from Chuq Von Rospach of "Wed, 29 Oct 2003 16:12:50 PST." References: <1067285190.16085.79.camel@localhost.localdomain> <87d6cfdhjr.fsf@athene.jamux.com> <20031029195420.GI24088@lenin.nu> <20031029203707.GK24088@lenin.nu> <20031029215906.GM24088@lenin.nu> <11079.1067465440@kanga.nu> <20031029222858.GN24088@lenin.nu> Message-ID: <27947.1067480180@kanga.nu> On Wed, 29 Oct 2003 16:12:50 -0800 Chuq Von Rospach wrote: > On Oct 29, 2003, at 2:28 PM, Peter C. Norton wrote: >> I may not have made it clear, but I'm focusing on the metadata. Once >> you've parsed rfc822/2822, then it may become easier to have things >> in the database that can manipulate those types. I.e. to do be able >> to do simple searches for a property of given arbitrary headers (w/o >> having to have a database schema that consists of a few known headers >> and "others" which you then have to treat as a blob or as text). > my only real worry is that from what I've seen, 99.99% of the time, > the user is going to want content searches. header stuff is fine, but > of really low priority in the scheme of things (necessary to put > useful things together, meaningless if you can't content/context > search in fulltext). I see two needs, for significantly different populations. The first wants a browsing interface keyed and indexed by date, thread, and author. The second wants full text search with rapid location and retrieval of matching messages.
Often a single user will move between the access methods, reading by thread, bouncing over to a search, then reading all an author has written that match, then searching again, etc. As such two distinct sets of indexes seem called for: full text and message meta-data. > that's why I'm leaning, blob issues or no, towards full-text storage > in MySQL 4. Because if you can't easily chop up the message body > content and find the messages you want to deal with, elegant storage > of the headers is irrelevant... True. However, this seems to conflate two distinct problems. If you're going to do unindexed searches then this makes sense; however, except for minimal cases, that's an uninteresting space. It scales like crap and has an even worse feature set. It is more interesting to split storage and indexing into distinct solution designs, and to build or pick something tailored for that smaller problem. That way you don't do full text searching, you do full text indexing and then search the indexes. > I think you need that, too. But until you get a reasonable context > search for the message body, designing the rest is silly. Is searching message bodies really interesting, or is building indexes of message bodies such that you can later search those indexes the actually interesting point? > And it seems to me there are few better methods than dumping the text > into MySQL and letting it do the work. Compromises, tradeoffs and etc > notwithstanding... How does MySQL help you in building language-sensitive rapid response indexes of large text blobs? -- J C Lawrence ---------(*) Satan, oscillate my metallic sonatas. claw@kanga.nu He lived as a devil, eh? http://www.kanga.nu/~claw/ Evil is a name of a foeman, as I live.
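An illustrative sketch, not part of the thread: the "index first, then search the indexes" split described in the message above could look roughly like the following in Python. The names `tokenize`, `TextIndex`, `add`, and `search` are hypothetical, the tokenizer is a naive lowercase word split rather than anything language-sensitive, and the index lives in memory only. It is a toy under those assumptions, not Mailman code.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Naive tokenizer (an assumption): lowercase runs of letters and digits."""
    return re.findall(r"[a-z0-9]+", text.lower())


class TextIndex:
    """Toy inverted index mapping each token to the set of message keys using it."""

    def __init__(self):
        self.postings = defaultdict(set)

    def add(self, key, body):
        """Index one message body under its store key."""
        for token in tokenize(body):
            self.postings[token].add(key)

    def search(self, query):
        """Return the keys of messages containing every token in the query."""
        tokens = tokenize(query)
        if not tokens:
            return set()
        result = set(self.postings.get(tokens[0], set()))
        for token in tokens[1:]:
            result &= self.postings.get(token, set())
        return result
```

A search here touches only the postings sets and never reads a message body, which is the property the message argues for. A separate meta-data index (date, thread, author) would sit alongside it.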
From brad.knowles at skynet.be Wed Oct 29 20:52:52 2003 From: brad.knowles at skynet.be (Brad Knowles) Date: Wed Oct 29 21:21:04 2003 Subject: [Mailman-Developers] Requirements for a new archiver In-Reply-To: References: <1067285190.16085.79.camel@localhost.localdomain> <87d6cfdhjr.fsf@athene.jamux.com> <20031029195420.GI24088@lenin.nu> <20031029203707.GK24088@lenin.nu> <20031029215906.GM24088@lenin.nu> <11079.1067465440@kanga.nu> <20031029222858.GN24088@lenin.nu> Message-ID: At 4:12 PM -0800 2003/10/29, Chuq Von Rospach wrote: > that's why I'm leaning, blob issues or no, towards full-text storage > in MySQL 4. Because if you can't easily chop up the message body > content and find the messages you want to deal with, elegant storage > of the headers is irrelevant... I think you could do full word indexing per message, and then store that index information in the database. Searching for phrases would require hitting the message bodies themselves, but searching for individual words could be done on indexed fields. -- Brad Knowles, "They that can give up essential liberty to obtain a little temporary safety deserve neither liberty nor safety." -Benjamin Franklin, Historical Review of Pennsylvania. GCS/IT d+(-) s:+(++)>: a C++(+++)$ UMBSHI++++$ P+>++ L+ !E-(---) W+++(--) N+ !w--- O- M++ V PS++(+++) PE- Y+(++) PGP>+++ t+(+++) 5++(+++) X++(+++) R+(+++) tv+(+++) b+(++++) DI+(++++) D+(++) G+(++++) e++>++++ h--- r---(+++)* z(+++) From claw at kanga.nu Wed Oct 29 21:22:32 2003 From: claw at kanga.nu (J C Lawrence) Date: Wed Oct 29 21:22:37 2003 Subject: [Mailman-Developers] Requirements for a new archiver In-Reply-To: Message from Chuq Von Rospach of "Wed, 29 Oct 2003 16:40:53 PST." 
References: <1067285190.16085.79.camel@localhost.localdomain> <87d6cfdhjr.fsf@athene.jamux.com> <20031029195420.GI24088@lenin.nu> <20031029203707.GK24088@lenin.nu> <20031029215906.GM24088@lenin.nu> <11079.1067465440@kanga.nu> <20031029222858.GN24088@lenin.nu> Message-ID: <2267.1067480552@kanga.nu> On Wed, 29 Oct 2003 16:40:53 -0800 Chuq Von Rospach wrote: > by the way, this statement is in conflict with my previous statemenet > of "use cycbufs". I'm fully aware of that conflict, too. resolving it > will be one of the big challenges. cycbufs implement a filesystem-based heap with pool semantics. (There's a fair bit of literature on that space in the OS and application realm) As such they are specifically tuned for the case where the number of calls to malloc() are of a similar magnitude to the calls to free(). This makes sense in a netnews world where news articles expire regularly, and in general as much data is added to the spool as is removed from it. Does that model really apply to list archives? It doesn't for me. I may be unusual in this regard, but I generally consider list archives as one-way systems: messages go in and never come out. -- J C Lawrence ---------(*) Satan, oscillate my metallic sonatas. claw@kanga.nu He lived as a devil, eh? http://www.kanga.nu/~claw/ Evil is a name of a foeman, as I live. From claw at kanga.nu Wed Oct 29 21:26:44 2003 From: claw at kanga.nu (J C Lawrence) Date: Wed Oct 29 21:28:34 2003 Subject: [Mailman-Developers] Requirements for a new archiver In-Reply-To: Message from Brad Knowles of "Thu, 30 Oct 2003 02:52:52 +0100." 
References: <1067285190.16085.79.camel@localhost.localdomain> <87d6cfdhjr.fsf@athene.jamux.com> <20031029195420.GI24088@lenin.nu> <20031029203707.GK24088@lenin.nu> <20031029215906.GM24088@lenin.nu> <11079.1067465440@kanga.nu> <20031029222858.GN24088@lenin.nu> Message-ID: <6974.1067480804@kanga.nu> On Thu, 30 Oct 2003 02:52:52 +0100 Brad Knowles wrote: > I think you could do full word indexing per message, and then store > that index information in the database. Searching for phrases would > require hitting the message bodies themselves, but searching for > individual words could be done on indexed fields. Consider an index which records not just the fact of a token's presence in an entity, but also the offsets at which it occurs within the entity. Searching for phrases then consists of searching for objects which satisfy the boolean "X AND Y", as well as the smaller clause "offset(X) + length (X) + 1|2 == offset (Y)". Larger phrases extend the equivalence language linearly, tho they create exponential search costs. -- J C Lawrence ---------(*) Satan, oscillate my metallic sonatas. claw@kanga.nu He lived as a devil, eh? http://www.kanga.nu/~claw/ Evil is a name of a foeman, as I live. From chuqui at plaidworks.com Wed Oct 29 22:00:28 2003 From: chuqui at plaidworks.com (Chuq Von Rospach) Date: Wed Oct 29 22:01:23 2003 Subject: [Mailman-Developers] Requirements for a new archiver In-Reply-To: References: <1067285190.16085.79.camel@localhost.localdomain> <87d6cfdhjr.fsf@athene.jamux.com> <20031029195420.GI24088@lenin.nu> <20031029203707.GK24088@lenin.nu> <20031029215906.GM24088@lenin.nu> <11079.1067465440@kanga.nu> <20031029222858.GN24088@lenin.nu> Message-ID: <374F680A-0A85-11D8-9559-0003934516A8@plaidworks.com> On Oct 29, 2003, at 5:52 PM, Brad Knowles wrote: > I think you could do full word indexing per message, and then store > that index information in the database. 
Searching for phrases would > require hitting the message bodies themselves, but searching for > individual words could be done on indexed fields. > you could, but is it worth doing it yourself when MySQL is building it for you? http://www.mysql.com/doc/en/Fulltext_Search.html http://jeremy.zawodny.com/blog/archives/000576.html http://www.zend.com/zend/tut/tutorial-ferrara1.php If you were just storing into a TEXT and then doing SELECT LIKE into it, I'd agree with you. But MySQL is doing interesting things here. Why not leverage it? From chuqui at plaidworks.com Wed Oct 29 22:02:16 2003 From: chuqui at plaidworks.com (Chuq Von Rospach) Date: Wed Oct 29 22:03:25 2003 Subject: [Mailman-Developers] Requirements for a new archiver In-Reply-To: <27947.1067480180@kanga.nu> References: <1067285190.16085.79.camel@localhost.localdomain> <87d6cfdhjr.fsf@athene.jamux.com> <20031029195420.GI24088@lenin.nu> <20031029203707.GK24088@lenin.nu> <20031029215906.GM24088@lenin.nu> <11079.1067465440@kanga.nu> <20031029222858.GN24088@lenin.nu> <27947.1067480180@kanga.nu> Message-ID: <778CF735-0A85-11D8-9559-0003934516A8@plaidworks.com> On Oct 29, 2003, at 6:16 PM, J C Lawrence wrote: > I see two needs, for significantly different populations. The first > wants a browsing interface with keyed and indexed by date, thread, and > author. The second wands full text search with rapid location and > retrieval of matching messages. Often a single user will move between > the access methods, reading by thread, bouncing over to a search, then > reading all an author has written that match, then searching again, > etc. > As such two distinct sets of indexes seem called for: full text and > message meta-data. > >> I think you need that, too. But until you get a reasonable context >> search for the message body, designing the rest is silly. 
> > Is searching message bodies really interesting, or is building indexes > of message bodies such that you can later search those indexes the > actually interesting point? You're basically asking "why do you need google when you have yahoo?" ask the folks who depend on google. (and yes, I'm oversimplifying to make a point). > How does MySQL help you in building language-sensitive rapid response > indexes of large text blobs? > just posted a bunch of links. From chuqui at plaidworks.com Wed Oct 29 22:06:31 2003 From: chuqui at plaidworks.com (Chuq Von Rospach) Date: Wed Oct 29 22:07:29 2003 Subject: [Mailman-Developers] Requirements for a new archiver In-Reply-To: <2267.1067480552@kanga.nu> References: <1067285190.16085.79.camel@localhost.localdomain> <87d6cfdhjr.fsf@athene.jamux.com> <20031029195420.GI24088@lenin.nu> <20031029203707.GK24088@lenin.nu> <20031029215906.GM24088@lenin.nu> <11079.1067465440@kanga.nu> <20031029222858.GN24088@lenin.nu> <2267.1067480552@kanga.nu> Message-ID: <0FAEFA89-0A86-11D8-9559-0003934516A8@plaidworks.com> On Oct 29, 2003, at 6:22 PM, J C Lawrence wrote: > cycbufs implement a filesystem-based heap with pool semantics. > (There's > a fair bit of literature on that space in the OS and application realm) > As such they are specifically tuned for the case where the number of > calls to malloc() are of a similar magnitude to the calls to free(). > This makes sense in a netnews world where news articles expire > regularly, and in general as much data is added to the spool as is > removed from it. > > Does that model really apply to list archives? It doesn't for me. I > may be unusual in this regard, but I generally consider list archives > as > one-way systems: messages go in and never come out. > and in general, you're mostly right. Deletions out of archives are pretty minimal. 
But I think cycbufs still make a lot of sense as a way to reduce design complexity needed to avoid using up potentially infinite numbers of inodes, and the performance and design complexity inherent in building a storage structure around a typical unix filesystem. It's just so much less hassle on any number of levels dealing with 50 100 megabyte files than it is a directory structure with 500 megabytes of messages spread around 100,000 individual files. whether it's backups and restores, migrating data to a new server, etc, etc etc, you make life much simpler. And god help you if you're updating that structure when the system crashes and you have to fsck and put it back together again. From brad.knowles at skynet.be Wed Oct 29 22:08:45 2003 From: brad.knowles at skynet.be (Brad Knowles) Date: Wed Oct 29 22:09:02 2003 Subject: [Mailman-Developers] Requirements for a new archiver In-Reply-To: <2267.1067480552@kanga.nu> References: <1067285190.16085.79.camel@localhost.localdomain> <87d6cfdhjr.fsf@athene.jamux.com> <20031029195420.GI24088@lenin.nu> <20031029203707.GK24088@lenin.nu> <20031029215906.GM24088@lenin.nu> <11079.1067465440@kanga.nu> <20031029222858.GN24088@lenin.nu> <2267.1067480552@kanga.nu> Message-ID: At 9:22 PM -0500 2003/10/29, J C Lawrence wrote: > cycbufs implement a filesystem-based heap with pool semantics. (There's > a fair bit of literature on that space in the OS and application realm) > As such they are specifically tuned for the case where the number of > calls to malloc() are of a similar magnitude to the calls to free(). > This makes sense in a netnews world where news articles expire > regularly, and in general as much data is added to the spool as is > removed from it. So long as the calls to malloc() are kept reasonably small (which is typically true in this case), it shouldn't matter whether or not there are any free() calls. 
Yes, you slowly build up more disk space in utilization, but all archive solutions will have the same problem, and this solution will scale as well as, or better than, any other that I know of. Consider the case where you are trying to store all news articles that have ever been posted -- not really much difference. -- Brad Knowles, "They that can give up essential liberty to obtain a little temporary safety deserve neither liberty nor safety." -Benjamin Franklin, Historical Review of Pennsylvania. GCS/IT d+(-) s:+(++)>: a C++(+++)$ UMBSHI++++$ P+>++ L+ !E-(---) W+++(--) N+ !w--- O- M++ V PS++(+++) PE- Y+(++) PGP>+++ t+(+++) 5++(+++) X++(+++) R+(+++) tv+(+++) b+(++++) DI+(++++) D+(++) G+(++++) e++>++++ h--- r---(+++)* z(+++) From claw at kanga.nu Wed Oct 29 22:27:50 2003 From: claw at kanga.nu (J C Lawrence) Date: Wed Oct 29 22:28:07 2003 Subject: [Mailman-Developers] Requirements for a new archiver In-Reply-To: Message from Brad Knowles of "Thu, 30 Oct 2003 04:08:45 +0100." References: <1067285190.16085.79.camel@localhost.localdomain> <87d6cfdhjr.fsf@athene.jamux.com> <20031029195420.GI24088@lenin.nu> <20031029203707.GK24088@lenin.nu> <20031029215906.GM24088@lenin.nu> <11079.1067465440@kanga.nu> <20031029222858.GN24088@lenin.nu> <2267.1067480552@kanga.nu> Message-ID: <10134.1067484470@kanga.nu> On Thu, 30 Oct 2003 04:08:45 +0100 Brad Knowles wrote: > At 9:22 PM -0500 2003/10/29, J C Lawrence wrote: >> cycbufs implement a filesystem-based heap with pool semantics. >> (There's a fair bit of literature on that space in the OS and >> application realm) As such they are specifically tuned for the case >> where the number of calls to malloc() are of a similar magnitude to >> the calls to free(). This makes sense in a netnews world where news >> articles expire regularly, and in general as much data is added to >> the spool as is removed from it. 
> So long as the calls to malloc() are kept reasonably small (which is > typically true in this case), it shouldn't matter whether or not there > are any free() calls. I've written several heap managers including several pool based systems as well as other sorts of custom allocators. There are a great many simplifications that come along with the write-once approach, especially in terms of the trade-offs between allocation expense and free space management. > Yes, you slowly build up more disk space in utilization, but all > archive solutions will have the same problem, and this solution will > scale as well as, or better than, any other that I know of. Which is not exactly my point. cycbufs are a useful technique to be sure, much as Chuq has discussed from a management perspective. My point is more that I don't see that they add anything essentially different to the storage space in terms of storage semantics. You get a higher rate of file handle re-use, a more friendly filesystem behaviour for older filesystem designs (pleasant optimisations), but exactly the same single key -> byte stream without adding any more interesting verbs or transforms to the solution space. This is not a Bad Thing, just not something that seems applicable at this stage in the design discussion. First come ontology and semantics, then comes implementation. > Consider the case where you are trying to store all news articles that > have ever been posted -- not really much difference. Actually the two cases are considerably different. In the delete case I have to do pool management, with some eye toward fragmentation control and optimisations of average latency for free heap searches, as well as heap integrity audits. In the write-only case I just build on the end and need pay no mind to prior data once it is allocated.
In both cases I have to do predictive work on the distribution of allocation sizes, but that's far cheaper in the write-only case as the multiple-pool search overhead can be entirely skipped. There's a considerable difference in complexity between the two. -- J C Lawrence ---------(*) Satan, oscillate my metallic sonatas. claw@kanga.nu He lived as a devil, eh? http://www.kanga.nu/~claw/ Evil is a name of a foeman, as I live. From brad.knowles at skynet.be Wed Oct 29 22:37:32 2003 From: brad.knowles at skynet.be (Brad Knowles) Date: Wed Oct 29 22:40:10 2003 Subject: [Mailman-Developers] Requirements for a new archiver In-Reply-To: <374F680A-0A85-11D8-9559-0003934516A8@plaidworks.com> References: <1067285190.16085.79.camel@localhost.localdomain> <87d6cfdhjr.fsf@athene.jamux.com> <20031029195420.GI24088@lenin.nu> <20031029203707.GK24088@lenin.nu> <20031029215906.GM24088@lenin.nu> <11079.1067465440@kanga.nu> <20031029222858.GN24088@lenin.nu> <374F680A-0A85-11D8-9559-0003934516A8@plaidworks.com> Message-ID: At 7:00 PM -0800 2003/10/29, Chuq Von Rospach wrote: > you could, but is it worth doing it yourself when MySQL is building > it for you? > > http://www.mysql.com/doc/en/Fulltext_Search.html From the top of this page: 6.8 MySQL Full-text Search As of Version 3.23.23, MySQL has support for full-text indexing and searching. Full-text indexes in MySQL are an index of type FULLTEXT. FULLTEXT indexes are used with MyISAM tables only and can be created from CHAR, VARCHAR, or TEXT columns at CREATE TABLE time or added later with ALTER TABLE or CREATE INDEX. For large datasets, it will be much faster to load your data into a table that has no FULLTEXT index, then create the index with ALTER TABLE (or CREATE INDEX). Loading data into a table that already has a FULLTEXT index could be significantly slower. Moreover, mail messages will be of an undetermined, variable length. Can MySQL support a 32-bit VARCHAR? What about type TEXT? Or 8-bit or even 16-bit character sets?
Since you might be storing a lot of MIME bodypart types, can it handle BLOBs, and can it handle them well? Or, do you do parsing within your archive application and store the entire message somewhere outside of the database, while storing a FULLTEXT index of only the bodypart types you declare to be human-readable? What if you want to do a case-sensitive search? In that case, it doesn't look like FULLTEXT or MATCH will do you any good, since MATCH is declared to be case-insensitive. Or what if you want to search for hyphenated literals? It seems that MATCH considers them to be word breaks even within literal searches. > If you were just storing into a TEXT and then doing SELECT LIKE into it, > I'd agree with you. But MySQL is doing interesting things here. Why not > leverage it? I'm not sure it really helps in this case. I'm not sure it can handle the amounts of data that might need to be stored into a field, or the different character sets that might need to be used. I'm also concerned about what using this function might do to the overall speed and size of the database. On the page quoted above, look for benchmark data reported by Jim Nguyen and John Takacs. Two million rows with text and multiple word searches (three or more) taking 30-seconds to a minute to complete, is not good performance. Three to five million rows, with searches taking 50 seconds or more for single words, is not good performance. Now, consider how many words might be in a single message (hundreds to thousands or even tens of thousands), and how many messages might be in a single archive (thousands to millions). If each message was contained within a row, this would be dead-Universe slow. -- Brad Knowles, "They that can give up essential liberty to obtain a little temporary safety deserve neither liberty nor safety." -Benjamin Franklin, Historical Review of Pennsylvania. 
GCS/IT d+(-) s:+(++)>: a C++(+++)$ UMBSHI++++$ P+>++ L+ !E-(---) W+++(--) N+ !w--- O- M++ V PS++(+++) PE- Y+(++) PGP>+++ t+(+++) 5++(+++) X++(+++) R+(+++) tv+(+++) b+(++++) DI+(++++) D+(++) G+(++++) e++>++++ h--- r---(+++)* z(+++) From barry at python.org Wed Oct 29 22:43:05 2003 From: barry at python.org (Barry Warsaw) Date: Wed Oct 29 22:43:12 2003 Subject: [Mailman-Developers] Requirements for a new archiver In-Reply-To: References: <3F9D08F9.6090209@student.umist.ac.uk> <1067285190.16085.79.camel@localhost.localdomain> <87d6cfdhjr.fsf@athene.jamux.com> Message-ID: <1067485384.5295.11.camel@anthem> On Wed, 2003-10-29 at 13:45, Brad Knowles wrote: > That said, storing meta-data in a real database and then using > external filesystem techniques for actually accessing the data, > should give you the best of both worlds -- the speed of access of the > database, and the reliability and well-understood access and backup > mechanisms of filesystems. I'm strongly in favor of this kind of approach. I don't know what the best on-disk storage format is (although cycbuf sounds interesting), but I'm pretty sure we want the raw messages stored as plain files on the file system. We may even want both the encoded and decoded messages stored on the file system -- at the very least, we should have attachments decoded and stored in separate files. Then we want metadata about the messages stored in a database. We should be able to regenerate or update the metadata by trolling over the raw message storage, and we should be able to vend messages from the message store via any number of protocols. The message store should be a central component of Mailman, but it should be defined by an interface in case we decide to change the implementation of the message store. 
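An illustrative sketch, not part of Barry's message: one way to read "a message store defined by an interface, raw messages as plain files, metadata regenerable from the raw store" is the Python outline below. `MessageStore`, `FileMessageStore`, `rebuild_metadata`, and the `.eml` file suffix are all hypothetical names chosen for this example. It assumes keys are already filesystem-safe, and it uses a plain dict where a real deployment would use a database.

```python
import os
from abc import ABC, abstractmethod
from email import message_from_bytes


class MessageStore(ABC):
    """Hypothetical interface: callers never see how messages are kept on disk."""

    @abstractmethod
    def put(self, key, raw):
        """Store the raw (wire-format) bytes of one message."""

    @abstractmethod
    def get(self, key):
        """Return the raw bytes stored under key."""

    @abstractmethod
    def keys(self):
        """Return all stored keys."""


class FileMessageStore(MessageStore):
    """One message per file; assumes keys are safe to use as file names."""

    def __init__(self, root):
        self.root = root
        os.makedirs(root, exist_ok=True)

    def _path(self, key):
        return os.path.join(self.root, key + ".eml")

    def put(self, key, raw):
        with open(self._path(key), "wb") as f:
            f.write(raw)

    def get(self, key):
        with open(self._path(key), "rb") as f:
            return f.read()

    def keys(self):
        return sorted(n[:-4] for n in os.listdir(self.root) if n.endswith(".eml"))


def rebuild_metadata(store):
    """Regenerate header metadata by walking the raw store (dict stands in for a DB)."""
    meta = {}
    for key in store.keys():
        msg = message_from_bytes(store.get(key))
        meta[key] = {"subject": msg["Subject"], "from": msg["From"], "date": msg["Date"]}
    return meta
```

Because `rebuild_metadata` depends only on the interface, swapping the file layout for cycbuf-style storage would not change the code that rebuilds or queries metadata.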
-Barry From brad.knowles at skynet.be Wed Oct 29 22:45:37 2003 From: brad.knowles at skynet.be (Brad Knowles) Date: Wed Oct 29 22:46:01 2003 Subject: [Mailman-Developers] Requirements for a new archiver In-Reply-To: <10134.1067484470@kanga.nu> References: <1067285190.16085.79.camel@localhost.localdomain> <87d6cfdhjr.fsf@athene.jamux.com> <20031029195420.GI24088@lenin.nu> <20031029203707.GK24088@lenin.nu> <20031029215906.GM24088@lenin.nu> <11079.1067465440@kanga.nu> <20031029222858.GN24088@lenin.nu> <2267.1067480552@kanga.nu> <10134.1067484470@kanga.nu> Message-ID: At 10:27 PM -0500 2003/10/29, J C Lawrence wrote: > Actually the two cases are considerably different. In the delete case I > have to do pool management, with some eye toward fragmentation control > and optimisations of average latency for free heap searches, as well as > heap integrity audits. In the write-only case I just build on the end > and need pay no mind to prior data once it is allocated. Not really. You still have to maintain all the indexes, make sure that if things get moved around, all the links get updated, etc.... True, you don't have to worry about fragmentation control or other more complex aspects of heap management, but that's a further cost savings over other techniques and not a "drawback" to using this technique for this purpose. Now, if you want to consider what would happen to you if the Scientologists ever came after you, or if you had court orders to remove postings that linked to bomb-making instructions, you'd probably want to keep all those other tools related to heap management around anyway. They'd be less likely to be used, but at least you wouldn't have to take the entire site down while you went and wrote the tools from scratch to handle a situation that you had not foreseen. -- Brad Knowles, "They that can give up essential liberty to obtain a little temporary safety deserve neither liberty nor safety." -Benjamin Franklin, Historical Review of Pennsylvania.
GCS/IT d+(-) s:+(++)>: a C++(+++)$ UMBSHI++++$ P+>++ L+ !E-(---) W+++(--) N+ !w--- O- M++ V PS++(+++) PE- Y+(++) PGP>+++ t+(+++) 5++(+++) X++(+++) R+(+++) tv+(+++) b+(++++) DI+(++++) D+(++) G+(++++) e++>++++ h--- r---(+++)* z(+++) From barry at python.org Wed Oct 29 22:47:59 2003 From: barry at python.org (Barry Warsaw) Date: Wed Oct 29 22:48:09 2003 Subject: [Mailman-Developers] Requirements for a new archiver In-Reply-To: <7B342705-0A47-11D8-A2E1-0003934516A8@plaidworks.com> References: <3F9D08F9.6090209@student.umist.ac.uk> <1067285190.16085.79.camel@localhost.localdomain> <87d6cfdhjr.fsf@athene.jamux.com> <7B342705-0A47-11D8-A2E1-0003934516A8@plaidworks.com> Message-ID: <1067485679.5295.17.camel@anthem> On Wed, 2003-10-29 at 14:38, Chuq Von Rospach wrote: > Hint: look at what INN did when they implemented cycbufs. > > Effectively, you create 1-N files, or create files as needed. Each file > is N bytes long, pre-allocated on file creation. When you store > messages, they're written into the file sequentially (or any other way > you want. If you want to get into best fit allocations and turn this > into a malloc() style heap, be my guest). > > Metadata to access the info is then a filename, and an lseek() pointer > into the file, and # of bytes to read, plus your normal identifying > info. It's fast, it's efficient use of file pointers, it avoids the > worst aspects of the unix file system, and I'm amazed nobody ever > thinks to use it for other purposes (or that it took that long for > usenet people to discover it, I suggested a simpler variant of it back > in the 80s and was told inodes are our friends...) I'm not sure if Andrew Koenig is on this list, but he described an algorithm he developed to quickly find messages in an mbox file. If he's here, maybe he can talk about it. I really don't like mbox files, primarily because they require munging From lines in the body of the message. 
MMDF would be better, but I think ideal from a philosophical point of view would be one-message-per-file if it can be done efficiently cross-platform. Maybe file system experts here can provide pointers or advice on exactly which file and operating systems make this approach feasible, even for huge message counts. > you can even do expiration/purge/etc if you want, by moving stuff > around and changing the pointers. > > I've even thought of using it as the backing store for a picture > library. With a nice relational database and a series of these "data > boxes", I think you have store data in the best and fastest possible > way... It's a very interesting idea. -Barry From brad.knowles at skynet.be Wed Oct 29 22:50:04 2003 From: brad.knowles at skynet.be (Brad Knowles) Date: Wed Oct 29 22:52:15 2003 Subject: [Mailman-Developers] Requirements for a new archiver In-Reply-To: <1067485384.5295.11.camel@anthem> References: <3F9D08F9.6090209@student.umist.ac.uk> <1067285190.16085.79.camel@localhost.localdomain> <87d6cfdhjr.fsf@athene.jamux.com> <1067485384.5295.11.camel@anthem> Message-ID: At 10:43 PM -0500 2003/10/29, Barry Warsaw wrote: > We may even want both the encoded and decoded messages stored on the > file system -- at the very least, we should have attachments decoded and > stored in separate files. I'm not at all sure that you want to go down this route. One of the biggest headaches within the AOL mail system is the handling of attachment storage, what happens when the attachments get out of sync, etc.... Same with Eudora as a local MUA. I think I'd be inclined to store the message in wire format just the once, and deal with transformation on the fly. At least you would never have to worry about the attachments getting out of sync with the message bodies, and maybe someone else getting the attachments they weren't supposed to see, or not getting the attachments at all that they should have, etc.... 
-- Brad Knowles, "They that can give up essential liberty to obtain a little temporary safety deserve neither liberty nor safety." -Benjamin Franklin, Historical Review of Pennsylvania. GCS/IT d+(-) s:+(++)>: a C++(+++)$ UMBSHI++++$ P+>++ L+ !E-(---) W+++(--) N+ !w--- O- M++ V PS++(+++) PE- Y+(++) PGP>+++ t+(+++) 5++(+++) X++(+++) R+(+++) tv+(+++) b+(++++) DI+(++++) D+(++) G+(++++) e++>++++ h--- r---(+++)* z(+++) From barry at python.org Wed Oct 29 22:52:47 2003 From: barry at python.org (Barry Warsaw) Date: Wed Oct 29 22:52:56 2003 Subject: [Mailman-Developers] Requirements for a new archiver In-Reply-To: <3147.1067460073@kanga.nu> References: <3F9D08F9.6090209@student.umist.ac.uk> <1067285190.16085.79.camel@localhost.localdomain> <87d6cfdhjr.fsf@athene.jamux.com> <7B342705-0A47-11D8-A2E1-0003934516A8@plaidworks.com> <3147.1067460073@kanga.nu> Message-ID: <1067485966.5295.23.camel@anthem> On Wed, 2003-10-29 at 15:41, J C Lawrence wrote: > Some years back I talked to Mike Belshe (used to be at Remarq) about > their storage techniques (I caught him shortly after Critical Path > bought Remarq). Keying off other LISA papers they segmented their > storage space by object size, customising and configuring each segment > to suit (things like RAID strip size, number of spindles, FS tuning > parameters, etc). He asserted that the rewards were very significant. > > However, these are very large archive problems and are a bit outside of > Mailman's home turf. Mailman's philosophy is, keep it as simple as possible to handle 80% of the installations out there, but provide enough framework for the other 20% to extend for extreme uses. Strategies to accomplish this include defining interfaces to key components, and shipping something that works out of the box and is good enough for most people. It's not always easy, of course, to architect something that scales this way. 
I think we have a pretty good idea of the scaling problems with Mailman 2, and I hope we can push the envelope significantly for Mailman 3. -Barry From brad.knowles at skynet.be Wed Oct 29 23:00:48 2003 From: brad.knowles at skynet.be (Brad Knowles) Date: Wed Oct 29 23:01:03 2003 Subject: [Mailman-Developers] Requirements for a new archiver In-Reply-To: <1067485679.5295.17.camel@anthem> References: <3F9D08F9.6090209@student.umist.ac.uk> <1067285190.16085.79.camel@localhost.localdomain> <87d6cfdhjr.fsf@athene.jamux.com> <7B342705-0A47-11D8-A2E1-0003934516A8@plaidworks.com> <1067485679.5295.17.camel@anthem> Message-ID: At 10:47 PM -0500 2003/10/29, Barry Warsaw wrote: > I'm not sure if Andrew Koenig is on this list, but he described an > algorithm he developed to quickly find messages in an mbox file. If > he's here, maybe he can talk about it. 7th edition mbox files are a pain. There are other mailbox file formats that are much better and easier to parse (UW-IMAP .mbx being one). > I really don't like mbox files, primarily because they require munging > From lines in the body of the message. MMDF would be better, but I > think ideal from a philosophical point of view would be > one-message-per-file if it can be done efficiently cross-platform. Therein lies the problem. Some filesystems make this more feasible than others, at least on larger scale systems. > Maybe file system experts here can provide pointers or advice on exactly > which file and operating systems make this approach feasible, even for > huge message counts. SGI's XFS on Irix does a pretty good job, with hashed directory structures, and an extent-based journaling filesystem. Regretfully, I don't think that all of these features are fully supported under the Linux version of XFS, and that work has basically ground to a halt with the lay-offs of all the key SGI people who had been working on XFS. Veritas VxFS also does a good job in this area. 
Other than SGI XFS for Irix and Veritas VxFS, I don't know of any good solutions to this problem at the filesystem level. Kirk McKusick and Eric Allman agree with you that this is a proper filesystem problem that should be solved at the filesystem level (at least, that's what they've said to me when I brought this issue up to them), and they feel you should not attempt to solve filesystem problems with "tricks" like INN timecaf/timehash cycbufs. However, while that's nice in theory, that doesn't necessarily help us here in the real world. -- Brad Knowles, "They that can give up essential liberty to obtain a little temporary safety deserve neither liberty nor safety." -Benjamin Franklin, Historical Review of Pennsylvania. GCS/IT d+(-) s:+(++)>: a C++(+++)$ UMBSHI++++$ P+>++ L+ !E-(---) W+++(--) N+ !w--- O- M++ V PS++(+++) PE- Y+(++) PGP>+++ t+(+++) 5++(+++) X++(+++) R+(+++) tv+(+++) b+(++++) DI+(++++) D+(++) G+(++++) e++>++++ h--- r---(+++)* z(+++) From claw at kanga.nu Wed Oct 29 23:01:01 2003 From: claw at kanga.nu (J C Lawrence) Date: Wed Oct 29 23:01:08 2003 Subject: [Mailman-Developers] Requirements for a new archiver In-Reply-To: Message from Brad Knowles of "Thu, 30 Oct 2003 04:45:37 +0100." References: <1067285190.16085.79.camel@localhost.localdomain> <87d6cfdhjr.fsf@athene.jamux.com> <20031029195420.GI24088@lenin.nu> <20031029203707.GK24088@lenin.nu> <20031029215906.GM24088@lenin.nu> <11079.1067465440@kanga.nu> <20031029222858.GN24088@lenin.nu> <2267.1067480552@kanga.nu> <10134.1067484470@kanga.nu> Message-ID: <12670.1067486461@kanga.nu> On Thu, 30 Oct 2003 04:45:37 +0100 Brad Knowles wrote: > At 10:27 PM -0500 2003/10/29, J C Lawrence wrote: >> Actually the two cases are considerably different. In the delete >> case I have to do pool management, with some eye toward fragmentation >> control and optimisations of average latency for free heap searches, >> as well as heap integrity audits. 
In the write-only case I just >> build on the end and need pay no mind to prior data once it is >> allocated. > Not really. You still have to maintain all the indexes, make sure > that if things get moved around that all the links get updated, > etc.... With a write-once system you don't actually need to ever move anything. At its core it is: Open one file, repetitively append to end until file size exceeds size N, create new file, repeat. You can do object size clustering across files or other optimisation techniques, but the basic pattern remains the same. For the few cases you have to support delete you either just NULL the byte stream for the pointed-to object, or you invalidate the key. As the frequency and number of such deletes is infinitesimal, they require no special management complexity. You can afford to just swallow the lost free space as the cost of attempting to manage it is simply never rewarded. > True, you don't have to worry about fragmentation control or other > more complex aspects of heap management, but that's a further cost > savings over other techniques and not a "drawback" to using this > technique for this purpose. True. I'm not labeling it a drawback, just a boon of dubious advantage. > Now, if you want to consider what would happen to you if the > Scientologists ever came after you, or if you had court orders to > remove postings that linked to bomb-making instructions, you'd > probably want to keep all those other tools related to heap management > around anyway. Not really. The percentage of such deleted posts over the lifetime of the store can be generally assumed to be less than 1 in 10^5, and is probably considerably lower, if not in the 1:10^8 range. Add a simple invalid key semantic and you're done. Caveat: Continual addition and deletion of SPAM from an archive would change this balance. 
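The write-once pattern described above (append to one file until it passes size N, then start a new file; handle the rare delete by nulling the bytes or invalidating the key) can be sketched roughly as follows. This is an illustration only: the class name AppendOnlyStore, the segment file naming and the in-memory index are assumptions, and a real store would keep the index in a database rather than a dict.

```python
import os


class AppendOnlyStore:
    """Hypothetical write-once store built from fixed-limit segment files."""

    def __init__(self, directory, max_segment_bytes=1 << 20):
        self.directory = directory
        self.max_segment_bytes = max_segment_bytes
        self.index = {}       # key -> (segment file name, offset, length)
        self.segment_no = 0
        os.makedirs(directory, exist_ok=True)

    def _segment_path(self, number):
        return os.path.join(self.directory, "segment-%06d.dat" % number)

    def append(self, key, data):
        """Append data to the current segment, rolling over once it is full."""
        path = self._segment_path(self.segment_no)
        if os.path.exists(path) and os.path.getsize(path) >= self.max_segment_bytes:
            self.segment_no += 1
            path = self._segment_path(self.segment_no)
        with open(path, "ab") as f:
            offset = f.seek(0, os.SEEK_END)   # current end of the segment
            f.write(data)
        self.index[key] = (os.path.basename(path), offset, len(data))

    def read(self, key):
        """Fetch an object by file name, offset and byte count."""
        name, offset, length = self.index[key]
        with open(os.path.join(self.directory, name), "rb") as f:
            f.seek(offset)
            return f.read(length)

    def delete(self, key, scrub=False):
        """Invalidate the key; with scrub=True also overwrite the bytes with NULs."""
        name, offset, length = self.index.pop(key)
        if scrub:
            with open(os.path.join(self.directory, name), "r+b") as f:
                f.seek(offset)
                f.write(b"\0" * length)
```

Nothing is ever moved or compacted here; space freed by a delete is simply abandoned, which matches the argument that managing it is not worth the cost when deletes are rare.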
> They'd be less likely to be used, but at least you wouldn't have to > take the entire site down while you went and wrote the tools from > scratch to handle a situation that you had not foreseen. You're going to need tools when the percentage of such deleted postings is sufficiently high that the cost of the lost free space and its overhead exceeds the cost of managing that free space. That's not a quick thing. -- J C Lawrence ---------(*) Satan, oscillate my metallic sonatas. claw@kanga.nu He lived as a devil, eh? http://www.kanga.nu/~claw/ Evil is a name of a foeman, as I live. From barry at python.org Wed Oct 29 23:01:14 2003 From: barry at python.org (Barry Warsaw) Date: Wed Oct 29 23:01:24 2003 Subject: [Mailman-Developers] Requirements for a new archiver In-Reply-To: <24751.1067464496@kanga.nu> References: <3F9D08F9.6090209@student.umist.ac.uk> <1067285190.16085.79.camel@localhost.localdomain> <87d6cfdhjr.fsf@athene.jamux.com> <7B342705-0A47-11D8-A2E1-0003934516A8@plaidworks.com> <3147.1067460073@kanga.nu> <664305B3-0A54-11D8-BABB-0003934516A8@plaidworks.com> <24751.1067464496@kanga.nu> Message-ID: <1067486474.5295.32.camel@anthem> On Wed, 2003-10-29 at 16:54, J C Lawrence wrote: > Aye, picking the right interface abstractions is key. Right on. > There's also a disjoint between the novice SysAdm case who loves the > fact of Mailman's all-in-one service, and the more meaty chap who > integrates what he needs to. Much of Mailman's appeal at the low end is > its all-in-one simple-to-install nature. (Well, ignoring thee GID > FAQ...) Yep, and I really really want Mailman 3 to take this concept farther. Some things that I think will help include, using Twisted to eliminate the /requirement/ of Apache integration and possibly the incoming mail server integration, as well as implement a bulk mailer to eliminate the need for an outgoing mail server. 
Ideally, it will still be possible to integrate with a Postfix for incoming and outgoing, but it shouldn't be necessary to get up and running. > Mailman v2.1 has a plugin layer for the membership roster. It's not a > fully mature interface, but there are LDAP and SQL adaptors in the wild. This interface was largely bolted on, so it's clumsy. Mailman 3 will be defined by interfaces from the start. > At some point those adaptors will move into the Mailman core. If we > move the archiving components (storage, presentation, index) behind > plugin interfaces as well there's a reasonable opportunity for similar > third parties to build adaptor layers which then also move into the > Mailman core. > > Oh yeah, and just to keep Nigel Metheringham hopping: > > Mailman just doesn't have enough configuration options. Heh. That's another issue. I'm sure Mailman 3 will grow many more configuration options. The trick is making them manageable (and mostly ignorable -- i.e. the defaults Usually Work out of the box). I've been experimenting with ideas for list styles which will make list admins lives easier I think, without reducing the flexibility for experts. -Barry From chuqui at plaidworks.com Wed Oct 29 23:14:30 2003 From: chuqui at plaidworks.com (Chuq Von Rospach) Date: Wed Oct 29 23:17:13 2003 Subject: [Mailman-Developers] Requirements for a new archiver In-Reply-To: References: <3F9D08F9.6090209@student.umist.ac.uk> <1067285190.16085.79.camel@localhost.localdomain> <87d6cfdhjr.fsf@athene.jamux.com> <7B342705-0A47-11D8-A2E1-0003934516A8@plaidworks.com> <1067485679.5295.17.camel@anthem> Message-ID: <8F2FD9BE-0A8F-11D8-9559-0003934516A8@plaidworks.com> And since Barry's underlying philosophy is to minimize the "number of things Mailman depends on", that sort of lets out depending on them having an OS with a high-performance journaling filesystem, no? 
(giggle) On Oct 29, 2003, at 8:00 PM, Brad Knowles wrote: > However, while that's nice in theory, that doesn't necessarily help > us here in the real world. From claw at kanga.nu Wed Oct 29 23:17:32 2003 From: claw at kanga.nu (J C Lawrence) Date: Wed Oct 29 23:17:40 2003 Subject: [Mailman-Developers] Requirements for a new archiver In-Reply-To: Message from Barry Warsaw of "Wed, 29 Oct 2003 23:01:14 EST." <1067486474.5295.32.camel@anthem> References: <3F9D08F9.6090209@student.umist.ac.uk> <1067285190.16085.79.camel@localhost.localdomain> <87d6cfdhjr.fsf@athene.jamux.com> <7B342705-0A47-11D8-A2E1-0003934516A8@plaidworks.com> <3147.1067460073@kanga.nu> <664305B3-0A54-11D8-BABB-0003934516A8@plaidworks.com> <24751.1067464496@kanga.nu> <1067486474.5295.32.camel@anthem> Message-ID: <13955.1067487452@kanga.nu> On Wed, 29 Oct 2003 23:01:14 -0500 Barry Warsaw wrote: > On Wed, 2003-10-29 at 16:54, J C Lawrence wrote: >> Aye, picking the right interface abstractions is key. > Right on. I'm still debating if I can run down there on the 8th. I'd love to go to EuroQuest, but I also really need to be in Providence on the 7th, and back at work on the 9th. Aaaarrrgh. _IF_ I can make it we must go hit a pub with whiteboards in hand. Sorry for no earlier reply on this BTW, I'm in drowning eyeballs mode. > ...as well as implement a bulk mailer to eliminate the need for an > outgoing mail server. Eeeek! I trust this would be for immediate handoff to a "real" MTA versus handling final delivery directly? Quite the Pandora's box if not. >> Mailman v2.1 has a plugin layer for the membership roster. Its not a >> fully mature interface, but there are LDAP and SQL adaptors in the >> wild. > This interface was largely bolted on, so it's clumsy. Mailman 3 will > be defined by interfaces from the start. BTW Whatever happened to Michel Pelletier's interfaces PEP? I see the draft, and I see signs that something got done, but not what... 
>> Oh yeah, and just to keep Nigel Metheringham hopping: >> Mailman just doesn't have enough configuration options. > Heh. That's another issue. Last I heard Nigel was still running screaming into the hills. > I'm sure Mailman 3 will grow many more configuration options. The > trick is making them manageable (and mostly ignorable -- i.e. the > defaults Usually Work out of the box). > I've been experimenting with ideas for list styles which will make > list admins lives easier I think, without reducing the flexibility for > experts. Aye, that's something the Plone folk have been digging at with some success: a base library of waffle-stomp configuration patterns. I'm not sure for Mailman if we want just a picklist, or a very simple wizard. I suspect something more akin to the very brief Q&A wizard at Creative Commons for choosing a license type may be more effective and interesting than a picklist: http://creativecommons.org/license/ Very simple, very general, covers the basic cases, hides all the ugly stuff and picks sane defaults. It becomes even more interesting if site admins can tailor the configs for the basic cases. -- J C Lawrence ---------(*) Satan, oscillate my metallic sonatas. claw@kanga.nu He lived as a devil, eh? http://www.kanga.nu/~claw/ Evil is a name of a foeman, as I live. From barry at python.org Wed Oct 29 23:18:13 2003 From: barry at python.org (Barry Warsaw) Date: Wed Oct 29 23:18:20 2003 Subject: [Mailman-Developers] Requirements for a new archiver In-Reply-To: References: <3F9D08F9.6090209@student.umist.ac.uk> <1067285190.16085.79.camel@localhost.localdomain> <87d6cfdhjr.fsf@athene.jamux.com> <20031029195420.GI24088@lenin.nu> <20031029203707.GK24088@lenin.nu> Message-ID: <1067487492.5295.39.camel@anthem> On Wed, 2003-10-29 at 16:14, Brad Knowles wrote: > One key factor here is that all of the information in the > database should be able to be re-created from the message bodies > alone, if there should happen to be a catastrophic system crash. 
Just to be dense, let me ask for clarification: by "message body" you mean the entire original message, as received on the wire, not just the message payload (i.e. sans RFC 2822 headers). If so, I agree completely. But I also think the decoded message should be stored on the file system somehow as well. I.e. decode attachments and store them as separate files too. -Barry From barry at python.org Wed Oct 29 23:23:18 2003 From: barry at python.org (Barry Warsaw) Date: Wed Oct 29 23:23:25 2003 Subject: [Mailman-Developers] Requirements for a new archiver In-Reply-To: <20031029215906.GM24088@lenin.nu> References: <3F9D08F9.6090209@student.umist.ac.uk> <1067285190.16085.79.camel@localhost.localdomain> <87d6cfdhjr.fsf@athene.jamux.com> <20031029195420.GI24088@lenin.nu> <20031029203707.GK24088@lenin.nu> <20031029215906.GM24088@lenin.nu> Message-ID: <1067487797.5295.45.camel@anthem> On Wed, 2003-10-29 at 16:59, Peter C. Norton wrote: > In theory you can add data types to postgresql. Not that I've done it > myself, but it's been done. I wouldn't want to build a system that required PostgreSQL. Maybe we can hide all the gore behind an interface, maybe not. After all, we're using Python here (that's not going to change) so speed is all relative. Let's analyze what we're going to use this message store for too. We should be able to come up with a fast-enough solution for most sites. Really huge sites are probably going to cook their own dog food and won't even look at Mailman. There should be enough flexibility in the framework to allow the sites in the middle to scale Mailman up with some extra effort. Note: I don't think the message store gets in the picture for the qrunners. We can probably improve performance here, but that's an entirely different problem than the long-term message store we're talking about. 
-Barry From brad.knowles at skynet.be Wed Oct 29 23:15:58 2003 From: brad.knowles at skynet.be (Brad Knowles) Date: Wed Oct 29 23:27:00 2003 Subject: [Mailman-Developers] Requirements for a new archiver In-Reply-To: <12670.1067486461@kanga.nu> References: <1067285190.16085.79.camel@localhost.localdomain> <87d6cfdhjr.fsf@athene.jamux.com> <20031029195420.GI24088@lenin.nu> <20031029203707.GK24088@lenin.nu> <20031029215906.GM24088@lenin.nu> <11079.1067465440@kanga.nu> <20031029222858.GN24088@lenin.nu> <2267.1067480552@kanga.nu> <10134.1067484470@kanga.nu> <12670.1067486461@kanga.nu> Message-ID: At 11:01 PM -0500 2003/10/29, J C Lawrence wrote: > With a write-once system you don't actually need to ever move anything. Depends on how you manage the storage of those large files. If you have an infinitely large filesystem that is guaranteed 100% reliable in all possible circumstances, you're right. Otherwise, you might find that the filesystem is getting full and things need to be moved around, or you suffer a disk or storage system crash and you have to restore from backups, or you use an HSM solution to move older files to slower/higher capacity storage, or you have issues with too many large files in a single directory and need to implement your own directory hashing scheme, etc.... > Not really. The percentage of such deleted posts over the lifetime of > the store can be generally assumed to be less than 1 in 10^5, and is > probably considerably lower, if not in the 1:10^8 range. Add a simple > invalid key semantic and you're done. It depends on whether or not the court order allows you to just mark things as "deleted" and be done with it. If they force you to actually expunge all copies of that data from your systems, you will have to do more work. > You're going to need tools when the percentage of such deleted postings > is sufficiently high that the cost of the lost free space and its > overhead exceeds the cost of managing that free space. 
That's not a > quick thing. True enough, but as you've pointed out, there have been a number of implementations of this sort of solution, and you've worked on at least a couple yourself. These sorts of tools should already be reasonably well understood and not too difficult to write or "borrow" from other sources. -- Brad Knowles, "They that can give up essential liberty to obtain a little temporary safety deserve neither liberty nor safety." -Benjamin Franklin, Historical Review of Pennsylvania. GCS/IT d+(-) s:+(++)>: a C++(+++)$ UMBSHI++++$ P+>++ L+ !E-(---) W+++(--) N+ !w--- O- M++ V PS++(+++) PE- Y+(++) PGP>+++ t+(+++) 5++(+++) X++(+++) R+(+++) tv+(+++) b+(++++) DI+(++++) D+(++) G+(++++) e++>++++ h--- r---(+++)* z(+++) From brad.knowles at skynet.be Wed Oct 29 23:26:40 2003 From: brad.knowles at skynet.be (Brad Knowles) Date: Wed Oct 29 23:27:09 2003 Subject: [Mailman-Developers] Requirements for a new archiver In-Reply-To: <1067486474.5295.32.camel@anthem> References: <3F9D08F9.6090209@student.umist.ac.uk> <1067285190.16085.79.camel@localhost.localdomain> <87d6cfdhjr.fsf@athene.jamux.com> <7B342705-0A47-11D8-A2E1-0003934516A8@plaidworks.com> <3147.1067460073@kanga.nu> <664305B3-0A54-11D8-BABB-0003934516A8@plaidworks.com> <24751.1067464496@kanga.nu> <1067486474.5295.32.camel@anthem> Message-ID: At 11:01 PM -0500 2003/10/29, Barry Warsaw wrote: > Yep, and I really really want Mailman 3 to take this concept farther. > Some things that I think will help include, using Twisted to eliminate > the /requirement/ of Apache integration and possibly the incoming mail > server integration, as well as implement a bulk mailer to eliminate the > need for an outgoing mail server. There, I have to disagree. Both the web server and the mail server issues are complex enough that I don't believe it would be a good idea to try and re-invent this wheel. There are already enough bad web server and mail server implementations out there -- we don't need to make this situation worse. 
There may be some mailing-list specific issues that we can (and should) handle better inside mailman before we hand these things off to the other servers, but both Apache and postfix/sendmail/exim have enough experience and world-wide testing behind them to make it little else than folly resulting from hubris to try and replace them. There's just no substitute for having hundreds of millions of people world-wide pounding on these things day-in and day-out 365 days a year. Components like this should be scheduled for replacement if, and only if, you can demonstrate beyond a reasonable doubt that there are inherent problems that are insurmountable otherwise, and there is no feasible alternative. You don't just take a Tom Mix pocket knife and cut open your own chest and remove your heart, to replace it with a mechanical pump that you designed yourself out of a tin can, a turkey baster, some baling wire, and some garden hose. If you absolutely require a heart transplant and there are no human alternatives, you get a world-respected heart surgeon to perform the operation using the latest techniques and the Jarvik 9 (or whatever). And then you get everyone in your family, all your friends, all your neighbors, all your church members, and hopefully all religious people world-wide to pray for you. -- Brad Knowles, "They that can give up essential liberty to obtain a little temporary safety deserve neither liberty nor safety." -Benjamin Franklin, Historical Review of Pennsylvania. 
GCS/IT d+(-) s:+(++)>: a C++(+++)$ UMBSHI++++$ P+>++ L+ !E-(---) W+++(--) N+ !w--- O- M++ V PS++(+++) PE- Y+(++) PGP>+++ t+(+++) 5++(+++) X++(+++) R+(+++) tv+(+++) b+(++++) DI+(++++) D+(++) G+(++++) e++>++++ h--- r---(+++)* z(+++) From barry at python.org Wed Oct 29 23:27:46 2003 From: barry at python.org (Barry Warsaw) Date: Wed Oct 29 23:27:55 2003 Subject: [Mailman-Developers] Requirements for a new archiver In-Reply-To: <0FAEFA89-0A86-11D8-9559-0003934516A8@plaidworks.com> References: <1067285190.16085.79.camel@localhost.localdomain> <87d6cfdhjr.fsf@athene.jamux.com> <20031029195420.GI24088@lenin.nu> <20031029203707.GK24088@lenin.nu> <20031029215906.GM24088@lenin.nu> <11079.1067465440@kanga.nu> <20031029222858.GN24088@lenin.nu> <2267.1067480552@kanga.nu> <0FAEFA89-0A86-11D8-9559-0003934516A8@plaidworks.com> Message-ID: <1067488066.5295.47.camel@anthem> On Wed, 2003-10-29 at 22:06, Chuq Von Rospach wrote: > It's just so much less hassle on any number of levels dealing with 50 > 100 megabyte files than it is a directory structure with 500 megabytes > of messages spread around 100,000 individual files. whether it's > backups and restores, migrating data to a new server, etc, etc etc, you > make life much simpler. And god help you if you're updating that > structure when the system crashes and you have to fsck and put it back > together again. We should just throw everything into a ZODB FileStorage Data.fs file, and let it grow to gigs in size <1/2 wink>. 
-Barry From brad.knowles at skynet.be Wed Oct 29 23:32:55 2003 From: brad.knowles at skynet.be (Brad Knowles) Date: Wed Oct 29 23:35:19 2003 Subject: [Mailman-Developers] Requirements for a new archiver In-Reply-To: <1067487492.5295.39.camel@anthem> References: <3F9D08F9.6090209@student.umist.ac.uk> <1067285190.16085.79.camel@localhost.localdomain> <87d6cfdhjr.fsf@athene.jamux.com> <20031029195420.GI24088@lenin.nu> <20031029203707.GK24088@lenin.nu> <1067487492.5295.39.camel@anthem> Message-ID: At 11:18 PM -0500 2003/10/29, Barry Warsaw wrote: > Just to be dense, let me ask for clarification: by "message body" you > mean the entire original message, as received on the wire, not just the > message payload (i.e. sans RFC 2822 headers). If so, I agree > completely. Yes, you are correct. At issue is that there might be some headers which some users might wish to search on (or maybe just see) which might not be put into one or more of the fields, and you don't want to take the risk of losing those by assuming that you can always re-generate all the headers from what you've stored inside the database. > But I also think the decoded message should be stored on the file system > somehow as well. I.e. decode attachments and store then as separate > files too. My experience is that this is a bad idea. However, if the implementation is fully modularized at the API level, then we can always rip out the mailman solution and instead put in something that actually works. -- Brad Knowles, "They that can give up essential liberty to obtain a little temporary safety deserve neither liberty nor safety." -Benjamin Franklin, Historical Review of Pennsylvania. GCS/IT d+(-) s:+(++)>: a C++(+++)$ UMBSHI++++$ P+>++ L+ !E-(---) W+++(--) N+ !w--- O- M++ V PS++(+++) PE- Y+(++) PGP>+++ t+(+++) 5++(+++) X++(+++) R+(+++) tv+(+++) b+(++++) DI+(++++) D+(++) G+(++++) e++>++++ h--- r---(+++)* z(+++) From spacey-mailman at lenin.nu Wed Oct 29 23:35:14 2003 From: spacey-mailman at lenin.nu (Peter C. 
Norton) Date: Wed Oct 29 23:35:24 2003 Subject: [Mailman-Developers] Requirements for a new archiver In-Reply-To: References: <3F9D08F9.6090209@student.umist.ac.uk> <1067285190.16085.79.camel@localhost.localdomain> <87d6cfdhjr.fsf@athene.jamux.com> <7B342705-0A47-11D8-A2E1-0003934516A8@plaidworks.com> <1067485679.5295.17.camel@anthem> Message-ID: <20031030043514.GP24088@lenin.nu> On Thu, Oct 30, 2003 at 05:00:48AM +0100, Brad Knowles wrote: > SGI's XFS on Irix does a pretty good job, with hashed directory > structures, and an extent-based journaling filesystem. Regretfully, > I don't think that all of these features are fully supported under > the Linux version of XFS, and that work has basically ground to a > halt with the lay-offs of all the key SGI people who had been working > on XFS. Veritas VxFS also does a good job in this area. [ A cursory google search indicates that hashed dirs, extents, and journalling are all in linux xfs. I can't imagine an unsupported feature making its way into the filesystem that SGI is putting on its latest and greatest systems, but if you know about this, please share ] In the case of a one-file-per-message approach, my experience with vxfs is that it creates a rather slow filesystem when you get your filesystem to the point of having a few hundred thousand small files (lots of wasted space in the extents and I believe, though I may be wrong, that there were lots of metadata lookups through multiple layers of indirections slowing things down). However reiserfs was built to handle a mix of lots of small files, ala maildir or mh spools. I'm not too current on current bsd goings-on, but I'd bet that ffs2 has something to offer in this arena, too, since it looks like it almost does extent-based allocation now. 
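Where the filesystem does not hash directories itself, a one-file-per-message store can fan its files out over nested subdirectories so that no single directory grows huge. The sketch below is one hypothetical way to do that in Python; the function names, the SHA-1 digest of the Message-ID and the two-level, two-character layout are all illustrative assumptions, not anything Mailman or the posters specified.

```python
import hashlib
from pathlib import Path


def hashed_path(root, message_id, levels=2, width=2):
    """Map a message id to root/ab/cd/<digest>, using the digest's leading hex digits."""
    digest = hashlib.sha1(message_id.encode("utf-8")).hexdigest()
    parts = [digest[i * width:(i + 1) * width] for i in range(levels)]
    return Path(root).joinpath(*parts, digest)


def store_message(root, message_id, raw):
    """Write one raw message to its hashed location, creating directories as needed."""
    path = hashed_path(root, message_id)
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_bytes(raw)
    return path
```

With the defaults this gives 256 * 256 = 65,536 leaf directories, so even a few million messages average only a few dozen files per directory. Whether that is faster than leaving the job to the filesystem depends on the filesystem in question.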
> Kirk McKusick and Eric Allman agree with you that this is a > proper filesystem problem that should be solved at the filesystem > level (at least, that's what they've said to me when I brought this > issue up to them), and they feel you should not attempt to solve > filesystem problems with "tricks" like INN timecaf/timehash cycbufs. Err... then to relate this to a prior post, why not just use maildirs on filesystems that are engineered to handle that sort of thing? > However, while that's nice in theory, that doesn't necessarily > help us here in the real world. Unless you are using a filesystem that works for this, right? Like xfs, vxfs, reiserfs, and probably ffs2. I believe that linux's ext3 has support for hashing directories (or soon will - I don't precisely know as I've been focusing on other things) -Peter -- The 5 year plan: In five years we'll make up another plan. Or just re-use this one. From chuqui at plaidworks.com Wed Oct 29 23:36:08 2003 From: chuqui at plaidworks.com (Chuq Von Rospach) Date: Wed Oct 29 23:37:27 2003 Subject: [Mailman-Developers] Requirements for a new archiver In-Reply-To: References: <3F9D08F9.6090209@student.umist.ac.uk> <1067285190.16085.79.camel@localhost.localdomain> <87d6cfdhjr.fsf@athene.jamux.com> <7B342705-0A47-11D8-A2E1-0003934516A8@plaidworks.com> <3147.1067460073@kanga.nu> <664305B3-0A54-11D8-BABB-0003934516A8@plaidworks.com> <24751.1067464496@kanga.nu> <1067486474.5295.32.camel@anthem> Message-ID: <949F2B32-0A92-11D8-A49B-0003934516A8@plaidworks.com> On Oct 29, 2003, at 8:26 PM, Brad Knowles wrote: > > There may be some mailing-list specific issues that we can (and > should) handle better inside mailman before we hand these things off > to the other servers, but both Apache and postfix/sendmail/exim have > enough experience and world-wide testing behind them to make it little > else than folly resulting from hubris to try and replace them. > +1 I've experimented with direct-out-the-pipe delivery systems. 
Trust me, you don't want to go there. It's not trivial. Well, it's trivial for 90% of the world that follows the RFCs and behaves as expected and has the right DNS setups and isn't trying to outsmart spammers by being stupid. and you'll spend the other 90% of your time trying to build compatibility in with the other 10%. From chuqui at plaidworks.com Wed Oct 29 23:37:49 2003 From: chuqui at plaidworks.com (Chuq Von Rospach) Date: Wed Oct 29 23:38:32 2003 Subject: [Mailman-Developers] Requirements for a new archiver In-Reply-To: <1067488066.5295.47.camel@anthem> References: <1067285190.16085.79.camel@localhost.localdomain> <87d6cfdhjr.fsf@athene.jamux.com> <20031029195420.GI24088@lenin.nu> <20031029203707.GK24088@lenin.nu> <20031029215906.GM24088@lenin.nu> <11079.1067465440@kanga.nu> <20031029222858.GN24088@lenin.nu> <2267.1067480552@kanga.nu> <0FAEFA89-0A86-11D8-9559-0003934516A8@plaidworks.com> <1067488066.5295.47.camel@anthem> Message-ID: On Oct 29, 2003, at 8:27 PM, Barry Warsaw wrote: > We should just throw everything into a ZODB FileStorage Data.fs file, > and let it grow to gigs in size <1/2 wink>. > until you have to split it across two disks because one is full. and don't forget, a single monolithic storage file gets backed up fully every time you change it. The guy in charge of buying tapes to back up your system just screamed in agony, since there's no possibility of an incremental backup for what is 99.9999999% static data. 
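The backup cost described above can be made concrete with a small sketch. It assumes a simple file-level incremental backup that selects whole files by modification time; real backup tools differ, and the function names here are hypothetical.

```python
import os


def changed_since(root, last_backup_time):
    """Return the files under root modified after last_backup_time."""
    changed = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if os.path.getmtime(path) > last_backup_time:
                changed.append(path)
    return sorted(changed)


def bytes_to_back_up(paths):
    """Total size of the files an incremental run would have to copy."""
    return sum(os.path.getsize(path) for path in paths)
```

Under this model, appending one message to a single large storage file marks the entire file as changed, so the whole file is copied again. With one file per message, only the new file is selected.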
From chuqui at plaidworks.com Wed Oct 29 23:40:02 2003 From: chuqui at plaidworks.com (Chuq Von Rospach) Date: Wed Oct 29 23:40:39 2003 Subject: [Mailman-Developers] Requirements for a new archiver In-Reply-To: <20031030043514.GP24088@lenin.nu> References: <3F9D08F9.6090209@student.umist.ac.uk> <1067285190.16085.79.camel@localhost.localdomain> <87d6cfdhjr.fsf@athene.jamux.com> <7B342705-0A47-11D8-A2E1-0003934516A8@plaidworks.com> <1067485679.5295.17.camel@anthem> <20031030043514.GP24088@lenin.nu> Message-ID: <20759E24-0A93-11D8-A49B-0003934516A8@plaidworks.com> And windows? And older hardware? Solaris 8? Hell, solaris 6 and 7? You going to depend on people only running year-old-or-less hardware and OS? On Oct 29, 2003, at 8:35 PM, Peter C. Norton wrote: > I'm not too current on current bsd going-ons, but I'd bet that ffs2 > has something to offer in this arena, too, since it looks like it > almost does extent-based allocation now. From barry at python.org Wed Oct 29 23:41:37 2003 From: barry at python.org (Barry Warsaw) Date: Wed Oct 29 23:41:47 2003 Subject: [Mailman-Developers] Requirements for a new archiver In-Reply-To: <13955.1067487452@kanga.nu> References: <3F9D08F9.6090209@student.umist.ac.uk> <1067285190.16085.79.camel@localhost.localdomain> <87d6cfdhjr.fsf@athene.jamux.com> <7B342705-0A47-11D8-A2E1-0003934516A8@plaidworks.com> <3147.1067460073@kanga.nu> <664305B3-0A54-11D8-BABB-0003934516A8@plaidworks.com> <24751.1067464496@kanga.nu> <1067486474.5295.32.camel@anthem> <13955.1067487452@kanga.nu> Message-ID: <1067488897.5295.58.camel@anthem> On Wed, 2003-10-29 at 23:17, J C Lawrence wrote: > I'm still debating if I can run down there on the 8th. I'd love to go > to EuroQuest, but I also really need to be in Providence on the 7th, and > back at work on the 9th. Aaaarrrgh. _IF_ I can make it we must go hit > a pub with whiteboards in hand. Sounds great. Bring a laptop and we'll bang out some code (anyone else up for a mini-Mailman-3 sprint at my house? :). 
I'll probably be heading to Fedex Field on the 9th for a 'Skins game, so the 8th would be perfect. > > ...as well as implement a bulk mailer to eliminate the need for an > > outgoing mail server. > > Eeeek! I trust this would be for immediate handoff to a "real" MTA > versus handling final delivery directly? Quite the Pandora's box if > not. Yep, which makes me nervous, but which does have a certain standalone-ability appeal. I don't want to write it off, and of course, we'll have an interface for this so the first (only?) implementation will be MTA hand-off. > BTW Whatever happened to Michel Pelletier's interfaces PEP? I see the > draft, and I see signs that something got done, but not what... Dead in the water AFAIK. But there are lots of folks using a more formal interface system for Python applications, such as for Zope3. Just writing the interface down, with good docstrings, goes a long way. > Last I heard Nigel was still running screaming into the hills. Hey, I love Exim -- Greg's done some very cool stuff with it on mail.{python,zope}.org. But man, I find it hard to track down just the right knob I need to tweak. :) > Aye, that's something the Plone folk have been digging at with some > success: a base library of waffle-stomp configuration patterns. I'm not > sure for Mailman if we want just a picklist, or a very simple wizard. I haven't even thought about how to surface it in the u/i -- it's mostly machinery right now. But yeah, a wizard is just the ticket, at least for canned styles (which again, will solve 80% of the problem). Which reminds me -- I'm really hoping we can get some web u/i jockies and CSS geeks in to eventually make things real purty. Dammit Jim, I'm a musician, not a graphic artist. 
:) -Barry From claw at kanga.nu Wed Oct 29 23:43:55 2003 From: claw at kanga.nu (J C Lawrence) Date: Wed Oct 29 23:44:00 2003 Subject: [Mailman-Developers] Requirements for a new archiver In-Reply-To: Message from Brad Knowles of "Thu, 30 Oct 2003 05:15:58 +0100." References: <1067285190.16085.79.camel@localhost.localdomain> <87d6cfdhjr.fsf@athene.jamux.com> <20031029195420.GI24088@lenin.nu> <20031029203707.GK24088@lenin.nu> <20031029215906.GM24088@lenin.nu> <11079.1067465440@kanga.nu> <20031029222858.GN24088@lenin.nu> <2267.1067480552@kanga.nu> <10134.1067484470@kanga.nu> <12670.1067486461@kanga.nu> Message-ID: <16067.1067489035@kanga.nu> On Thu, 30 Oct 2003 05:15:58 +0100 Brad Knowles wrote: > At 11:01 PM -0500 2003/10/29, J C Lawrence wrote: >> With a write-once system you don't actually need to ever move >> anything. > Depends on how you manage the storage of those large files. If you > have an infinitely large filesystem that is guaranteed 100% reliable > in all possible circumstances, you're right. Otherwise, you might > find that the filesystem is getting full and things need to be moved > around, or you suffer a disk or storage system crash and you have to > restore from backups, or you use an HSM solution to move older files > to slower/higher capacity storage, or you have issues with too many > large files in a single directory and need to implement your own > directory hashing scheme, etc.... True, but most of those really end up being a meta-indexing problem. You have many big files. You have indexes which point into those many big files. Occasionally you move those big files about, so your meta-indexes need to be changed to point to the new locations of the big files, but the same offsets within the big files... It's really not an expensive or difficult space.
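The meta-indexing idea above can be sketched as two small tables: one mapping each message key to a (blob id, offset, length) triple, and one mapping each blob id to the big file's current path. This is only an illustration under those assumptions; the class and method names are hypothetical and nothing here is taken from Mailman or any existing archiver.

```python
class MetaIndex:
    """Two-level index: message key -> blob id -> current file path."""

    def __init__(self):
        self.locations = {}  # blob id -> current path of the big file
        self.messages = {}   # message key -> (blob id, offset, length)

    def add_blob(self, blob_id, path):
        self.locations[blob_id] = path

    def add_message(self, key, blob_id, offset, length):
        self.messages[key] = (blob_id, offset, length)

    def move_blob(self, blob_id, new_path):
        # Only the location table changes; per-message offsets stay valid.
        self.locations[blob_id] = new_path

    def locate(self, key):
        blob_id, offset, length = self.messages[key]
        return self.locations[blob_id], offset, length

    def read(self, key):
        path, offset, length = self.locate(key)
        with open(path, "rb") as fp:
            fp.seek(offset)
            return fp.read(length)
```

Moving a big file to another disk then costs one update in `locations`, regardless of how many messages the file holds.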
If you really need to move individual messages about between file blobs at a respectable rate, then you're in another world of pain, but we don't have any evidence of that requirement, or that such a requirement can't be handled by simply unrolling the big file and respooling the individual messages onto the ends of other big files in different locations. >> Not really. The percentage of such deleted posts over the lifetime >> of the store can be generally assumed to be less than 1 in 10^5, and >> is probably considerably lower, if not in the 1:10^8 range. Add a >> simple invalid key semantic and you're done. > It depends on whether or not the court order allows you to just mark > things as "deleted" and be done with it. If they force you to > actually expunge all copies of that data from your systems, you will > have to do more work. Ahem.

    for key in list_of_bad_message_keys:
        big_file, offset, length = get_message_big_file(key)
        # Open for in-place update; a plain open() would be read-only.
        handle = open(big_file, 'r+b')
        handle.seek(offset)
        # Overwrite the message bytes with spaces of the same length.
        handle.write(b' ' * length)
        handle.close()
        key.invalidate()

Not a whole lot more complexity. You're just invalidating the pointed-to data as well as the key. You're still not doing free space management. -- J C Lawrence ---------(*) Satan, oscillate my metallic sonatas. claw@kanga.nu He lived as a devil, eh? http://www.kanga.nu/~claw/ Evil is a name of a foeman, as I live. From claw at kanga.nu Wed Oct 29 23:46:20 2003 From: claw at kanga.nu (J C Lawrence) Date: Wed Oct 29 23:46:35 2003 Subject: [Mailman-Developers] Requirements for a new archiver In-Reply-To: Message from Barry Warsaw of "Wed, 29 Oct 2003 23:27:46 EST."
<1067488066.5295.47.camel@anthem> References: <1067285190.16085.79.camel@localhost.localdomain> <87d6cfdhjr.fsf@athene.jamux.com> <20031029195420.GI24088@lenin.nu> <20031029203707.GK24088@lenin.nu> <20031029215906.GM24088@lenin.nu> <11079.1067465440@kanga.nu> <20031029222858.GN24088@lenin.nu> <2267.1067480552@kanga.nu> <0FAEFA89-0A86-11D8-9559-0003934516A8@plaidworks.com> <1067488066.5295.47.camel@anthem> Message-ID: <16253.1067489180@kanga.nu> On Wed, 29 Oct 2003 23:27:46 -0500 Barry Warsaw wrote: > On Wed, 2003-10-29 at 22:06, Chuq Von Rospach wrote: > We should just throw everything into a ZODB FileStorage Data.fs file, > and let it grow to gigs in size <1/2 wink>. There are good reasons I use DirectoryStorage: $ find /var/lib/zope/instance/default/var/Data_fs_dir -type f | wc -l 499266 Lotsa little teensy files! -- J C Lawrence ---------(*) Satan, oscillate my metallic sonatas. claw@kanga.nu He lived as a devil, eh? http://www.kanga.nu/~claw/ Evil is a name of a foeman, as I live. From barry at python.org Wed Oct 29 23:50:22 2003 From: barry at python.org (Barry Warsaw) Date: Wed Oct 29 23:50:43 2003 Subject: [Mailman-Developers] Requirements for a new archiver In-Reply-To: References: <3F9D08F9.6090209@student.umist.ac.uk> <1067285190.16085.79.camel@localhost.localdomain> <87d6cfdhjr.fsf@athene.jamux.com> <7B342705-0A47-11D8-A2E1-0003934516A8@plaidworks.com> <3147.1067460073@kanga.nu> <664305B3-0A54-11D8-BABB-0003934516A8@plaidworks.com> <24751.1067464496@kanga.nu> <1067486474.5295.32.camel@anthem> Message-ID: <1067489421.5295.67.camel@anthem> On Wed, 2003-10-29 at 23:26, Brad Knowles wrote: > There, I have to disagree. Both the web server and the mail > server issues are complex enough that I don't believe it would be a > good idea to try and re-invent this wheel. There are already enough > bad web server and mail server implementations out there -- we don't > need to make this situation worse. 
Let's not discount the integration problems, which are a huge headache for newbies. I'm fairly certain that Twisted is the right approach for surfacing the web u/i to Mailman. The requirements are not overwhelming and fronting Mailman's u/i with Apache really doesn't buy us that much. We all agree that CGI sucks, and we could make that better with mod_python or some other such glue, but why go to the trouble? Relying on Twisted for the incoming mail protocols is something I'm less certain about, although there is a lot of appeal to this approach. We could throw lots of smarts into a Python port-25 listener, including global spam fighting and bounce processing. An approach like Exim + elspy affords some really cool possibilities. A bigger negative is that there's less precedent for proxying smtpd than there is for httpd, so it's harder to fit Mailman into the mix with an existing mail server. -Barry From barry at python.org Wed Oct 29 23:53:19 2003 From: barry at python.org (Barry Warsaw) Date: Wed Oct 29 23:53:29 2003 Subject: [Mailman-Developers] Requirements for a new archiver In-Reply-To: <949F2B32-0A92-11D8-A49B-0003934516A8@plaidworks.com> References: <3F9D08F9.6090209@student.umist.ac.uk> <1067285190.16085.79.camel@localhost.localdomain> <87d6cfdhjr.fsf@athene.jamux.com> <7B342705-0A47-11D8-A2E1-0003934516A8@plaidworks.com> <3147.1067460073@kanga.nu> <664305B3-0A54-11D8-BABB-0003934516A8@plaidworks.com> <24751.1067464496@kanga.nu> <1067486474.5295.32.camel@anthem> <949F2B32-0A92-11D8-A49B-0003934516A8@plaidworks.com> Message-ID: <1067489599.5295.71.camel@anthem> On Wed, 2003-10-29 at 23:36, Chuq Von Rospach wrote: > I've experimented with direct-out-the-pipe delivery systems. Trust me, > you don't want to go there. It's not trivial. Well, it's trivial for > 90% of the world that follows the RFCs and behaves as expected and has > the right DNS setups and isn't trying to outsmart spammers by being > stupid.
and you'll spend the other 90% of your time trying to build > compatibility in with the other 10%. Chuq, do you think it would be feasible for Mailman to try to handle that 90% itself, and then only hand-off to a Real MTA when it runs into trouble with the other 10% -- assuming it could know when it runs into trouble. Also, there's incoming SMTP and outgoing SMTP. It may be possible to build in support for one direction without providing the other. (It also may not be worth it.) -Barry From claw at kanga.nu Wed Oct 29 23:55:34 2003 From: claw at kanga.nu (J C Lawrence) Date: Wed Oct 29 23:55:46 2003 Subject: [Mailman-Developers] Requirements for a new archiver In-Reply-To: Message from Chuq Von Rospach of "Wed, 29 Oct 2003 20:37:49 PST." References: <1067285190.16085.79.camel@localhost.localdomain> <87d6cfdhjr.fsf@athene.jamux.com> <20031029195420.GI24088@lenin.nu> <20031029203707.GK24088@lenin.nu> <20031029215906.GM24088@lenin.nu> <11079.1067465440@kanga.nu> <20031029222858.GN24088@lenin.nu> <2267.1067480552@kanga.nu> <0FAEFA89-0A86-11D8-9559-0003934516A8@plaidworks.com> <1067488066.5295.47.camel@anthem> Message-ID: <16933.1067489734@kanga.nu> On Wed, 29 Oct 2003 20:37:49 -0800 Chuq Von Rospach wrote: > On Oct 29, 2003, at 8:27 PM, Barry Warsaw wrote: > and don't forget, a single monolithic storage file gets backed up > fully every time you change it. The guy in charge of buying tapes to > back up your system just screamed in agony, since there's no > possibility of an incremental backup for what is 99.9999999% static > data. Ha! So just why do you think I moved off FileStorage for Data.fs? That said there's some value in getting a versioning data store with rollback support for list configs. The data volume isn't huge, but it is highly sensitive. I'd also like to see flat text logging of all configuration changes in addition to moderation activity. It would save the help and support desks a lot of hurt. 
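A flat-text log of configuration changes, as suggested above, could be as simple as one appended line per change. The sketch below is an assumption about what such a log might look like, not an existing Mailman feature; the function names and the line format are hypothetical, and `max_message_size` is used only as an example option name.

```python
import time


def format_change(when, listname, who, option, old, new):
    """Render one configuration change as a single greppable line."""
    stamp = time.strftime("%Y-%m-%d %H:%M:%S", time.gmtime(when))
    return "%s list=%s by=%s %s: %r -> %r" % (
        stamp, listname, who, option, old, new)


def log_change(logfile, listname, who, option, old, new, when=None):
    """Append the change to a plain-text log and return the line written."""
    if when is None:
        when = time.time()
    line = format_change(when, listname, who, option, old, new)
    with open(logfile, "a", encoding="utf-8") as fp:
        fp.write(line + "\n")
    return line
```

Because each entry records both the old and the new value, a support desk can answer "who changed this, and what was it before?" with grep, and the old value gives a rollback target even without a versioning store.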
-- J C Lawrence ---------(*) Satan, oscillate my metallic sonatas. claw@kanga.nu He lived as a devil, eh? http://www.kanga.nu/~claw/ Evil is a name of a foeman, as I live. From barry at python.org Wed Oct 29 23:57:03 2003 From: barry at python.org (Barry Warsaw) Date: Wed Oct 29 23:57:12 2003 Subject: [Mailman-Developers] Requirements for a new archiver In-Reply-To: <20759E24-0A93-11D8-A49B-0003934516A8@plaidworks.com> References: <3F9D08F9.6090209@student.umist.ac.uk> <1067285190.16085.79.camel@localhost.localdomain> <87d6cfdhjr.fsf@athene.jamux.com> <7B342705-0A47-11D8-A2E1-0003934516A8@plaidworks.com> <1067485679.5295.17.camel@anthem> <20031030043514.GP24088@lenin.nu> <20759E24-0A93-11D8-A49B-0003934516A8@plaidworks.com> Message-ID: <1067489822.5295.74.camel@anthem> On Wed, 2003-10-29 at 23:40, Chuq Von Rospach wrote: > And windows? Hey, ignoring Windows has been a successful strategy so far, why stop now? Plus, Longhorn will save us all, right? Oh, and Everything Will Be Faster Next Year Anyway. -Barry From brad.knowles at skynet.be Wed Oct 29 23:51:32 2003 From: brad.knowles at skynet.be (Brad Knowles) Date: Wed Oct 29 23:57:43 2003 Subject: [Mailman-Developers] Requirements for a new archiver In-Reply-To: <20031030043514.GP24088@lenin.nu> References: <3F9D08F9.6090209@student.umist.ac.uk> <1067285190.16085.79.camel@localhost.localdomain> <87d6cfdhjr.fsf@athene.jamux.com> <7B342705-0A47-11D8-A2E1-0003934516A8@plaidworks.com> <1067485679.5295.17.camel@anthem> <20031030043514.GP24088@lenin.nu> Message-ID: At 8:35 PM -0800 2003/10/29, Peter C. Norton wrote: > [ A cursory google search indicates that hashed dirs, extents, and > journalling are all in linux xfs. 
I can't imagine an unsupported > feature making its way into the filesystem that SGI is putting on its > latest and greatest systems, but if you know about this, please share ] My understanding is that the port of XFS to Linux was only about 70% done at the time the critical software engineers were laid off by SGI, and that no further work in this area has been done. Maybe the features are supposedly there but incomplete. > However reiserfs was built to handle a mix of lots of small files, ala > maildir or mh spools. I'm sorry, I don't trust ReiserFS at all. I'd trust XFS if it was on Irix, or IBMs JFS, but not ReiserFS. Hell, on a Linux system, I'd use ext2fs before I'd use Reiser. > I'm not too current on current bsd going-ons, but I'd bet that ffs2 > has something to offer in this arena, too, since it looks like it > almost does extent-based allocation now. No, not yet. There are improvements in the areas of handling synchronous meta-data updates, background fsck, etc... but nothing like extent-based filesystems or integrated hashed directory schemes, etc.... > Err... then to relate this to a prior post, why not just use maildirs > on filesystems that are engineered to handle that sort of thing? Because we can't guarantee that everyone (or anyone) would be willing/able to use the selected filesystems that we have blessed? You think requiring everyone to install PostgreSQL would be bad, do you really want to try to force them all to use ReiserFS on Linux as their only supported option? > Unless you are using a filesystem that works for this, right? Like > xfs, vxfs, reiserfs, and probably ffs2. I believe that linux's ext3 > has support for hashing directories (or soon will - I don't precisely > know as I've been focusing on other things) My understanding is that ext3fs is dead. The work that Stephen Tweedie had been doing stopped long ago, and even then it was only a minor tweak over ext2fs. 
I don't believe that this work has been picked up again or extended to include other features. -- Brad Knowles, "They that can give up essential liberty to obtain a little temporary safety deserve neither liberty nor safety." -Benjamin Franklin, Historical Review of Pennsylvania. GCS/IT d+(-) s:+(++)>: a C++(+++)$ UMBSHI++++$ P+>++ L+ !E-(---) W+++(--) N+ !w--- O- M++ V PS++(+++) PE- Y+(++) PGP>+++ t+(+++) 5++(+++) X++(+++) R+(+++) tv+(+++) b+(++++) DI+(++++) D+(++) G+(++++) e++>++++ h--- r---(+++)* z(+++) From brad.knowles at skynet.be Wed Oct 29 23:56:19 2003 From: brad.knowles at skynet.be (Brad Knowles) Date: Wed Oct 29 23:57:52 2003 Subject: [Mailman-Developers] Requirements for a new archiver In-Reply-To: <949F2B32-0A92-11D8-A49B-0003934516A8@plaidworks.com> References: <3F9D08F9.6090209@student.umist.ac.uk> <1067285190.16085.79.camel@localhost.localdomain> <87d6cfdhjr.fsf@athene.jamux.com> <7B342705-0A47-11D8-A2E1-0003934516A8@plaidworks.com> <3147.1067460073@kanga.nu> <664305B3-0A54-11D8-BABB-0003934516A8@plaidworks.com> <24751.1067464496@kanga.nu> <1067486474.5295.32.camel@anthem> <949F2B32-0A92-11D8-A49B-0003934516A8@plaidworks.com> Message-ID: At 8:36 PM -0800 2003/10/29, Chuq Von Rospach wrote: > I've experimented with direct-out-the-pipe delivery systems. Trust me, > you don't want to go there. It's not trivial. Well, it's trivial for > 90% of the world that follows the RFCs and behaves as expected and has > the right DNS setups and isn't trying to outsmart spammers by being > stupid. and you'll spend the other 90% of your time trying to build > compatibility in with the other 10%. You'd have the same sorts of problems if you added your own interpretation of the MIME bodyparts and stored the attachments separately, and then tried to re-integrate everything on transmission. 
Indeed, you'd have a whole host of additional problems you'd add because not only would you be trying to format everything on output so that everyone would like what you send, you'd be assuming that you can always correctly parse your inputs and correctly handle the results. IMO, you're much better just storing exactly what you got, and then sending exactly what you stored when the time comes. Any misunderstandings are therefore the fault of the sender or recipient, and not the result of anything you added to the complexity mix. Or have I missed something here? -- Brad Knowles, "They that can give up essential liberty to obtain a little temporary safety deserve neither liberty nor safety." -Benjamin Franklin, Historical Review of Pennsylvania. GCS/IT d+(-) s:+(++)>: a C++(+++)$ UMBSHI++++$ P+>++ L+ !E-(---) W+++(--) N+ !w--- O- M++ V PS++(+++) PE- Y+(++) PGP>+++ t+(+++) 5++(+++) X++(+++) R+(+++) tv+(+++) b+(++++) DI+(++++) D+(++) G+(++++) e++>++++ h--- r---(+++)* z(+++) From chuqui at plaidworks.com Wed Oct 29 23:59:56 2003 From: chuqui at plaidworks.com (Chuq Von Rospach) Date: Thu Oct 30 00:00:29 2003 Subject: [Mailman-Developers] Requirements for a new archiver In-Reply-To: <1067489599.5295.71.camel@anthem> References: <3F9D08F9.6090209@student.umist.ac.uk> <1067285190.16085.79.camel@localhost.localdomain> <87d6cfdhjr.fsf@athene.jamux.com> <7B342705-0A47-11D8-A2E1-0003934516A8@plaidworks.com> <3147.1067460073@kanga.nu> <664305B3-0A54-11D8-BABB-0003934516A8@plaidworks.com> <24751.1067464496@kanga.nu> <1067486474.5295.32.camel@anthem> <949F2B32-0A92-11D8-A49B-0003934516A8@plaidworks.com> <1067489599.5295.71.camel@anthem> Message-ID: On Oct 29, 2003, at 8:53 PM, Barry Warsaw wrote: > Chuq, do you think it would be feasible for Mailman to try to handle > that 90% itself, and then only hand-off to a Real MTA when it runs into > trouble with the other 10% -- assuming it could know when it runs into > trouble. 
> I think you have enough on your plate to not re-invent what others have already done pretty well. When you run out of features to implement, then think about this. Not until. From barry at python.org Thu Oct 30 00:00:40 2003 From: barry at python.org (Barry Warsaw) Date: Thu Oct 30 00:00:49 2003 Subject: [Mailman-Developers] Requirements for a new archiver In-Reply-To: References: <1067285190.16085.79.camel@localhost.localdomain> <87d6cfdhjr.fsf@athene.jamux.com> <20031029195420.GI24088@lenin.nu> <20031029203707.GK24088@lenin.nu> <20031029215906.GM24088@lenin.nu> <11079.1067465440@kanga.nu> <20031029222858.GN24088@lenin.nu> <2267.1067480552@kanga.nu> <0FAEFA89-0A86-11D8-9559-0003934516A8@plaidworks.com> <1067488066.5295.47.camel@anthem> Message-ID: <1067490040.5295.79.camel@anthem> On Wed, 2003-10-29 at 23:37, Chuq Von Rospach wrote: > > until you have to split it across two disks because one is full. > > and don't forget, a single monolithic storage file gets backed up fully > every time you change it. The guy in charge of buying tapes to back up > your system just screamed in agony, since there's no possibility of an > incremental backup for what is 99.9999999% static data. > Actually, newer versions of ZODB have a script called repozo.py which makes incremental backups feasible. It knows a lot about FileStorage's formats. Also note that there are alternative storage implementations such as BerkeleyDB-based storage (slow, but presumably more reliable) and the 3rd party DirectoryStorage. We'll talk about databases in another thread. I have my own biases, but I'm too tired now to get into it. 
-Barry From barry at python.org Thu Oct 30 00:01:33 2003 From: barry at python.org (Barry Warsaw) Date: Thu Oct 30 00:01:41 2003 Subject: [Mailman-Developers] Requirements for a new archiver In-Reply-To: <16933.1067489734@kanga.nu> References: <1067285190.16085.79.camel@localhost.localdomain> <87d6cfdhjr.fsf@athene.jamux.com> <20031029195420.GI24088@lenin.nu> <20031029203707.GK24088@lenin.nu> <20031029215906.GM24088@lenin.nu> <11079.1067465440@kanga.nu> <20031029222858.GN24088@lenin.nu> <2267.1067480552@kanga.nu> <0FAEFA89-0A86-11D8-9559-0003934516A8@plaidworks.com> <1067488066.5295.47.camel@anthem> <16933.1067489734@kanga.nu> Message-ID: <1067490092.5295.81.camel@anthem> On Wed, 2003-10-29 at 23:55, J C Lawrence wrote: > That said there's some value in getting a versioning data store with > rollback support for list configs. +1 > The data volume isn't huge, but it > is highly sensitive. I'd also like to see flat text logging of all > configuration changes in addition to moderation activity. +1 -Barry From barry at python.org Thu Oct 30 00:02:35 2003 From: barry at python.org (Barry Warsaw) Date: Thu Oct 30 00:02:42 2003 Subject: [Mailman-Developers] Requirements for a new archiver In-Reply-To: <16253.1067489180@kanga.nu> References: <1067285190.16085.79.camel@localhost.localdomain> <87d6cfdhjr.fsf@athene.jamux.com> <20031029195420.GI24088@lenin.nu> <20031029203707.GK24088@lenin.nu> <20031029215906.GM24088@lenin.nu> <11079.1067465440@kanga.nu> <20031029222858.GN24088@lenin.nu> <2267.1067480552@kanga.nu> <0FAEFA89-0A86-11D8-9559-0003934516A8@plaidworks.com> <1067488066.5295.47.camel@anthem> <16253.1067489180@kanga.nu> Message-ID: <1067490154.5295.83.camel@anthem> On Wed, 2003-10-29 at 23:46, J C Lawrence wrote: > There are good reasons I use DirectoryStorage: > > $ find /var/lib/zope/instance/default/var/Data_fs_dir -type f | wc -l > 499266 > > Lotsa little teensy files! 
:) -Barry From claw at kanga.nu Thu Oct 30 00:08:33 2003 From: claw at kanga.nu (J C Lawrence) Date: Thu Oct 30 00:08:38 2003 Subject: [Mailman-Developers] Requirements for a new archiver In-Reply-To: Message from Barry Warsaw of "Wed, 29 Oct 2003 23:50:22 EST." <1067489421.5295.67.camel@anthem> References: <3F9D08F9.6090209@student.umist.ac.uk> <1067285190.16085.79.camel@localhost.localdomain> <87d6cfdhjr.fsf@athene.jamux.com> <7B342705-0A47-11D8-A2E1-0003934516A8@plaidworks.com> <3147.1067460073@kanga.nu> <664305B3-0A54-11D8-BABB-0003934516A8@plaidworks.com> <24751.1067464496@kanga.nu> <1067486474.5295.32.camel@anthem> <1067489421.5295.67.camel@anthem> Message-ID: <18148.1067490513@kanga.nu> On Wed, 29 Oct 2003 23:50:22 -0500 Barry Warsaw wrote: > On Wed, 2003-10-29 at 23:26, Brad Knowles wrote: >> There, I have to disagree. Both the web server and the mail server >> issues are complex enough that I don't believe it would be a good >> idea to try and re-invent this wheel. There are already enough bad >> web server and mail server implementations out there -- we don't need >> to make this situation worse. > Let's not discount the integration problems, which are a huge headache > for newbies. I thought the prevalence of canned Mailman packages was doing a lot there? I haven't watched the -users list in a while. > I'm fairly certain that Twisted is the right approach for surfacing > the web u/i to Mailman. The requirements are not overwhelming and > fronting Mailman's u/i with Apache really doesn't buy us that much. Hang-on. Apache isn't the target. Mailman's UI is a CGI app. As such it works with any web server that supports CGI-bin, which pretty much means any web server with no exceptions. That's a pretty large gain, especially in the novice admin or simple deployment case territory. 
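The portability argument rests on how little a CGI program needs from the web server: request data arrives in environment variables, and the response is written to standard output. The following minimal sketch illustrates that contract; it is not Mailman's actual CGI code, and the `list` query parameter and the `render` function are hypothetical.

```python
import os
import sys
from html import escape
from urllib.parse import parse_qs


def render(environ):
    """Build a complete CGI response (headers, blank line, body)."""
    # The server passes the query string in the QUERY_STRING variable.
    query = parse_qs(environ.get("QUERY_STRING", ""))
    listname = query.get("list", ["(none)"])[0]
    body = "<html><body><p>List: %s</p></body></html>" % escape(listname)
    return "Content-Type: text/html; charset=utf-8\r\n\r\n" + body


if __name__ == "__main__":
    # Any CGI-capable server runs the script and relays its stdout.
    sys.stdout.write(render(os.environ))
```

Nothing in the script depends on a particular server, which is the common-denominator property being defended here. The price is one process start per request.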
Doing our own thing for HTTP handling can quickly be another Pandora's box, security concern, and integration problem for the (majority of) people who do want to run Apache/Boa/Thttpd/Zeus/etc. > We all agree that CGI sucks, and we could make that better with > mod_python or some other such glue, but why go to the trouble? CGI sucks yes, but it is the guaranteed common denominator, and CD counts for more than feature whiz-bang at this level. > Relying on Twisted for the incoming mail protocols is something I'm > less certain about, although there is a lot of appeal to this > approach. -1 Tarbaby, Pandora's box, security nightmare, unbounded security envelope. > We could throw lots of smarts into a Python port-25 listener, including > global spam fighting and bounce processing. You ___really___ don't want to get into your own SMTP-level bounce processing. Really. That's one huge endlessly sucking time sink. Let Philip Hazel, Wietse and the rest spend their time there. > An approach like Exim + elspy affords some really cool possibilities. Absolutely, but that is outside of Mailman's territory. More interesting would be things like TMDA integration, or implementing support for Yakov Shafranovich's extension of my consent token protocol: http://www.ietf.org/internet-drafts/draft-irtf-asrg-cri-00.txt Getting early buy-in as a sample implementation for an MLM wouldn't be a Bad Thing. There's a lot of really neat and useful integration and feature set territory to explore before you start staring down the MTA's throat. -- J C Lawrence ---------(*) Satan, oscillate my metallic sonatas. claw@kanga.nu He lived as a devil, eh? http://www.kanga.nu/~claw/ Evil is a name of a foeman, as I live.
From brad.knowles at skynet.be Wed Oct 29 23:59:48 2003 From: brad.knowles at skynet.be (Brad Knowles) Date: Thu Oct 30 00:09:13 2003 Subject: [Mailman-Developers] Requirements for a new archiver In-Reply-To: <16067.1067489035@kanga.nu> References: <1067285190.16085.79.camel@localhost.localdomain> <87d6cfdhjr.fsf@athene.jamux.com> <20031029195420.GI24088@lenin.nu> <20031029203707.GK24088@lenin.nu> <20031029215906.GM24088@lenin.nu> <11079.1067465440@kanga.nu> <20031029222858.GN24088@lenin.nu> <2267.1067480552@kanga.nu> <10134.1067484470@kanga.nu> <12670.1067486461@kanga.nu> <16067.1067489035@kanga.nu> Message-ID: At 11:43 PM -0500 2003/10/29, J C Lawrence wrote: > True, but most of those really end up being a meta-indexing problem. Fair enough. > Not a whole lot more complexity. You're just invalidating the > pointed-to data as well as the key. You're still not doing free space > management. What about your backups? And your off-site backups? And your mirror sites around the world? Any other copies of those files that might have been copied off somewhere else? -- Brad Knowles, "They that can give up essential liberty to obtain a little temporary safety deserve neither liberty nor safety." -Benjamin Franklin, Historical Review of Pennsylvania. 
GCS/IT d+(-) s:+(++)>: a C++(+++)$ UMBSHI++++$ P+>++ L+ !E-(---) W+++(--) N+ !w--- O- M++ V PS++(+++) PE- Y+(++) PGP>+++ t+(+++) 5++(+++) X++(+++) R+(+++) tv+(+++) b+(++++) DI+(++++) D+(++) G+(++++) e++>++++ h--- r---(+++)* z(+++) From brad.knowles at skynet.be Thu Oct 30 00:08:32 2003 From: brad.knowles at skynet.be (Brad Knowles) Date: Thu Oct 30 00:09:20 2003 Subject: [Mailman-Developers] Requirements for a new archiver In-Reply-To: <1067489599.5295.71.camel@anthem> References: <3F9D08F9.6090209@student.umist.ac.uk> <1067285190.16085.79.camel@localhost.localdomain> <87d6cfdhjr.fsf@athene.jamux.com> <7B342705-0A47-11D8-A2E1-0003934516A8@plaidworks.com> <3147.1067460073@kanga.nu> <664305B3-0A54-11D8-BABB-0003934516A8@plaidworks.com> <24751.1067464496@kanga.nu> <1067486474.5295.32.camel@anthem> <949F2B32-0A92-11D8-A49B-0003934516A8@plaidworks.com> <1067489599.5295.71.camel@anthem> Message-ID: At 11:53 PM -0500 2003/10/29, Barry Warsaw wrote: > Chuq, do you think it would be feasible for Mailman to try to handle > that 90% itself, and then only hand-off to a Real MTA when it runs into > trouble with the other 10% -- assuming it could know when it runs into > trouble. Bryan Costales and Eric Allman had this debate at InfoBeat/Mercury Mail. Bryan said that he could write a better "simple" MTA that could handle the easy 80% and leave the hard 20% to sendmail. Eric showed that he could improve sendmail to the point where it would perform at or near the level of performance of Bryan's code without throwing everything out, and would out-perform every other aspect of the system in question (so that the MTA was no longer the bottleneck at any stage). I'm confident that the same sort of approach is appropriate for other well-respected MTAs (e.g., postfix, and exim in my personal experience). > Also, there's incoming SMTP and outgoing SMTP. It may be possible to > build in support for one direction without providing the other. (It > also may not be worth it.) 
It's hard enough writing an incoming SMTP handler, and doing it right. Many large service providers have seriously screwed up when trying to do so (bigfoot anyone?), and others have only implemented half of the inbound solution (AOL), leaving the harder parts to standard programs like sendmail. Even then I argued violently against this approach at AOL, and felt that we could do a better job by leaving all the external interfacing/queueing issues to sendmail, and instead make the in-house developed code an LMTP Local Delivery Agent. I was over-ruled, primarily because we had already gone too far down the road that had been chosen for us. Note that none of the original Internet Mail Operations team members are left at AOL (almost all bugged out when the new mail server software came online), and I don't think any of the original Internet Mail Development team members are left, either. Bad Juju, Bwana. I've been down this road before. Trust me, you don't want to do this. -- Brad Knowles, "They that can give up essential liberty to obtain a little temporary safety deserve neither liberty nor safety." -Benjamin Franklin, Historical Review of Pennsylvania. GCS/IT d+(-) s:+(++)>: a C++(+++)$ UMBSHI++++$ P+>++ L+ !E-(---) W+++(--) N+ !w--- O- M++ V PS++(+++) PE- Y+(++) PGP>+++ t+(+++) 5++(+++) X++(+++) R+(+++) tv+(+++) b+(++++) DI+(++++) D+(++) G+(++++) e++>++++ h--- r---(+++)* z(+++) From claw at kanga.nu Thu Oct 30 00:10:08 2003 From: claw at kanga.nu (J C Lawrence) Date: Thu Oct 30 00:10:12 2003 Subject: [Mailman-Developers] Requirements for a new archiver In-Reply-To: Message from Chuq Von Rospach of "Wed, 29 Oct 2003 20:59:56 PST." 
References: <3F9D08F9.6090209@student.umist.ac.uk> <1067285190.16085.79.camel@localhost.localdomain> <87d6cfdhjr.fsf@athene.jamux.com> <7B342705-0A47-11D8-A2E1-0003934516A8@plaidworks.com> <3147.1067460073@kanga.nu> <664305B3-0A54-11D8-BABB-0003934516A8@plaidworks.com> <24751.1067464496@kanga.nu> <1067486474.5295.32.camel@anthem> <949F2B32-0A92-11D8-A49B-0003934516A8@plaidworks.com> <1067489599.5295.71.camel@anthem> Message-ID: <18294.1067490608@kanga.nu> On Wed, 29 Oct 2003 20:59:56 -0800 Chuq Von Rospach wrote: > On Oct 29, 2003, at 8:53 PM, Barry Warsaw wrote: >> Chuq, do you think it would be feasible for Mailman to try to handle >> that 90% itself, and then only hand-off to a Real MTA when it runs >> into trouble with the other 10% -- assuming it could know when it >> runs into trouble. > I think you have enough on your plate to not re-invent what others > have already done pretty well. When you run out of features to > implement, then think about this. Not until. Seconded, in spades. -- J C Lawrence ---------(*) Satan, oscillate my metallic sonatas. claw@kanga.nu He lived as a devil, eh? http://www.kanga.nu/~claw/ Evil is a name of a foeman, as I live. From chuqui at plaidworks.com Thu Oct 30 00:10:57 2003 From: chuqui at plaidworks.com (Chuq Von Rospach) Date: Thu Oct 30 00:11:30 2003 Subject: [Mailman-Developers] Requirements for a new archiver In-Reply-To: References: <1067285190.16085.79.camel@localhost.localdomain> <87d6cfdhjr.fsf@athene.jamux.com> <20031029195420.GI24088@lenin.nu> <20031029203707.GK24088@lenin.nu> <20031029215906.GM24088@lenin.nu> <11079.1067465440@kanga.nu> <20031029222858.GN24088@lenin.nu> <2267.1067480552@kanga.nu> <10134.1067484470@kanga.nu> <12670.1067486461@kanga.nu> <16067.1067489035@kanga.nu> Message-ID: <71FA49B6-0A97-11D8-AE6D-0003934516A8@plaidworks.com> engineering details. On Oct 29, 2003, at 8:59 PM, Brad Knowles wrote: > > What about your backups? And your off-site backups? And your mirror > sites around the world? 
Any other copies of those files that might > have been copied off somewhere else? From chuqui at plaidworks.com Thu Oct 30 00:16:06 2003 From: chuqui at plaidworks.com (Chuq Von Rospach) Date: Thu Oct 30 00:17:17 2003 Subject: [Mailman-Developers] Requirements for a new archiver In-Reply-To: References: <3F9D08F9.6090209@student.umist.ac.uk> <1067285190.16085.79.camel@localhost.localdomain> <87d6cfdhjr.fsf@athene.jamux.com> <7B342705-0A47-11D8-A2E1-0003934516A8@plaidworks.com> <3147.1067460073@kanga.nu> <664305B3-0A54-11D8-BABB-0003934516A8@plaidworks.com> <24751.1067464496@kanga.nu> <1067486474.5295.32.camel@anthem> <949F2B32-0A92-11D8-A49B-0003934516A8@plaidworks.com> <1067489599.5295.71.camel@anthem> Message-ID: <29DE2A2C-0A98-11D8-AE6D-0003934516A8@plaidworks.com> On Oct 29, 2003, at 9:08 PM, Brad Knowles wrote: > Bryan Costales and Eric Allman had this debate at InfoBeat/Mercury > Mail. Bryan said that he could write a better "simple" MTA that could > handle the easy 80% and leave the hard 20% to sendmail. There is no such thing as a simple MTA. This gets hairy quickly. Really quickly. You are much better off spending money on a good fast disk RAID (since the chances that you'll win the lottery are on par with the chances that your bottleneck is NOT disk I/O in mail sending) than on a programmer to try to build fast MTAs. > that none of the original Internet Mail Operations team members are > left at AOL (almost all bugged out when the new mail server software > came online), and I don't think any of the original Internet Mail > Development team members are left, either. > And boy, does it show. From claw at kanga.nu Thu Oct 30 00:21:52 2003 From: claw at kanga.nu (J C Lawrence) Date: Thu Oct 30 00:22:15 2003 Subject: [Mailman-Developers] Requirements for a new archiver In-Reply-To: Message from Brad Knowles of "Thu, 30 Oct 2003 05:59:48 +0100."
References: <1067285190.16085.79.camel@localhost.localdomain> <87d6cfdhjr.fsf@athene.jamux.com> <20031029195420.GI24088@lenin.nu> <20031029203707.GK24088@lenin.nu> <20031029215906.GM24088@lenin.nu> <11079.1067465440@kanga.nu> <20031029222858.GN24088@lenin.nu> <2267.1067480552@kanga.nu> <10134.1067484470@kanga.nu> <12670.1067486461@kanga.nu> <16067.1067489035@kanga.nu> Message-ID: <19296.1067491312@kanga.nu> On Thu, 30 Oct 2003 05:59:48 +0100 Brad Knowles wrote: > At 11:43 PM -0500 2003/10/29, J C Lawrence wrote: >> Not a whole lot more complexity. You're just invalidating the >> pointed-to data as well as the key. You're still not doing free >> space management. > What about your backups? And your off-site backups? And your mirror > sites around the world? Any other copies of those files that might > have been copied off somewhere else? I'm not going to touch the aspects of attempting to rewrite the data in backup sets without invalidating the backups. Uhh uhh. No deal. I'm also not going to touch the management of data that has been copied outside of the store's purview. It's no longer in the store's scope and so isn't really under discussion. I can run strings on my Oracle tables as well, but that really doesn't make the resulting data files part of Oracle's data-management model. At its core this is a snapshot issue. What you're really arguing for is the ability to revert, recover, or synchronise (they're all the same thing under the covers) the state of the store in a logically consistent fashion. As such you're interested in logical consistency for not just one Big File, but across files, and across the meta-indexes; logical consistency of the store as a whole. This really isn't a storage format problem. It's a transaction framing problem and a snapshotting problem (which is really just a transaction framing problem). You need to not only know the state of the data files, but the state of the meta-indexes, and that they are synchronised with each other.
This is not a trivial space, but it's also not an unknown space. File versioning systems have been messing here for years with change keys and signatures. Ultimately it comes down to a shared transaction key. The old AT&T SCCS papers are a particularly good read in this regard. -- J C Lawrence ---------(*) Satan, oscillate my metallic sonatas. claw@kanga.nu He lived as a devil, eh? http://www.kanga.nu/~claw/ Evil is a name of a foeman, as I live. From claw at kanga.nu Thu Oct 30 00:24:45 2003 From: claw at kanga.nu (J C Lawrence) Date: Thu Oct 30 00:24:49 2003 Subject: [Mailman-Developers] Requirements for a new archiver In-Reply-To: Message from Brad Knowles of "Thu, 30 Oct 2003 06:08:32 +0100." References: <3F9D08F9.6090209@student.umist.ac.uk> <1067285190.16085.79.camel@localhost.localdomain> <87d6cfdhjr.fsf@athene.jamux.com> <7B342705-0A47-11D8-A2E1-0003934516A8@plaidworks.com> <3147.1067460073@kanga.nu> <664305B3-0A54-11D8-BABB-0003934516A8@plaidworks.com> <24751.1067464496@kanga.nu> <1067486474.5295.32.camel@anthem> <949F2B32-0A92-11D8-A49B-0003934516A8@plaidworks.com> <1067489599.5295.71.camel@anthem> Message-ID: <19535.1067491485@kanga.nu> On Thu, 30 Oct 2003 06:08:32 +0100 Brad Knowles wrote: > At 11:53 PM -0500 2003/10/29, Barry Warsaw wrote: > Note that none of the original Internet Mail Operations team members > are left at AOL (almost all bugged out when the new mail server > software came online), and I don't think any of the original Internet > Mail Development team members are left, either. Eeek. Not fun. > I've been down this road before. Trust me, you don't want to do this. Barry, listen to this man. He speaks sooth. -- J C Lawrence ---------(*) Satan, oscillate my metallic sonatas. claw@kanga.nu He lived as a devil, eh? http://www.kanga.nu/~claw/ Evil is a name of a foeman, as I live.
From chuqui at plaidworks.com Thu Oct 30 00:31:51 2003 From: chuqui at plaidworks.com (Chuq Von Rospach) Date: Thu Oct 30 00:32:28 2003 Subject: [Mailman-Developers] Requirements for a new archiver In-Reply-To: <19296.1067491312@kanga.nu> References: <1067285190.16085.79.camel@localhost.localdomain> <87d6cfdhjr.fsf@athene.jamux.com> <20031029195420.GI24088@lenin.nu> <20031029203707.GK24088@lenin.nu> <20031029215906.GM24088@lenin.nu> <11079.1067465440@kanga.nu> <20031029222858.GN24088@lenin.nu> <2267.1067480552@kanga.nu> <10134.1067484470@kanga.nu> <12670.1067486461@kanga.nu> <16067.1067489035@kanga.nu> <19296.1067491312@kanga.nu> Message-ID: <5D459A0C-0A9A-11D8-AE6D-0003934516A8@plaidworks.com> On Oct 29, 2003, at 9:21 PM, J C Lawrence wrote: > This is not a trivial space, but it's also not an unknown space. File > versioning systems have been messing here for years with change keys > and signatures. Ultimately it comes down to a shared transaction key. > The old AT&T SCCS papers are a particularly good read in this regard. > How does this statement reconcile with Barry's not wanting to require MySQL or PostgreSQL for Mailman because he doesn't want to layer on too many dependencies to get Mailman running? We seem to be heading off into places where the answer is "if we're lucky, it'll run on that cluster of G5's at Uvirginia -- slowly". Unless Barry wants to throw his simplicity requirements out the window, we can't expect high performance filesystems, SANs, fiber optic RAID connects, or for that matter, linux over windows over sgi over solaris 2.5. This stuff that's floating around is great, if we were writing an enterprise-class, mega-bugger IS-supported system for a corporate data center. How's that all relate to Mailman, anyway? Maybe we should refocus and not wander down interesting but entirely philosophical ratholes?
From claw at kanga.nu Thu Oct 30 00:32:40 2003 From: claw at kanga.nu (J C Lawrence) Date: Thu Oct 30 00:32:44 2003 Subject: [Mailman-Developers] Requirements for a new archiver In-Reply-To: Message from Barry Warsaw of "Wed, 29 Oct 2003 23:41:37 EST." <1067488897.5295.58.camel@anthem> References: <3F9D08F9.6090209@student.umist.ac.uk> <1067285190.16085.79.camel@localhost.localdomain> <87d6cfdhjr.fsf@athene.jamux.com> <7B342705-0A47-11D8-A2E1-0003934516A8@plaidworks.com> <3147.1067460073@kanga.nu> <664305B3-0A54-11D8-BABB-0003934516A8@plaidworks.com> <24751.1067464496@kanga.nu> <1067486474.5295.32.camel@anthem> <13955.1067487452@kanga.nu> <1067488897.5295.58.camel@anthem> Message-ID: <20157.1067491960@kanga.nu> On Wed, 29 Oct 2003 23:41:37 -0500 Barry Warsaw wrote: > On Wed, 2003-10-29 at 23:17, J C Lawrence wrote: > Sounds great. Bring a laptop and we'll bang out some code (anyone > else up for a mini-Mailman-3 sprint at my house? :). I'll probably be > heading to Fedex Field on the 9th for a 'Skins game, so the 8th would > be perfect. Hopefully I know by this Sunday. Will see. >> Eeeek! I trust this would be for immediate handoff to a "real" MTA >> versus handling final delivery directly? Quite the Pandora's box if >> not. > Yep... In what way would this be different from the current SMTP delivery supports? >> BTW Whatever happened to Michel Pelletier's interfaces PEP? I see >> the draft, and I see signs that something got done, but not what... > Dead in the water AFAIK. Ahh. >> Last I heard Nigel was still running screaming into the hills. > Hey, I love Exim -- Greg's done some very cool stuff with it on > mail.{python,zope}.org. But man, I find it hard to track down just > the right knob I need to tweak. :) Hehn. I like Exim a lot, and compared to the competition the documentation is superb. I got a note after my "Mailman doesn't have enough config options" that he'd, err, had a somewhat explosive reaction. 
-- J C Lawrence ---------(*) Satan, oscillate my metallic sonatas. claw@kanga.nu He lived as a devil, eh? http://www.kanga.nu/~claw/ Evil is a name of a foeman, as I live. From claw at kanga.nu Thu Oct 30 00:40:59 2003 From: claw at kanga.nu (J C Lawrence) Date: Thu Oct 30 00:41:07 2003 Subject: [Mailman-Developers] Requirements for a new archiver In-Reply-To: Message from Chuq Von Rospach of "Wed, 29 Oct 2003 21:31:51 PST." <5D459A0C-0A9A-11D8-AE6D-0003934516A8@plaidworks.com> References: <1067285190.16085.79.camel@localhost.localdomain> <87d6cfdhjr.fsf@athene.jamux.com> <20031029195420.GI24088@lenin.nu> <20031029203707.GK24088@lenin.nu> <20031029215906.GM24088@lenin.nu> <11079.1067465440@kanga.nu> <20031029222858.GN24088@lenin.nu> <2267.1067480552@kanga.nu> <10134.1067484470@kanga.nu> <12670.1067486461@kanga.nu> <16067.1067489035@kanga.nu> <19296.1067491312@kanga.nu> <5D459A0C-0A9A-11D8-AE6D-0003934516A8@plaidworks.com> Message-ID: <20770.1067492459@kanga.nu> On Wed, 29 Oct 2003 21:31:51 -0800 Chuq Von Rospach wrote: > On Oct 29, 2003, at 9:21 PM, J C Lawrence wrote: > How's that all relate to Mailman, anyway? Maybe we should refocus and > not wander down interesting but entirely philosophical ratholes? Agreed, but then I've said my piece several times on those scores. We need a requirements definition for the abstractions for storage, indexing and presentation. I've already stated my bits there. So far there's been neither argument nor commentary, just a bunch of cross-purposes violent agreement between Brad and me. While I like a netnews model as it suits my needs, I really don't care what the store is so long as it solves the problems I've laid out. We need a priori key determination, a collision policy, key handoffs to an indexer (which could be NULL in Chuq's MySQL case), and an improved/adapted presentation layer.
I've already said my bits there and proposed what I see as the cheap, easy, incremental improvement course: Twisted's NNTP supports for storage, Message IDs for keys, a variant best-effort detection and rewriting policy for collisions, and a MeoWWW derivative for HTML presentation/posting. Counters? -- J C Lawrence ---------(*) Satan, oscillate my metallic sonatas. claw@kanga.nu He lived as a devil, eh? http://www.kanga.nu/~claw/ Evil is a name of a foeman, as I live. From claw at kanga.nu Thu Oct 30 00:49:49 2003 From: claw at kanga.nu (J C Lawrence) Date: Thu Oct 30 00:49:52 2003 Subject: [Mailman-Developers] Requirements for a new archiver In-Reply-To: Message from Brad Knowles of "Thu, 30 Oct 2003 05:00:48 +0100." References: <3F9D08F9.6090209@student.umist.ac.uk> <1067285190.16085.79.camel@localhost.localdomain> <87d6cfdhjr.fsf@athene.jamux.com> <7B342705-0A47-11D8-A2E1-0003934516A8@plaidworks.com> <1067485679.5295.17.camel@anthem> Message-ID: <21418.1067492989@kanga.nu> On Thu, 30 Oct 2003 05:00:48 +0100 Brad Knowles wrote: > At 10:47 PM -0500 2003/10/29, Barry Warsaw wrote: > SGIs XFS on Irix does a pretty good job, with hashed directory > structures, and an extent-based journaling filesystem. ReiserFS also does particularly well here. I haven't yet tested IBM's JFS. Last time I hit VxFS hard (back in the HP-UX 20.20 days) it really didn't like huge directories, but that may have changed since then. -- J C Lawrence ---------(*) Satan, oscillate my metallic sonatas. claw@kanga.nu He lived as a devil, eh? http://www.kanga.nu/~claw/ Evil is a name of a foeman, as I live. 
From brad.knowles at skynet.be Thu Oct 30 00:51:29 2003 From: brad.knowles at skynet.be (Brad Knowles) Date: Thu Oct 30 00:53:58 2003 Subject: [Mailman-Developers] Requirements for a new archiver In-Reply-To: <29DE2A2C-0A98-11D8-AE6D-0003934516A8@plaidworks.com> References: <3F9D08F9.6090209@student.umist.ac.uk> <1067285190.16085.79.camel@localhost.localdomain> <87d6cfdhjr.fsf@athene.jamux.com> <7B342705-0A47-11D8-A2E1-0003934516A8@plaidworks.com> <3147.1067460073@kanga.nu> <664305B3-0A54-11D8-BABB-0003934516A8@plaidworks.com> <24751.1067464496@kanga.nu> <1067486474.5295.32.camel@anthem> <949F2B32-0A92-11D8-A49B-0003934516A8@plaidworks.com> <1067489599.5295.71.camel@anthem> <29DE2A2C-0A98-11D8-AE6D-0003934516A8@plaidworks.com> Message-ID: At 9:16 PM -0800 2003/10/29, Chuq Von Rospach wrote: > There is no such thing as a simple MTA. This gets hairy quickly. > Really quickly. Bryan is one of the few people I would expect to be able to do something that could actually handle the easy 80%. Writing the book _sendmail_ (now in its fourth edition) is just one of his many talents. > you are much better off spending money on a good fast disk RAID (since > the chances that you'll win the lottery are on par with the chances > that your bottleneck is NOT disk I/O in mail sending) than on a > programmer to try to build fast MTAs. They were already using pure RAM disks for this application. Disk I/O was not the problem. Bryan and Eric were two major contributors to my invited talks "Sendmail Performance Tuning for Large Systems" (see ) and "Design and Implementation of Highly Scalable E-mail Systems" (see ). These guys are not lightweights in this field. > And boy, does it show. Indeed. -- Brad Knowles, "They that can give up essential liberty to obtain a little temporary safety deserve neither liberty nor safety." -Benjamin Franklin, Historical Review of Pennsylvania. 
GCS/IT d+(-) s:+(++)>: a C++(+++)$ UMBSHI++++$ P+>++ L+ !E-(---) W+++(--) N+ !w--- O- M++ V PS++(+++) PE- Y+(++) PGP>+++ t+(+++) 5++(+++) X++(+++) R+(+++) tv+(+++) b+(++++) DI+(++++) D+(++) G+(++++) e++>++++ h--- r---(+++)* z(+++) From claw at kanga.nu Thu Oct 30 00:56:29 2003 From: claw at kanga.nu (J C Lawrence) Date: Thu Oct 30 00:56:33 2003 Subject: [Mailman-Developers] Requirements for a new archiver In-Reply-To: Message from Brad Knowles of "Thu, 30 Oct 2003 05:51:32 +0100." References: <3F9D08F9.6090209@student.umist.ac.uk> <1067285190.16085.79.camel@localhost.localdomain> <87d6cfdhjr.fsf@athene.jamux.com> <7B342705-0A47-11D8-A2E1-0003934516A8@plaidworks.com> <1067485679.5295.17.camel@anthem> <20031030043514.GP24088@lenin.nu> Message-ID: <21929.1067493389@kanga.nu> On Thu, 30 Oct 2003 05:51:32 +0100 Brad Knowles wrote: > I'm sorry, I don't trust ReiserFS at all. I'd trust XFS if it was on > Irix, or IBM's JFS, but not ReiserFS. Hell, on a Linux system, I'd use > ext2fs before I'd use Reiser. I'll simply note that I've been using ReiserFS on just over a dozen systems ranging from million+ messages a day list servers to build, dev, web, and desktop boxes. I've yet to have problems. > ... do you really want to try to force them all to use ReiserFS on > Linux as their only supported option? Err, want or consider reasonable? > My understanding is that ext3fs is dead. I'd thought that Ted Ts'o took over some of the reins in his move to IBM, but I haven't chatted to him in a long whiles. -- J C Lawrence ---------(*) Satan, oscillate my metallic sonatas. claw@kanga.nu He lived as a devil, eh? http://www.kanga.nu/~claw/ Evil is a name of a foeman, as I live. From claw at kanga.nu Thu Oct 30 01:02:06 2003 From: claw at kanga.nu (J C Lawrence) Date: Thu Oct 30 01:02:11 2003 Subject: [Mailman-Developers] Requirements for a new archiver In-Reply-To: Message from J C Lawrence of "Thu, 30 Oct 2003 00:56:29 EST."
<21929.1067493389@kanga.nu> References: <3F9D08F9.6090209@student.umist.ac.uk> <1067285190.16085.79.camel@localhost.localdomain> <87d6cfdhjr.fsf@athene.jamux.com> <7B342705-0A47-11D8-A2E1-0003934516A8@plaidworks.com> <1067485679.5295.17.camel@anthem> <20031030043514.GP24088@lenin.nu> <21929.1067493389@kanga.nu> Message-ID: <22395.1067493726@kanga.nu> On Thu, 30 Oct 2003 00:56:29 -0500 J C Lawrence wrote: > On Thu, 30 Oct 2003 05:51:32 +0100 Brad Knowles > wrote: >> I'm sorry, I don't trust ReiserFS at all. I'd trust XFS if it was on >> Irix, or IBMs JFS, but not ReiserFS. Hell, on a Linux system, I'd >> use ext2fs before I'd use Reiser. > I'll simply note that I've been using ReiserFS on just over a dozen > systems ranging from million+ messages a day list servers to build, > dev, web, and desktop boxes. I've yet to have problems. Err, add in "just under three years". -- J C Lawrence ---------(*) Satan, oscillate my metallic sonatas. claw@kanga.nu He lived as a devil, eh? http://www.kanga.nu/~claw/ Evil is a name of a foeman, as I live. From brad.knowles at skynet.be Thu Oct 30 00:56:13 2003 From: brad.knowles at skynet.be (Brad Knowles) Date: Thu Oct 30 01:04:44 2003 Subject: [Mailman-Developers] Requirements for a new archiver In-Reply-To: <21418.1067492989@kanga.nu> References: <3F9D08F9.6090209@student.umist.ac.uk> <1067285190.16085.79.camel@localhost.localdomain> <87d6cfdhjr.fsf@athene.jamux.com> <7B342705-0A47-11D8-A2E1-0003934516A8@plaidworks.com> <1067485679.5295.17.camel@anthem> <21418.1067492989@kanga.nu> Message-ID: At 12:49 AM -0500 2003/10/30, J C Lawrence wrote: > ReiserFS also does particularly well here. I haven't yet tested IBM's > JFS. Last time I hit VxFS hard (back in the HP-UX 20.20 days) it really > didn't like huge directories, but that may have changed since then. HP-UX 20.20? I wasn't aware that they had gone much beyond HP-UX 11.x. Did you mean HP-UX 10.20? Now that's a beast I remember, and remember loathing with a passion. 
HP-UX 9 was slow, but rock-solid -- no matter how hard you beat on the damn thing, it just slowed down but never stopped. HP-UX 10.x was a real dog. HP-UX 11.x looked like it was going to shape up better, but then I got out of AOL before we had many of those systems in house. -- Brad Knowles, "They that can give up essential liberty to obtain a little temporary safety deserve neither liberty nor safety." -Benjamin Franklin, Historical Review of Pennsylvania. GCS/IT d+(-) s:+(++)>: a C++(+++)$ UMBSHI++++$ P+>++ L+ !E-(---) W+++(--) N+ !w--- O- M++ V PS++(+++) PE- Y+(++) PGP>+++ t+(+++) 5++(+++) X++(+++) R+(+++) tv+(+++) b+(++++) DI+(++++) D+(++) G+(++++) e++>++++ h--- r---(+++)* z(+++) From brad.knowles at skynet.be Thu Oct 30 01:04:19 2003 From: brad.knowles at skynet.be (Brad Knowles) Date: Thu Oct 30 01:04:47 2003 Subject: [Mailman-Developers] Requirements for a new archiver In-Reply-To: <20770.1067492459@kanga.nu> References: <1067285190.16085.79.camel@localhost.localdomain> <87d6cfdhjr.fsf@athene.jamux.com> <20031029195420.GI24088@lenin.nu> <20031029203707.GK24088@lenin.nu> <20031029215906.GM24088@lenin.nu> <11079.1067465440@kanga.nu> <20031029222858.GN24088@lenin.nu> <2267.1067480552@kanga.nu> <10134.1067484470@kanga.nu> <12670.1067486461@kanga.nu> <16067.1067489035@kanga.nu> <19296.1067491312@kanga.nu> <5D459A0C-0A9A-11D8-AE6D-0003934516A8@plaidworks.com> <20770.1067492459@kanga.nu> Message-ID: At 12:40 AM -0500 2003/10/30, J C Lawrence wrote: > We > need a priori key determination, a collision policy, key handoffs to an > indexer (which could be NULL in Chuq's MySQL case), and an > improved/adapted presentation layer. As far as this goes, I agree. > I've already said my bits there > and proposed what I see as the cheap, easy, incremental improvement > course: Twisted's NNTP supports for storage, Message IDs for keys, a > variant best-effort detection and rewriting policy for collisions, and a > MeoWWW derivative for HTML presentation/posting. 
I don't know anything about Twisted or MeoWWW, so I can't say how they address the subjects above. I can say that I'm not sure about an NNTP-based storage solution, although certain storage techniques we've recently discussed borrow a lot from extant NNTP implementations, and I'm not sure how much sense it would make to rip out just those parts we know we need, or if we could actually reasonably take the whole thing, kit-n-caboodle. I do believe that we need an alternative solution to the message-id header as it was presented to us in the message, as a stable guaranteed unique (well, as good as MD-5 or SHA-1 gets) message identifier that can always be used to refer to the exact same message no matter what. Whether we use this message identifier as a replacement for the message-id header value as it was presented to us -- I think that's a more philosophical discussion, and I think we should address it by allowing both options but deciding which would be a reasonable default to take. Given that the mailman UI is basically completely contained within the CGI, I'm inclined to leave it there and work on improving it internally, allowing us to continue to work with most any webserver the client may have. I don't know how MeoWWW addresses this issue, either by replacing the webserver, or providing additional tools that may make it easier to present a good and consistent UI. -- Brad Knowles, "They that can give up essential liberty to obtain a little temporary safety deserve neither liberty nor safety." -Benjamin Franklin, Historical Review of Pennsylvania. 
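[Editorial sketch, not part of the original message.] The "allow both options" idea for message identifiers could look roughly like the following. This is a minimal sketch under stated assumptions: `stable_message_key` and its `prefer_header` flag are hypothetical names, not Mailman APIs, and SHA-1 is used only because the message names it as one candidate digest.

```python
import hashlib
from email import message_from_bytes


def stable_message_key(raw_message: bytes, prefer_header: bool = False) -> str:
    """Return an identifier for raw_message.

    Hypothetical helper, not part of Mailman. With prefer_header=True the
    Message-ID header is used as presented, if the message has one.
    Otherwise the key is the SHA-1 hex digest of the exact bytes received,
    so the same bytes always map to the same key.
    """
    if prefer_header:
        msg = message_from_bytes(raw_message)
        header_id = msg.get("Message-ID")
        if header_id:
            return header_id.strip()
    # Fallback (and default): content digest, independent of any header.
    return hashlib.sha1(raw_message).hexdigest()
```

Which of the two modes should be the default is exactly the open question in the message; the flag only shows that both can sit behind one call.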
GCS/IT d+(-) s:+(++)>: a C++(+++)$ UMBSHI++++$ P+>++ L+ !E-(---) W+++(--) N+ !w--- O- M++ V PS++(+++) PE- Y+(++) PGP>+++ t+(+++) 5++(+++) X++(+++) R+(+++) tv+(+++) b+(++++) DI+(++++) D+(++) G+(++++) e++>++++ h--- r---(+++)* z(+++) From barry at python.org Thu Oct 30 09:15:35 2003 From: barry at python.org (Barry Warsaw) Date: Thu Oct 30 09:15:44 2003 Subject: [Mailman-Developers] Requirements for a new archiver In-Reply-To: <18148.1067490513@kanga.nu> References: <3F9D08F9.6090209@student.umist.ac.uk> <1067285190.16085.79.camel@localhost.localdomain> <87d6cfdhjr.fsf@athene.jamux.com> <7B342705-0A47-11D8-A2E1-0003934516A8@plaidworks.com> <3147.1067460073@kanga.nu> <664305B3-0A54-11D8-BABB-0003934516A8@plaidworks.com> <24751.1067464496@kanga.nu> <1067486474.5295.32.camel@anthem> <1067489421.5295.67.camel@anthem> <18148.1067490513@kanga.nu> Message-ID: <1067523334.5295.93.camel@anthem> On Thu, 2003-10-30 at 00:08, J C Lawrence wrote: > Hang-on. Apache isn't the target. Mailman's UI is a CGI app. As such > it works with any web server that supports CGI-bin, which pretty much > means any web server with no exceptions. That's a pretty large gain, > especially in the novice admin or simple deployment case territory. Sure, but I suspect that plumbing Mailman out to http will be just a proxy rule away from integrating with an existing web server. That's not without its headaches too, but should be as widely supported. > Doing our own thing for HTTP handling can quickly be another Pandora's > box, security concern, and integration problem for the (majority of) > people who do want to run Apache/Boa/Thttpd/Zeus/etc. We do need to worry about the security of the http framework (e.g. Twisted), but past that, it's still our responsibility. I mostly see this as a thin veneer between the web and the core logic for Mailman. Wanna use CGI? I suspect it's just a little extra glue. Same goes for mod_python or whatever. 
> > An approach like Exim + elspy affords some really cool possibilities. > > Absolutely, but that is outside of Mailman's territory. Definitely for now, that's for sure. I don't want to write it off completely, but we need to be practical too. > More interesting would be things like TMDA integration, or implementing > support for Yakov Shafranovich's extension of my consent token protocol: > > http://www.ietf.org/internet-drafts/draft-irtf-asrg-cri-00.txt > > Getting early buy-in as a sample implementation for an MLM wouldn't be a > Bad Thing. There's a lot of really neat and useful integration and > feature set territory to explore before you start staring down the MTA's > throat. Sure. I just skimmed the CRI draft, but here are some questions (hmm, if you answer this please start a new thread). If you send 10 messages to a list within 10 minutes and I've never heard of you before, should I send you 10 challenges or one? If I send you 10, should I consider a response to any one of them good enough to free all 10 posts? Also, isn't any CRI system going to have to have mail bomb defenses? -Barry From claw at kanga.nu Thu Oct 30 09:38:19 2003 From: claw at kanga.nu (J C Lawrence) Date: Thu Oct 30 09:38:25 2003 Subject: [Mailman-Developers] Requirements for a new archiver In-Reply-To: Message from Barry Warsaw of "Thu, 30 Oct 2003 09:15:35 EST." <1067523334.5295.93.camel@anthem> References: <3F9D08F9.6090209@student.umist.ac.uk> <1067285190.16085.79.camel@localhost.localdomain> <87d6cfdhjr.fsf@athene.jamux.com> <7B342705-0A47-11D8-A2E1-0003934516A8@plaidworks.com> <3147.1067460073@kanga.nu> <664305B3-0A54-11D8-BABB-0003934516A8@plaidworks.com> <24751.1067464496@kanga.nu> <1067486474.5295.32.camel@anthem> <1067489421.5295.67.camel@anthem> <18148.1067490513@kanga.nu> <1067523334.5295.93.camel@anthem> Message-ID: <28587.1067524699@kanga.nu> On Thu, 30 Oct 2003 09:15:35 -0500 Barry Warsaw wrote: > On Thu, 2003-10-30 at 00:08, J C Lawrence wrote: >> Hang-on.
Apache isn't the target. Mailman's UI is a CGI app. As >> such it works with any web server that supports CGI-bin, which pretty >> much means any web server with no exceptions. That's a pretty large >> gain, especially in the novice admin or simple deployment case >> territory. > Sure, but I suspect that plumbing Mailman out to http will be just a > proxy rule away from integrating with an existing web server. That's > not without its headaches too, but should be as widely supported. Considerably more web servers support CGI-bin than support proxy rules. -- J C Lawrence ---------(*) Satan, oscillate my metallic sonatas. claw@kanga.nu He lived as a devil, eh? http://www.kanga.nu/~claw/ Evil is a name of a foeman, as I live. From barry at python.org Thu Oct 30 09:53:19 2003 From: barry at python.org (Barry Warsaw) Date: Thu Oct 30 09:53:26 2003 Subject: Efficient final message disposition (was Re: [Mailman-Developers] Requirements for a new archiver) In-Reply-To: References: <3F9D08F9.6090209@student.umist.ac.uk> <1067285190.16085.79.camel@localhost.localdomain> <87d6cfdhjr.fsf@athene.jamux.com> <7B342705-0A47-11D8-A2E1-0003934516A8@plaidworks.com> <3147.1067460073@kanga.nu> <664305B3-0A54-11D8-BABB-0003934516A8@plaidworks.com> <24751.1067464496@kanga.nu> <1067486474.5295.32.camel@anthem> <949F2B32-0A92-11D8-A49B-0003934516A8@plaidworks.com> <1067489599.5295.71.camel@anthem> Message-ID: <1067525599.5295.126.camel@anthem> Ok, I'm beat up enough, so let me open things up to a hopefully more productive thread. How can Mailman more efficiently hand off messages to a local mail server for final delivery? 
Some problems with the current approach include:

- The desire/requirement that Mailman chunk and sort recipients
- The ability for Mailman to swamp the mail server or cause the mail server to consume all available CPU
- The fact that failures in the upstream mail server are reported to Mailman as bounces instead of as error codes
- Inefficiencies in VERP/personalization/mail-merge because of the lack of cooperation
- The need for Mailman to queue outgoing messages that aren't completely delivered

I'm sure you guys can identify more issues. Look at the complexity in SMTPDirect.py, and even there, we still have problems. So how do we design a system where we can push the complexity and efficiency concerns out past our boundary?

Here's a rough sketch of what I'd like: Mailman has a list of recipients, or at least knows how to calculate that list. It has a message template as encoded 7-bit ascii. It has a dictionary (association table, hash table) of substitution placeholders to values for each recipient, or knows how to calculate that. Mailman wants to simply hand that data off to some agent and forget about it. It wants to know that the agent will make best effort to mail merge and deliver. It wants to be informed of any final delivery failures. And that's it.

Mailman doesn't want to chunkify recipients, and it doesn't want to sort them. It doesn't want to worry about a mail server effectively managing system resources. I'd rather not have to hand it a couple of meg of recipient or substitution data, but there seems to be no other way.

So what can we do here to improve matters?

-Barry

From claw at kanga.nu Thu Oct 30 10:06:04 2003 From: claw at kanga.nu (J C Lawrence) Date: Thu Oct 30 10:12:03 2003 Subject: [Mailman-Developers] Requirements for a new archiver In-Reply-To: Message from Brad Knowles of "Thu, 30 Oct 2003 07:04:19 +0100."
References: <1067285190.16085.79.camel@localhost.localdomain> <87d6cfdhjr.fsf@athene.jamux.com> <20031029195420.GI24088@lenin.nu> <20031029203707.GK24088@lenin.nu> <20031029215906.GM24088@lenin.nu> <11079.1067465440@kanga.nu> <20031029222858.GN24088@lenin.nu> <2267.1067480552@kanga.nu> <10134.1067484470@kanga.nu> <12670.1067486461@kanga.nu> <16067.1067489035@kanga.nu> <19296.1067491312@kanga.nu> <5D459A0C-0A9A-11D8-AE6D-0003934516A8@plaidworks.com> <20770.1067492459@kanga.nu> Message-ID: <30725.1067526364@kanga.nu> On Thu, 30 Oct 2003 07:04:19 +0100 Brad Knowles wrote: > At 12:40 AM -0500 2003/10/30, J C Lawrence wrote: >> I've already said my bits there and proposed what I see as the cheap, >> easy, incremental improvement course: Twisted's NNTP supports for >> storage, Message IDs for keys, a variant best-effort detection and >> rewriting policy for collisions, and a MeoWWW derivative for HTML >> presentation/posting. > I don't know anything about Twisted or MeoWWW, so I can't say how they > address the subjects above. Twisted is a pythonic library that implements most of the basic network protocols. Among other things it has RFC-conformant NNTP server and client implementations. Creating an NNTP server with a backing message store is, literally, three lines in Python. Of course it doesn't support all the nifties that real netnews servers do ala expires, administrative controls, feeds, etc. It's not intended for that market, and Mailman doesn't need those supports. If deployment sites need that, they're going to be using inn2|[BCD]News|Diablo anyway. MeoWWW is a (very inefficient but fixable) pythonic CGI which supports reading and posting to netnews via NNTP. It has various nice UI points, a decent feature set (more than we have now), and does The Right Thing in almost every aspect I've checked except for performance in the spool reads. > I can say that I'm not sure about an NNTP-based storage solution...
We should really start out by splitting that discussion. NNTP is an access protocol. Netnews servers have various storage formats and techniques. Currently NNTP and IMAP are the only standardised wide-deployment protocols for message spool access. I'm not interested in IMAP for the reasons previously discussed. NNTP isn't great, but it is already supported by Mailman for the new gating features and adds a clean abstraction model which allows trivial replacement of Mailman's implementation by inn2|[BCD]news|Diablo|whatever should the deployment site wish. Additionally, again as a standards-etc based protocol, it allows clean abstraction for archive presentation: anything that talks NNTP can now be an effective Mailman archive presenter. Ditto for archive indexing. As a dev I'm interested in arguments about how to handle the store behind the NNTP interface -- I find that stuff fun and intriguing -- but also think they are fairly uninteresting right now for Mailman specifically. The 90% case for Mailman will have less than 200K messages in their site-wide spool, and most of those an order of magnitude less. For me the interesting point is that once we abstract the message storage behind a well-supported standards-based protocol we can incrementally improve our implementation and those really concerned with the larger cases can throw in inn2 or whatever else, like a filter to SQL, instead. ITMT we get the flexibility and time to grow and do it Really Right. Additionally, having adopted such a well defined abstraction model once, moving down the road should something else better appear it should be a comparatively small cost to support that in addition or instead. > ... although certain storage techniques we've recently discussed > borrow a lot from extant NNTP implementations, and I'm not sure how > much sense it would make to rip out just those parts we know we need, > or if we could actually reasonably take the whole thing, > kit-n-caboodle. Which may indeed happen. 
> I do believe that we need an alternative solution to the message-id > header as it was presented to us in the message, as a stable > guaranteed unique (well, as good as MD-5 or SHA-1 gets) message > identifier that can always be used to refer to the exact same message > no matter what. I'm of two minds here. I see the temptation. I like using Message-IDs, and they are a natural fit to the model semantically, but messing with Message-IDs has unpleasant effects for some other systems. > Whether we use this message identifier as a replacement for the > message-id header value as it was presented to us -- I think that's a > more philosophical discussion, and I think we should address it by > allowing both options but deciding which would be a reasonable default > to take. I'm on the side of rewriting Message-IDs if we do generate our own keys. I don't like it, but it seems the cleanest approach. > Given that the mailman UI is basically completely contained within the > CGI, I'm inclined to leave it there and work on improving it > internally, allowing us to continue to work with most any webserver > the client may have. Agreed. > I don't know how MeoWWW addresses this issue, either by replacing the > webserver, or providing additional tools that may make it easier to > present a good and consistent UI. MeoWWW is a CGI as discussed above. Twisted implements both sides of HTTP in addition to the NNTP discussed above, but I haven't looked at the details. -- J C Lawrence ---------(*) Satan, oscillate my metallic sonatas. claw@kanga.nu He lived as a devil, eh? http://www.kanga.nu/~claw/ Evil is a name of a foeman, as I live.
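The "stable identifier, as good as MD-5 or SHA-1 gets" idea from the exchange above can be sketched roughly as follows. This is only an illustration, not anything Mailman does: `archive_key` and `original_message_id` are hypothetical helper names, and hashing the raw bytes exactly as received is an assumption about what "the exact same message" should mean.

```python
import hashlib
from email import message_from_bytes


def archive_key(raw_message: bytes) -> str:
    # Hypothetical helper: SHA-1 digest of the message bytes exactly as received.
    return hashlib.sha1(raw_message).hexdigest()


def original_message_id(raw_message: bytes) -> str:
    # The sender-supplied Message-ID, kept alongside the key rather than replaced.
    return message_from_bytes(raw_message).get("Message-ID", "")


first = b"Message-ID: <dup@example.org>\r\nSubject: one\r\n\r\nfirst body\r\n"
second = b"Message-ID: <dup@example.org>\r\nSubject: two\r\n\r\nsecond body\r\n"

# Two posts that collide on Message-ID still get distinct archive keys.
assert original_message_id(first) == original_message_id(second)
assert archive_key(first) != archive_key(second)
print(archive_key(first))
```

Whether such a key then replaces the Message-ID header or merely sits beside it is the policy question debated above; the sketch takes no side on that.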
From chuqui at plaidworks.com Thu Oct 30 10:25:59 2003 From: chuqui at plaidworks.com (Chuq Von Rospach) Date: Thu Oct 30 10:27:23 2003 Subject: [Mailman-Developers] Requirements for a new archiver In-Reply-To: <28587.1067524699@kanga.nu> References: <3F9D08F9.6090209@student.umist.ac.uk> <1067285190.16085.79.camel@localhost.localdomain> <87d6cfdhjr.fsf@athene.jamux.com> <7B342705-0A47-11D8-A2E1-0003934516A8@plaidworks.com> <3147.1067460073@kanga.nu> <664305B3-0A54-11D8-BABB-0003934516A8@plaidworks.com> <24751.1067464496@kanga.nu> <1067486474.5295.32.camel@anthem> <1067489421.5295.67.camel@anthem> <18148.1067490513@kanga.nu> <1067523334.5295.93.camel@anthem> <28587.1067524699@kanga.nu> Message-ID: <5D118CE7-0AED-11D8-AE6D-0003934516A8@plaidworks.com> On Oct 30, 2003, at 6:38 AM, J C Lawrence wrote: >> Sure, but I suspect that plumbing Mailman out to http will be just a >> proxy rule away from integrating with an existing web server. That's >> not without its headaches too, but should be as widely supported. > > Considerably more web servers support CGI-bin than support proxy rules. > And think about all of the colo environments where it's getting installed. Proxy stuff may not be welcome there. And you make it difficult for someone to integrate Mailman into a larger site environment where they want to use tools (like mod_layout) to skin things. Do we know what a "typical" installation of Mailman is like? Do we know how it's used? Do we know what kind of hardware it's really running on, or what environments? Do we know what the user base is? What their top ten wish list is? Excuse me for sounding like a product manager, but are these features because they're needed, or because we think they'd be fun to implement? and are we building an upgrade the user base can use, or only alpha geek hardware owners? 
(and in reality, I think Barry has a good intuitive sense of these issues, but I wanted to have all of us remember it, and maybe it wouldn't be a bad idea to get some objective data....) From brad.knowles at skynet.be Thu Oct 30 10:48:28 2003 From: brad.knowles at skynet.be (Brad Knowles) Date: Thu Oct 30 10:50:05 2003 Subject: Efficient final message disposition (was Re: [Mailman-Developers] Requirements for a new archiver) In-Reply-To: <1067525599.5295.126.camel@anthem> References: <3F9D08F9.6090209@student.umist.ac.uk> <1067285190.16085.79.camel@localhost.localdomain> <87d6cfdhjr.fsf@athene.jamux.com> <7B342705-0A47-11D8-A2E1-0003934516A8@plaidworks.com> <3147.1067460073@kanga.nu> <664305B3-0A54-11D8-BABB-0003934516A8@plaidworks.com> <24751.1067464496@kanga.nu> <1067486474.5295.32.camel@anthem> <949F2B32-0A92-11D8-A49B-0003934516A8@plaidworks.com> <1067489599.5295.71.camel@anthem> <1067525599.5295.126.camel@anthem> Message-ID: At 9:53 AM -0500 2003/10/30, Barry Warsaw wrote: > I'm sure you guys can identify more issues . Look at the > complexity in SMTPDirect.py, and even there, we still have problems. I'm not a programmer, so I can't really help you there. ;-( > So how do we design a system where we can push the complexity and > efficiency concerns out past our boundary? I can say that I think we need to look at all of the recommendations in the following papers: "Tuning Sendmail for Large Mailing Lists" Rob Kolstad Proceedings of LISA '97 http://tinyurl.com/t09c "Drinking from the Fire(walls) Hose: Another Approach to Very Large Mailing Lists" Strata Rose Chalup, Christine Hogan, Greg Kulosa, Bryan McDonald, and Bryan Stansell Proceedings of LISA '98 http://tinyurl.com/t09k There may be others that we need to look at, but of which I am not (yet) aware. If anyone knows of any, please let me know. We're already doing some of the things recommended in these papers, but not everything.
And I think there may be a couple more things we can do that are not mentioned, but which would be a further help. However, if you want to hand all this work to an external "final mail-merge delivery agent", this is moot. We just need to make sure that the selected FMMDA addresses all these issues. We could use an existing tool (e.g., bulk_mailer from ), or we could create a separate package to address this issue (of course, that brings the ball back into our court). Or, you could just have Chuq solve this problem for you, as he mentioned in . ;-) > So what can we do here to improve matters? Sounds to me like you want to externalize this whole process. Problem is, bulk_mailer is the only tool I know of that currently exists as a partial attempt to address this problem, although perhaps some additional work on it could fill in the rest. Alternatively, you develop, or work with someone else to develop, an alternative to bulk_mailer that does all the things you want and which can be used as an external tool. -- Brad Knowles, "They that can give up essential liberty to obtain a little temporary safety deserve neither liberty nor safety." -Benjamin Franklin, Historical Review of Pennsylvania. 
GCS/IT d+(-) s:+(++)>: a C++(+++)$ UMBSHI++++$ P+>++ L+ !E-(---) W+++(--) N+ !w--- O- M++ V PS++(+++) PE- Y+(++) PGP>+++ t+(+++) 5++(+++) X++(+++) R+(+++) tv+(+++) b+(++++) DI+(++++) D+(++) G+(++++) e++>++++ h--- r---(+++)* z(+++) From chuqui at plaidworks.com Thu Oct 30 11:41:17 2003 From: chuqui at plaidworks.com (Chuq Von Rospach) Date: Thu Oct 30 11:43:22 2003 Subject: Efficient final message disposition (was Re: [Mailman-Developers] Requirements for a new archiver) In-Reply-To: References: <3F9D08F9.6090209@student.umist.ac.uk> <1067285190.16085.79.camel@localhost.localdomain> <87d6cfdhjr.fsf@athene.jamux.com> <7B342705-0A47-11D8-A2E1-0003934516A8@plaidworks.com> <3147.1067460073@kanga.nu> <664305B3-0A54-11D8-BABB-0003934516A8@plaidworks.com> <24751.1067464496@kanga.nu> <1067486474.5295.32.camel@anthem> <949F2B32-0A92-11D8-A49B-0003934516A8@plaidworks.com> <1067489599.5295.71.camel@anthem> <1067525599.5295.126.camel@anthem> Message-ID: On Oct 30, 2003, at 7:48 AM, Brad Knowles wrote: > "Tuning Sendmail for Large Mailing Lists" > http://tinyurl.com/t09c > 400K/day aggregate max > "Drinking from the Fire(walls) Hose: > http://tinyurl.com/t09k 380K/day aggregate max (yawn. My server's bored. snicker) but seriously, both of them are built around pre sendmail 8.12 environments. there's some interesting stuff there, but it's now fairly dated, since sendmail 8.12 really changes the landscape. And all of those other environments.... > Or, you could just have Chuq solve this problem for you, as he > mentioned in > 006820.html>. ;-) gack. > >> So what can we do here to improve matters? > > Sounds to me like you want to externalize this whole process. Problem > is, bulk_mailer is the only tool Because pretty much every MLM has internalized the process. By the end of november, I'll have completely retired any use of bulk_mailer on my systems for other solutions. One big reason: increasing spam blocking (stupid or otherwise) of non-individually addressed email. 
The old list server setup of: to: subscribers of list bcc: bulk_drop@of.subscribes is increasingly risky as far as delivery is concerned. I also don't think it allows for the kind of personalization that's needed for your general audiences (help URLs, unsub URLs, etc). And with sendmail 8.12, queue groups and envelope splitting, frankly, bulk_mailer does more harm to the delivery stream than good. Just stuff it into sendmail, tune sendmail to split intelligently. bulk_mailer is obsolete... and much to my amusement, a few sites block based on its use in headers (idiots), which is why my copy identifies itself as ulkbay_ailermay. From brad.knowles at skynet.be Thu Oct 30 12:20:56 2003 From: brad.knowles at skynet.be (Brad Knowles) Date: Thu Oct 30 12:22:34 2003 Subject: Efficient final message disposition (was Re: [Mailman-Developers] Requirements for a new archiver) In-Reply-To: References: <3F9D08F9.6090209@student.umist.ac.uk> <1067285190.16085.79.camel@localhost.localdomain> <87d6cfdhjr.fsf@athene.jamux.com> <7B342705-0A47-11D8-A2E1-0003934516A8@plaidworks.com> <3147.1067460073@kanga.nu> <664305B3-0A54-11D8-BABB-0003934516A8@plaidworks.com> <24751.1067464496@kanga.nu> <1067486474.5295.32.camel@anthem> <949F2B32-0A92-11D8-A49B-0003934516A8@plaidworks.com> <1067489599.5295.71.camel@anthem> <1067525599.5295.126.camel@anthem> Message-ID: At 8:41 AM -0800 2003/10/30, Chuq Von Rospach wrote: > (yawn. My server's bored. snicker) Understood, but the techniques they recommend are still valid. > but seriously, both of them are built around pre sendmail 8.12 > environments. True. > there's some interesting stuff there, but it's now > fairly dated, since sendmail 8.12 really changes the landscape. > And all of those other environments.... There are still some things that even sendmail 8.12, postfix, etc... do not do.
One of them is recipient sorting by average delivery time over the past week (probably want a decaying geometric mean), which would require tracking log data on a per-recipient basis. Another is two-level message handling, by configuring the MTA for the initial delivery attempt to use very low timeouts, but then to fall back to a secondary MTA (or MTA pool) that uses more standard timeouts for those sites that are slower. I'm sure there are others. > Because pretty much every MLM has internalized the process. Indeed. So, is Barry going the right way by trying to externalize this, or should the internal methods be beefed up so that they more fully address the issues in question? > And with sendmail 8.12, queue groups and envelope splitting, frankly, > bulk_mailer does more harm to the delivery stream than good. Just stuff > it into sendmail, tune sendmail to split intelligently. bulk_mailer is > obsolete... Perhaps in its current form, that is true. However, not all sites are using sendmail 8.12, and of the ones that are, most are probably not using it in a manner that is more suitable for mailing lists. So, this kind of tool does still have its uses at most sites, and it could certainly be extended to address the issues that even the most modern MTAs do not (yet) attempt to handle. However, given the issues you've mentioned, it would probably be a good idea to be able to turn off selected "bulk_mailer" type features, so that you can let the MTA do more of its job better -- if it is configured to do so. -- Brad Knowles, "They that can give up essential liberty to obtain a little temporary safety deserve neither liberty nor safety." -Benjamin Franklin, Historical Review of Pennsylvania.
GCS/IT d+(-) s:+(++)>: a C++(+++)$ UMBSHI++++$ P+>++ L+ !E-(---) W+++(--) N+ !w--- O- M++ V PS++(+++) PE- Y+(++) PGP>+++ t+(+++) 5++(+++) X++(+++) R+(+++) tv+(+++) b+(++++) DI+(++++) D+(++) G+(++++) e++>++++ h--- r---(+++)* z(+++) From jam at jamux.com Thu Oct 30 16:20:10 2003 From: jam at jamux.com (John A. Martin) Date: Thu Oct 30 16:20:53 2003 Subject: [Mailman-Developers] Requirements for a new archiver In-Reply-To: <2267.1067480552@kanga.nu> (J. C. Lawrence's message of "Wed, 29 Oct 2003 21:22:32 -0500") References: <1067285190.16085.79.camel@localhost.localdomain> <87d6cfdhjr.fsf@athene.jamux.com> <20031029195420.GI24088@lenin.nu> <20031029203707.GK24088@lenin.nu> <20031029215906.GM24088@lenin.nu> <11079.1067465440@kanga.nu> <20031029222858.GN24088@lenin.nu> <2267.1067480552@kanga.nu> Message-ID: <87d6cectit.fsf@athene.jamux.com> >>>>> "claw" == J C Lawrence >>>>> "Re: [Mailman-Developers] Requirements for a new archiver " >>>>> Wed, 29 Oct 2003 21:22:32 -0500 claw> I may be unusual in this regard, but I generally consider claw> list archives as one-way systems: messages go in and never claw> come out. Out of idle curiosity, why doesn't 'write once read many' indicate a directory more than a database? jam -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 154 bytes Desc: not available Url : http://mail.python.org/pipermail/mailman-developers/attachments/20031030/4a729e60/attachment.bin From claw at kanga.nu Thu Oct 30 17:07:40 2003 From: claw at kanga.nu (J C Lawrence) Date: Thu Oct 30 17:07:58 2003 Subject: [Mailman-Developers] Requirements for a new archiver In-Reply-To: Message from "John A. Martin" of "Thu, 30 Oct 2003 16:20:10 EST." 
<87d6cectit.fsf@athene.jamux.com> References: <1067285190.16085.79.camel@localhost.localdomain> <87d6cfdhjr.fsf@athene.jamux.com> <20031029195420.GI24088@lenin.nu> <20031029203707.GK24088@lenin.nu> <20031029215906.GM24088@lenin.nu> <11079.1067465440@kanga.nu> <20031029222858.GN24088@lenin.nu> <2267.1067480552@kanga.nu> <87d6cectit.fsf@athene.jamux.com> Message-ID: <31521.1067551660@kanga.nu> On Thu, 30 Oct 2003 16:20:10 -0500 John A Martin wrote: > Out of idle curiosity, why doesn't 'write once read many' indicate a > directory more than a database? 1) The filesystem is a database. 2) Unix filesystems have extremely limited meta-data. 3) A discussed format is putting the messages on the filesystem (as a DB), and the meta data in a different DB (primarily due to open(2)/stat(2) expense). -- J C Lawrence ---------(*) Satan, oscillate my metallic sonatas. claw@kanga.nu He lived as a devil, eh? http://www.kanga.nu/~claw/ Evil is a name of a foeman, as I live. From barry at python.org Thu Oct 30 17:47:18 2003 From: barry at python.org (Barry Warsaw) Date: Thu Oct 30 17:47:26 2003 Subject: Rewriting Message-ID (was Re: [Mailman-Developers] Requirements for a new archiver) In-Reply-To: <4447.1067365823@kanga.nu> References: <3F9D08F9.6090209@student.umist.ac.uk> <20031027230854.A83988@gewis.win.tue.nl> <1067293730.1785.96.camel@anthem> <17914.1067319714@kanga.nu> <1067363390.1235.40.camel@geddy> <4447.1067365823@kanga.nu> Message-ID: <1067554037.5295.191.camel@anthem> On Tue, 2003-10-28 at 13:30, J C Lawrence wrote: > Yup. Of course this heads directly into that beautiful debate of > whether MLMs should rewrite Message IDs. Summarising briefly: > > If we rewrite all IDs we'll piss off the people who use ID to do dupe > detection/deletion for courtesy copies. > > If we don't do some rewriting some messages won't make it through NNTP > and some other people will be pissed off. > > Two contrasting approaches: > > 1) We guarantee uniqueness of all Message IDs.
The only way to do > this is to rewrite all IDs. This will piss off some people. > > 2) We best-effort guarantee uniqueness by only guaranteeing uniqueness > within the last N messages to the list. This could be done by > rewriting all IDs, in which case we might as well guarantee total > uniqueness, or it could be done by keeping a DB of the last N (cf > CDBD) and either discarding or rewriting detected collisions. This of > course means that some messages will be discarded by NNTP and we won't > know about it. Some may be willing to accept those risks. Nice summary, thanks. Here's a strawman: In the spirit of RFC 2369 we define a new header called List-Message-ID, and as in that standard, this field MUST only be generated by a mailing list, not by end users. Nested lists SHOULD remove the parent's List-Message-ID and supply its own. List-Message-ID conforms to the same syntax as for Message-ID in RFC 2822. Of course, for now read the header as if it had an X- prefix. When an MLM receives a message, it generates a List-Message-ID header which is guaranteed to be globally unique. A cooperating archiver should use this header as its primary key, and must provide a mechanism whereby the List-Message-ID can be presented and the archived message can be returned. It may fall back to Message-ID when there is no List-Message-ID header present. Internally, we use List-Message-ID as the primary key into our message store. We further define a header (X-)List-Archived-Message which contains a url pointing directly to this message in a cooperating archive. Now we have some knobs we can tweak. Q. When posting a message to News, when should Mailman copy the List-Message-ID header to Message-ID? A. Never, Only to resolve duplicate rejections, Always Q. When reflecting a posted message back to the list, when should Mailman copy the List-Message-ID header to Message-ID? A.
Never, Always I think it's time we started filling in the missing holes in the RFCs for mailing list functions, such as the interactions we're describing here. I propose to start a section of the wiki (or perhaps www.list.org) to collect these. Eventually we should try to get consensus with other archivers and MLMs, and then push a standard, but that's a long way off. -Barry From barry at python.org Thu Oct 30 17:51:27 2003 From: barry at python.org (Barry Warsaw) Date: Thu Oct 30 17:51:37 2003 Subject: [Mailman-Developers] Requirements for a new archiver In-Reply-To: <13832.1067452249@kanga.nu> References: <3F9D08F9.6090209@student.umist.ac.uk> <20031027230854.A83988@gewis.win.tue.nl> <1067293730.1785.96.camel@anthem> <17914.1067319714@kanga.nu> <25767.1067446139@kanga.nu> <13832.1067452249@kanga.nu> Message-ID: <1067554287.5295.195.camel@anthem> On Wed, 2003-10-29 at 13:30, J C Lawrence wrote: > 2) Message IDs are not guaranteed globally unique, but the collision > rate can be manageable/acceptable in a large number of deployment > cases. Ah, which reminds me, elaborating on my strawman, the answers to "when should Mailman rewrite Message-ID on posts" should be: Never, Only to resolve duplicates, Always. -Barry From claw at kanga.nu Thu Oct 30 17:55:30 2003 From: claw at kanga.nu (J C Lawrence) Date: Thu Oct 30 17:55:38 2003 Subject: Rewriting Message-ID (was Re: [Mailman-Developers] Requirements for a new archiver) In-Reply-To: Message from Barry Warsaw of "Thu, 30 Oct 2003 17:47:18 EST."
<1067554037.5295.191.camel@anthem> References: <3F9D08F9.6090209@student.umist.ac.uk> <20031027230854.A83988@gewis.win.tue.nl> <1067293730.1785.96.camel@anthem> <17914.1067319714@kanga.nu> <1067363390.1235.40.camel@geddy> <4447.1067365823@kanga.nu> <1067554037.5295.191.camel@anthem> Message-ID: <2815.1067554530@kanga.nu> On Thu, 30 Oct 2003 17:47:18 -0500 Barry Warsaw wrote: > In the spirit of RFC 2369 we define a new header called > List-Message-ID, and as in that standard, this field MUST only be > generated by a mailing list, not by end users. Nested lists SHOULD > remove the parent's List-Message-ID and supply its own. > List-Message-ID conforms to the same syntax as for Message-ID in RFC > 2822. Of course, for now read the header as if it had an X- prefix. > When an MLM receives a message, it generates a List-Message-ID header > which is guaranteed to be globally unique. A cooperating archiver > should use this header as its primary key, and must provide a > mechanism whereby the List-Message-ID can be presented and the > archived message can be returned. It may fall back to Message-ID when > there is no List-Message-ID header present. I haven't finished musing on this (busy day, thus slow on other replies as well), but my first thought: What happens when a given message is sent to several lists on the same host? Does each list do its own munge? Do we do USENET-style crossposting? I want to do crossposting. I don't think we can due to per-list customisations. -- J C Lawrence ---------(*) Satan, oscillate my metallic sonatas. claw@kanga.nu He lived as a devil, eh? http://www.kanga.nu/~claw/ Evil is a name of a foeman, as I live. From claw at kanga.nu Thu Oct 30 17:58:19 2003 From: claw at kanga.nu (J C Lawrence) Date: Thu Oct 30 17:58:23 2003 Subject: [Mailman-Developers] Requirements for a new archiver In-Reply-To: Message from Barry Warsaw of "Thu, 30 Oct 2003 17:51:27 EST."
<1067554287.5295.195.camel@anthem> References: <3F9D08F9.6090209@student.umist.ac.uk> <20031027230854.A83988@gewis.win.tue.nl> <1067293730.1785.96.camel@anthem> <17914.1067319714@kanga.nu> <25767.1067446139@kanga.nu> <13832.1067452249@kanga.nu> <1067554287.5295.195.camel@anthem> Message-ID: <3009.1067554699@kanga.nu> On Thu, 30 Oct 2003 17:51:27 -0500 Barry Warsaw wrote: > On Wed, 2003-10-29 at 13:30, J C Lawrence wrote: >> 2) Message IDs are not guaranteed globally unique, but the collision >> rate can be manageable/acceptable in a large number of deployment >> cases. > Ah, which reminds me, elaborating on my strawman, the answers to "when > should Mailman rewrite Message-ID on posts" should be: Never, Only to > resolve duplicates, Always. Does that mean that we keep a database of all Message-IDs that all lists on that host have ever seen? If so, what happens when a single message is CC'ed to multiple lists? NetNews servers require global uniqueness across all newsgroups. I'm rapidly coming to the conclusion that we have to rewrite all Message-IDs whenever the internal archive is enabled. -- J C Lawrence ---------(*) Satan, oscillate my metallic sonatas. claw@kanga.nu He lived as a devil, eh? http://www.kanga.nu/~claw/ Evil is a name of a foeman, as I live. From pioppo at ferrara.linux.it Thu Oct 30 18:45:32 2003 From: pioppo at ferrara.linux.it (Simone Piunno) Date: Thu Oct 30 18:34:15 2003 Subject: [Mailman-Developers] being flexible. In-Reply-To: <13955.1067487452@kanga.nu> References: <3F9D08F9.6090209@student.umist.ac.uk> <1067486474.5295.32.camel@anthem> <13955.1067487452@kanga.nu> Message-ID: <200310310045.32167.pioppo@ferrara.linux.it> On Thursday 30 October 2003 05:17, J C Lawrence wrote: > > ...as well as implement a bulk mailer to eliminate the need for an > > outgoing mail server. > > Eeeek! I trust this would be for immediate handoff to a "real" MTA > versus handling final delivery directly? Quite the Pandora's box if > not. 
I believe the best approach is to cover all options: - for test installations that work out of the box (people who don't care about performance, or don't care if 10% of messages get refused because of broken MTAs out there) - for real production installations (where admins are smart/skilled and they know how to plug in a real MTA, a real web server, and so on.) - for real installations on limited platforms (e.g. on a web server which doesn't support proxy rules but only has CGI) Naturally, the main underlying interface should be the "right" one, i.e. the one which gets the best stability and performance (plugging into a real MTA for sending/receiving email, proxy on a web server, etc.) and then we can just add some scripts to glue in other solutions. Some examples, supposing we have built-in machinery for - direct web serving - smtp sender - pop3 poller A real installation will receive messages via pipe (like we do now) or LMTP or SMTP (from a real MTA fronting us) and will send them via smtpdirect to a real MTA (like we do now). The web interface will be served directly but fronted by a real web server in reverse proxy configuration. A real installation with a limited web server (no proxy rules) will be the same as above but there will be a proxy CGI that when invoked will connect to our internal web server (I know, this is slow, but CGI alone is slow anyway). A real installation heavily skinned will be the same as above but the web GUI will be built by a 3rd party talking with mailman over XMLRPC, just to exchange data and commands. A test installation (or a poor man's installation) will fetch messages from pop3 mailboxes (polling! I can hear you scream while you read this!) and send them directly to the internet (no real MTA involved) and will probably serve web pages directly, controlling port 80 (no real web server involved). Ideally there should be a wizard to choose among the available classes of installation, and then every other knob should be available TTW.
-- Adde parvum parvo magnus acervus erit -- Ovidio From claw at kanga.nu Thu Oct 30 19:51:19 2003 From: claw at kanga.nu (J C Lawrence) Date: Thu Oct 30 19:51:32 2003 Subject: [Mailman-Developers] Re: being flexible. In-Reply-To: Message from Simone Piunno of "Fri, 31 Oct 2003 00:45:32 +0100." <200310310045.32167.pioppo@ferrara.linux.it> References: <3F9D08F9.6090209@student.umist.ac.uk> <1067486474.5295.32.camel@anthem> <13955.1067487452@kanga.nu> <200310310045.32167.pioppo@ferrara.linux.it> Message-ID: <11756.1067561479@kanga.nu> On Fri, 31 Oct 2003 00:45:32 +0100 Simone Piunno wrote: > On Thursday 30 October 2003 05:17, J C Lawrence wrote: >>> ...as well as implement a bulk mailer to eliminate the need for an >>> outgoing mail server. >> Eeeek! I trust this would be for immediate handoff to a "real" MTA >> versus handling final delivery directly? Quite the Pandora's box if >> not. > I believe the best approach is to cover all options: No. This is a one-size-doesn't-fit-anybody argument. > A test installation (or a poor man's installation) will fetch messages > from pop3 mailboxes... Hell no. Mailman is a conformant, well-behaved and very standard mail system, not a hack on top of a kludge that deliberately flouts the standards just because it wants to. > ... (polling! I can hear you scream while you read this!) and send > them directly to the internet (no real MTA involved) and will probably > serve web pages directly, controlling port 80 (no real web server > involved). Why? Even ignoring the abuse possibilities, what possible reason could we have for that time and effort investment when those problems are already far more competently and easily handled than we ever could, and there are so many other, more rewarding and demanding problems and features on the burner? -- J C Lawrence ---------(*) Satan, oscillate my metallic sonatas. claw@kanga.nu He lived as a devil, eh? http://www.kanga.nu/~claw/ Evil is a name of a foeman, as I live.
From claw at kanga.nu Thu Oct 30 20:20:35 2003 From: claw at kanga.nu (J C Lawrence) Date: Thu Oct 30 20:21:44 2003 Subject: Efficient final message disposition (was Re: [Mailman-Developers] Requirements for a new archiver) In-Reply-To: Message from Brad Knowles of "Thu, 30 Oct 2003 18:20:56 +0100." References: <3F9D08F9.6090209@student.umist.ac.uk> <1067285190.16085.79.camel@localhost.localdomain> <87d6cfdhjr.fsf@athene.jamux.com> <7B342705-0A47-11D8-A2E1-0003934516A8@plaidworks.com> <3147.1067460073@kanga.nu> <664305B3-0A54-11D8-BABB-0003934516A8@plaidworks.com> <24751.1067464496@kanga.nu> <1067486474.5295.32.camel@anthem> <949F2B32-0A92-11D8-A49B-0003934516A8@plaidworks.com> <1067489599.5295.71.camel@anthem> <1067525599.5295.126.camel@anthem> Message-ID: <13903.1067563235@kanga.nu> On Thu, 30 Oct 2003 18:20:56 +0100 Brad Knowles wrote: > At 8:41 AM -0800 2003/10/30, Chuq Von Rospach wrote: > One of them is recipient sorting by average delivery time over the > past week (probably want a decaying geometric mean), which would > require tracking log data on a per-recipient basis. While I don't disagree, this is really an MTA's job, not Mailman's. This is why I've been doing log analysis of MXes and routing mail to customised outbound MTAs on the basis of responsiveness, since early 2000. Adaptive MX routing is great stuff. > Another is two-level message handling, by configuring the MTA for the > initial delivery attempt to use very low timeouts, but then to fall > back to a secondary MTA (or MTA pool) that uses more standard timeouts > for those sites that are slower. Yup. I did it at the first level with an initial SMTP proxy which routed based on MX response records pulled from a DB. > Perhaps in its current form, that is true. However, not all sites are > using sendmail 8.12, and of the ones that are, most are probably not > using it in a manner that is more suitable for mailing lists. 
I'm generally of the view that Mailman should do opportunistic domain sorting and per-MTA customised VERP handoffs (because nobody has standardised VERP across MTAs), and beyond that back off. Mailman's job is to get the outbound mail into the MTA's spool as quickly as possible, wrapped in transactions (i.e. RCPT TO bundles) that are friendly to efficient processing, and that's it. We're not in the game of second guessing the MTAs. That way lies wasted time and madness. > However, given the issues you've mentioned, it would probably be a > good idea to be able to turn off selected "bulk_mailer" type features, > so that you can let the MTA do more of its job better -- if it is > configured to do so. There are thresholds for covering up for broken software. There are also thresholds for covering up for SysAdm negligence or oversight. You've got to pick where you stop accepting the problem. Ideally we should be resilient and friendly to both. Realistically we need to do something reasonable and not worry too hard about the rest. Priorities. Mailman's primary performance problems are not at the MTA hand off. MTA configuration and tuning for mailing lists is only a minor art. There is not-inconsiderable documentation and understanding of the field. A US$2K commodity box subjected to moderate tuning efforts using readily available documentation can sustain 2,400 outbound deliveries per minute. You do the arithmetic. In a perfect world that maps out to 3.4 million per day. Cut that under half for queue injection overhead and other crap and you're still talking a million deliveries per day for a US$2K host.[1] A million messages a day already puts us above the 99th percentile for list server audiences. I'm not really concerned about that problem.
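For anyone who wants the arithmetic spelled out, here is a small sketch using only the figures quoted above (2,400 deliveries per minute, a budget of one million per day). The function and variable names are just for illustration.

```python
# Back-of-the-envelope capacity estimate from the figures quoted above.
MINUTES_PER_DAY = 60 * 24


def deliveries_per_day(per_minute: int) -> int:
    """Ideal sustained deliveries per day at a fixed per-minute rate."""
    return per_minute * MINUTES_PER_DAY


ideal = deliveries_per_day(2400)   # 2,400/min * 1,440 min = 3,456,000
budgeted = 1_000_000               # the conservative figure used above
fraction_used = budgeted / ideal   # share of the ideal rate actually assumed

print(ideal, round(fraction_used, 2))   # prints: 3456000 0.29
```

So the "million a day" figure assumes the box reaches only about 29% of its ideal rate, which leaves a lot of room for queue injection overhead and slow MXes.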
Where Mailman's performance hurts is in the handling of the list configs, especially for lists with very large memberships rosters and in queue runner performance and overhead (try watching queue runner's system resource profile in v2.1 for lists with > 50,000 members). For me those are the obvious low hanging fruit, and those are the points that will help not just the performance hounds, but also the lower 80% who are running under-provisioned under-configured under-admined multi-purpose boxes who want Mailman to be a bit more reasonable and forgiving about their not-so-brilliant systems. [1] That's of course assuming reasonable sustained queue size and responsive MXes. However, those are separate problems and ignoring MTA-specific behaviours (like Exim's active hatred of large queues), the methods and systems to segment and tame those problems are fairly well known. -- J C Lawrence ---------(*) Satan, oscillate my metallic sonatas. claw@kanga.nu He lived as a devil, eh? http://www.kanga.nu/~claw/ Evil is a name of a foeman, as I live. From claw at kanga.nu Thu Oct 30 20:26:14 2003 From: claw at kanga.nu (J C Lawrence) Date: Thu Oct 30 20:26:24 2003 Subject: Efficient final message disposition (was Re: [Mailman-Developers] Requirements for a new archiver) In-Reply-To: Message from Chuq Von Rospach of "Thu, 30 Oct 2003 08:41:17 PST." 
References: <3F9D08F9.6090209@student.umist.ac.uk> <1067285190.16085.79.camel@localhost.localdomain> <87d6cfdhjr.fsf@athene.jamux.com> <7B342705-0A47-11D8-A2E1-0003934516A8@plaidworks.com> <3147.1067460073@kanga.nu> <664305B3-0A54-11D8-BABB-0003934516A8@plaidworks.com> <24751.1067464496@kanga.nu> <1067486474.5295.32.camel@anthem> <949F2B32-0A92-11D8-A49B-0003934516A8@plaidworks.com> <1067489599.5295.71.camel@anthem> <1067525599.5295.126.camel@anthem> Message-ID: <14404.1067563574@kanga.nu> On Thu, 30 Oct 2003 08:41:17 -0800 Chuq Von Rospach wrote: > On Oct 30, 2003, at 7:48 AM, Brad Knowles wrote: > One big reason: increasing spam blocking (stupid or otherwise) of > non-individually addressed email. The old list server setup of: > to: subscribers of list > bcc: bulk_drop@of.subscribes > is increasingly risky as far as delivery is concerned. I've seen a couple mail BCPs and internal spam-handling plans at large ISPs and corporates which explicitly include the line item: Discard all mail with more than one address in the envelope. Scary, stupid, true: They want the pain to stop. I find it hard to blame them. > I also don't think it allows for the kind of personalization that's > needed for your general audiences (help URLs, unsub URls, etc). Aye, such VERPish attributes is becoming a necessity. -- J C Lawrence ---------(*) Satan, oscillate my metallic sonatas. claw@kanga.nu He lived as a devil, eh? http://www.kanga.nu/~claw/ Evil is a name of a foeman, as I live. From claw at kanga.nu Thu Oct 30 21:28:29 2003 From: claw at kanga.nu (J C Lawrence) Date: Thu Oct 30 21:28:40 2003 Subject: Efficient final message disposition (was Re: [Mailman-Developers] Requirements for a new archiver) In-Reply-To: Message from Barry Warsaw of "Thu, 30 Oct 2003 09:53:19 EST." 
<1067525599.5295.126.camel@anthem> References: <3F9D08F9.6090209@student.umist.ac.uk> <1067285190.16085.79.camel@localhost.localdomain> <87d6cfdhjr.fsf@athene.jamux.com> <7B342705-0A47-11D8-A2E1-0003934516A8@plaidworks.com> <3147.1067460073@kanga.nu> <664305B3-0A54-11D8-BABB-0003934516A8@plaidworks.com> <24751.1067464496@kanga.nu> <1067486474.5295.32.camel@anthem> <949F2B32-0A92-11D8-A49B-0003934516A8@plaidworks.com> <1067489599.5295.71.camel@anthem> <1067525599.5295.126.camel@anthem> Message-ID: <19058.1067567309@kanga.nu> On Thu, 30 Oct 2003 09:53:19 -0500 Barry Warsaw wrote: > - The desire/requirement that Mailman chunk and sort recipients This shouldn't be any more complex than domain sorting, and need not be perfect. > - The ability for Mailman to swamp the mail server or cause the mail > server to consume all available cpu Rate limiting. > - The fact that failures in upstream mail server are reported to > Mailman as bounces instead of as error codes I don't know that Mailman can do anything about this. We can't reliably distinguish between system errors and delivery failures for MTAs beyond Mailman's borders. There's a protocol hole here I don't know we can or should attempt to fix. > - Inefficiencies in VERP/personalization/mail-merge because of the > lack of cooperation Oh yeah. > - The need for Mailman to queue outgoing messages that aren't > completely delivered Queue runner could do with some more intelligence in that dept. > Mailman wants to simply hand that data off to some agent and forget > about it. It wants to know that the agent will make best effort to > mail merge and deliver. It wants to be informed of any final delivery > failures. And that's it. Mailman doesn't want to chunkify > recipients, and it doesn't want to sort them. It doesn't want to > worry about a mail server effectively managing system resources. I'd > rather not have to hand it a couple of meg of recipient or > substitution data, but there seems to be no other way. 
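The "domain sorting, and need not be perfect" idea in this exchange could look roughly like the sketch below. This is not Mailman's code: `chunk_by_domain` and its `max_rcpts` parameter are hypothetical names, and the grouping is deliberately naive (it keys on the literal domain, not on the MX behind it).

```python
from collections import defaultdict


def chunk_by_domain(recipients, max_rcpts=50):
    """Group recipients by domain, then split each group into bundles.

    Each returned bundle is meant to become one SMTP transaction
    (one MAIL FROM followed by up to max_rcpts RCPT TO commands).
    """
    by_domain = defaultdict(list)
    for addr in recipients:
        # Everything after the last '@', case-folded, is the grouping key.
        domain = addr.rpartition("@")[2].lower()
        by_domain[domain].append(addr)

    chunks = []
    for domain in sorted(by_domain):
        group = by_domain[domain]
        for start in range(0, len(group), max_rcpts):
            chunks.append(group[start:start + max_rcpts])
    return chunks
```

Anything smarter than this (grouping by MX, ordering by responsiveness) is the part argued above to belong in the MTA rather than in the list manager.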
> So what can we do here to improve matters? Start yelling at DJB, Wietse, Philip, and Eric about a standardised SMTP extension for VERP. With a little luck and minor work we can probably get some of the other commercial mail people involved as well. -- J C Lawrence ---------(*) Satan, oscillate my metallic sonatas. claw@kanga.nu He lived as a devil, eh? http://www.kanga.nu/~claw/ Evil is a name of a foeman, as I live. From pioppo at ferrara.linux.it Fri Oct 31 04:21:20 2003 From: pioppo at ferrara.linux.it (Simone Piunno) Date: Fri Oct 31 04:21:39 2003 Subject: [Mailman-Developers] Re: being flexible. In-Reply-To: <11756.1067561479@kanga.nu> References: <3F9D08F9.6090209@student.umist.ac.uk> <200310310045.32167.pioppo@ferrara.linux.it> <11756.1067561479@kanga.nu> Message-ID: <200310311021.20290.pioppo@ferrara.linux.it> At 01:51 on Friday 31 October 2003, J C Lawrence wrote: > > A test installation (or a poor man's installation) will fetch messages > > from pop3 mailboxes... > > Hell no. Mailman is a conformant well behaved and very standard mail > system, not a hack on top of a kludge that deliberately flouts the > standards just because it wants to. Hehe, I knew you'd have screamed :) Try to think of it the other way: a test installation with pop3 is a hack on top of a solid conformant well behaved and very standard mail system. Skilled people can do it right now with Mailman 2.1: just configure fetchmail to act as an MTA and pipe messages to Mailman. The only difference I proposed is that non-skilled people should be able to do this... BTW we already use lynx for html->text conversion, and we already prepare aliases and virtual for postfix, so we could also make Mailman able to generate a fetchmail.conf file. I'm not asking to make really heavy hacks like using a single email address to carry all the incoming traffic (-bounce, -join, -leave and so on).
Nowadays many people have mailboxes gathering all the traffic for many addresses, and it's very common for people to have web access to the configuration of all the mailboxes for a domain they own, while the domain itself is hosted on a third-party mail server where they can't install Mailman. > > serve web pages directly, controlling port 80 (no real web server > > involved). > > Why? Even ignoring the abuse possibilities, what possible reason could > we have for that time and effort investment when those problems are > already far more competently and easily handled than we ever could, and > there are so many other, more rewarding and demanding problems and > features on the burner? We all know CGI is sub-optimal. We're also planning to stop vending archives directly from disk, increasing the CGI load. mod_python or a web runner would perform better. I feel the reverse-proxy configuration (similar to Zope) would be the better choice. And yes, I know that there are web servers unable to proxy requests to a backend server, but this is easily solved by a small CGI just proxying requests (we already do something very similar for incoming email, even if we do it for different reasons - e.g. to use SGID for security). So *if* we accept this solution, we already have an HTTP interface and we get direct HTTP (for tests) for free. Whenever you have a problem in the proxied configuration, you absolutely want to make direct requests to the backend, at least to determine where the problem is (in the web server or in Mailman itself?).
-- This signature intentionally left blank From brad.knowles at skynet.be Fri Oct 31 10:04:43 2003 From: brad.knowles at skynet.be (Brad Knowles) Date: Fri Oct 31 10:06:43 2003 Subject: Efficient final message disposition (was Re: [Mailman-Developers] Requirements for a new archiver) In-Reply-To: <13903.1067563235@kanga.nu> References: <3F9D08F9.6090209@student.umist.ac.uk> <1067285190.16085.79.camel@localhost.localdomain> <87d6cfdhjr.fsf@athene.jamux.com> <7B342705-0A47-11D8-A2E1-0003934516A8@plaidworks.com> <3147.1067460073@kanga.nu> <664305B3-0A54-11D8-BABB-0003934516A8@plaidworks.com> <24751.1067464496@kanga.nu> <1067486474.5295.32.camel@anthem> <949F2B32-0A92-11D8-A49B-0003934516A8@plaidworks.com> <1067489599.5295.71.camel@anthem> <1067525599.5295.126.camel@anthem> <13903.1067563235@kanga.nu> Message-ID: At 8:20 PM -0500 2003/10/30, J C Lawrence wrote: > While I don't disagree, this is really an MTA's job, not Mailman's. > This is why I've been doing log analysis of MXes and routing mail to > customised outbound MTAs on the basis of responsiveness, since early > 2000. Adaptive MX routing is great stuff. There is a need for this function, and no MTA available today does it. MLMs throughout the history of the Internet have incorporated a variety of features for SMTP performance enhancement that are unique to mailing lists or are usually found primarily in mailing lists, and this is no different. If you want to externalize all these functions outside of mailman, that's fine. But then someone has to pick up the ball and start hacking on bulk_mailer or some other program to provide these features. > Yup. I did it at the first level with an initial SMTP proxy which > routed based on MX response records pulled from a DB. Again, this is a feature which is not found on any MTA available today, and which is known to have a huge impact on mailing list performance. This feature needs to be provided somewhere, by someone. 
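As a rough illustration of the responsiveness-based routing under discussion (a decaying geometric mean of per-destination delivery time, used to choose between a short-timeout first-pass MTA and a slower fallback pool), something like the following would do. Everything here is hypothetical: the class name, the `alpha` and `slow_after` parameters, and the "fast"/"slow" pool labels come from neither Mailman nor any MTA.

```python
import math


class ResponsivenessTracker:
    """Track a decaying geometric mean of delivery time per domain."""

    def __init__(self, alpha=0.2, slow_after=30.0):
        self.alpha = alpha            # weight given to the newest sample
        self.slow_after = slow_after  # seconds; slower domains go to the fallback pool
        self._log_mean = {}           # domain -> decayed mean of log(seconds)

    def record(self, domain, seconds):
        """Fold one observed delivery time into the domain's running mean."""
        sample = math.log(max(seconds, 1e-3))  # clamp to avoid log(0)
        prev = self._log_mean.get(domain)
        if prev is None:
            self._log_mean[domain] = sample
        else:
            self._log_mean[domain] = self.alpha * sample + (1 - self.alpha) * prev

    def mean_seconds(self, domain):
        """Decayed geometric mean in seconds, or None if never observed."""
        log_mean = self._log_mean.get(domain)
        return None if log_mean is None else math.exp(log_mean)

    def pool_for(self, domain):
        """Unknown or responsive domains get the short-timeout first pass."""
        mean = self.mean_seconds(domain)
        if mean is None or mean <= self.slow_after:
            return "fast"
        return "slow"
```

Averaging the logarithms is what makes the mean geometric, so a single multi-hour outlier does not swamp a domain that is usually quick. Whether this bookkeeping lives in the list manager, a helper like bulk_mailer, or the MTA is exactly the open question in this thread.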
> I'm generally of the view that Mailman should do opportunistic domain > sorting and per-MTA customised VERP handoffs (because nobody has > standardised VERP across MTAs), and beyond that to back off. Mailman's > job is to get the outbound mail into the MTA's spool as quickly as > possible, wrapped in transactions (ie RCPT TO bundles) that are friendly > to efficient processing, and that's it. If you go back to Barry's message, he was talking about getting even further involved, by doing a mail-merge process. Since there is no MMTP (something that Bryan Costales, Eric Allman, and I had worked on for a while, before we realized that it would just make the spam problem worse and then dropped all further efforts), there is a need for an intermediate program that is called by mailman and then hands the messages off to the MTA. Either that intermediate program can be provided by mailman itself, or it can come from a third party. But it needs to come from somewhere. > We're not in the game of second guessing the MTAs. That way lies wasted > time and madness. If there were MLTAs which were optimized for this function, I would agree with you. Since we're trying to take standard MTAs which may have only some optimizations that might be generally applicable to most situations (including mailing lists), I must disagree. For the mailing list specific optimizations that we know are not provided by many common MTAs or MTA versions, we need to perform those optimizations before the message gets to the MTA. We also need to be able to selectively turn them off, in the case that there are MTAs that can do that specific job themselves and don't need our interference. > Where Mailman's performance hurts is in the handling of the list > configs, especially for lists with very large memberships rosters and in > queue runner performance and overhead (try watching queue runner's > system resource profile in v2.1 for lists with > 50,000 members). 
For > me those are the obvious low hanging fruit, You should definitely go after the low-hanging fruit when you can. However, you also have to consider how much work would go into fixing those problems. A high priority item that would require re-engineering the entire system is something that should be planned for the long term, perhaps in conjunction with other things that would likewise require significant re-engineering efforts. Meanwhile, if there are other performance issues that can be addressed which do not require such significant re-engineering, those should be given serious consideration in the shorter term. -- Brad Knowles, "They that can give up essential liberty to obtain a little temporary safety deserve neither liberty nor safety." -Benjamin Franklin, Historical Review of Pennsylvania. GCS/IT d+(-) s:+(++)>: a C++(+++)$ UMBSHI++++$ P+>++ L+ !E-(---) W+++(--) N+ !w--- O- M++ V PS++(+++) PE- Y+(++) PGP>+++ t+(+++) 5++(+++) X++(+++) R+(+++) tv+(+++) b+(++++) DI+(++++) D+(++) G+(++++) e++>++++ h--- r---(+++)* z(+++) From claw at kanga.nu Fri Oct 31 14:17:07 2003 From: claw at kanga.nu (J C Lawrence) Date: Fri Oct 31 14:17:15 2003 Subject: [Mailman-Developers] Re: being flexible. In-Reply-To: Message from Simone Piunno of "Fri, 31 Oct 2003 10:21:20 +0100." <200310311021.20290.pioppo@ferrara.linux.it> References: <3F9D08F9.6090209@student.umist.ac.uk> <200310310045.32167.pioppo@ferrara.linux.it> <11756.1067561479@kanga.nu> <200310311021.20290.pioppo@ferrara.linux.it> Message-ID: <27645.1067627827@kanga.nu> On Fri, 31 Oct 2003 10:21:20 +0100 Simone Piunno wrote: > At 01:51 on Friday 31 October 2003, J C Lawrence wrote: >>> A test installation (or a poor man's installation) will fetch >>> messages from pop3 mailboxes... >> Hell no. Mailman is a conformant well behaved and very standard mail >> system, not a hack on top of a kludge that deliberately flouts the >> standards just because it wants to.
> Hehe, I knew you'd have screamed :) Aye, it's a lot of fragile and unnecessary work that encourages a use of mailman which I'd rather never occurred. > Try to think of it the other way: a test installation with pop3 is a > hack on top of a solid conformant well behaved and very standard mail > system. Right, but the information lost across the non-SMTP translation is irretrievable and necessary. > Skilled people can do it right now with Mailman 2.1: just configure > fetchmail to act as an MTA and pipe messages to Mailman. The only > difference I proposed is that non-skilled people should be able to do > this... I account this one of those areas where the people who know how to do it, also have some hope of knowing better than to. >> Why? Even ignoring the abuse possibilities, what possible reason >> could we have for that time and effort investment when those problems >> are already far more competently and easily handled than we ever >> could, and there are so many other, more rewarding and demanding >> problems and features on the burner? > We all know CGI is sub-optimal. We're also planning to stop vending > archives directly from disk, increasing the CGI load. True, however the processing delta can be kept quite small. > mod_python or a web runner would perform better. The problem there is that they are insufficiently portable. > I feel the reverse-proxy configuration (similar to zope) would be the > better choice. Why would it be better? What are the disadvantages of those benefits? What specifically useful or necessary function would it add which is not being served today? What interesting and valuable deployment cases would it prevent or make more difficult? Which would it open to the product that are not currently addressed? Let's get a cost and value statement going. -- J C Lawrence ---------(*) Satan, oscillate my metallic sonatas. claw@kanga.nu He lived as a devil, eh? http://www.kanga.nu/~claw/ Evil is a name of a foeman, as I live.
From claw at kanga.nu Fri Oct 31 14:27:56 2003 From: claw at kanga.nu (J C Lawrence) Date: Fri Oct 31 14:28:03 2003 Subject: Efficient final message disposition (was Re: [Mailman-Developers] Requirements for a new archiver) In-Reply-To: Message from Brad Knowles of "Fri, 31 Oct 2003 16:04:43 +0100." References: <3F9D08F9.6090209@student.umist.ac.uk> <1067285190.16085.79.camel@localhost.localdomain> <87d6cfdhjr.fsf@athene.jamux.com> <7B342705-0A47-11D8-A2E1-0003934516A8@plaidworks.com> <3147.1067460073@kanga.nu> <664305B3-0A54-11D8-BABB-0003934516A8@plaidworks.com> <24751.1067464496@kanga.nu> <1067486474.5295.32.camel@anthem> <949F2B32-0A92-11D8-A49B-0003934516A8@plaidworks.com> <1067489599.5295.71.camel@anthem> <1067525599.5295.126.camel@anthem> <13903.1067563235@kanga.nu> Message-ID: <28451.1067628476@kanga.nu> On Fri, 31 Oct 2003 16:04:43 +0100 Brad Knowles wrote: > At 8:20 PM -0500 2003/10/30, J C Lawrence wrote: >> While I don't disagree, this is really an MTA's job, not Mailman's. >> This is why I've been doing log analysis of MXes and routing mail to >> customised outbound MTAs on the basis of responsiveness, since early >> 2000. Adaptive MX routing is great stuff. > There is a need for this function, and no MTA available today does it. > MLMs throughout the history of the Internet have incorporated a > variety of features for SMTP performance enhancement that are unique > to mailing lists or are usually found primarily in mailing lists, and > this is no different. True. It's not a very difficult process, and is absurdly expensive the way I handle it. At some point in my copious spare time I should whack another couple config tokens into Exim, just to up the ante. > If you want to externalize all these functions outside of mailman, > that's fine. But then someone has to pick up the ball and start > hacking on bulk_mailer or some other program to provide these > features.
Aye, but some care should be taken here defining who the people are, between the Good-For-Mailman, and Good-For-Large-Mail-Systems camps. They're related, but not synonymous. >> Yup. I did it at the first level with an initial SMTP proxy which >> routed based on MX response records pulled from a DB. > Again, this is a feature which is not found on any MTA available > today, and which is known to have a huge impact on mailing list > performance. This feature needs to be provided somewhere, by someone. True. > If you go back to Barry's message, he was talking about getting > even further involved, by doing a mail-merge process. Since there is > no MMTP (something that Bryan Costales, Eric Allman, and I had worked > on for a while, before we realized that it would just make the spam > problem worse and then dropped all further efforts), there is a need > for an intermediate program that is called by mailman and then hands > the messages off to the MTA. Mailmerge and VERP customisation, and the standards for the communication of those things to the MTA are areas that need attention, both for Mailman and the rest of the market (tho the IronPort and related guys might argue). This would be a good point to get some cross-MTA discussion going on. >> We're not in the game of second guessing the MTAs. That way lies >> wasted time and madness. > If there were MLTAs which were optimized for this function, IIRC QMail has a (typically DJB) VERP/rewrite handoff method. I also recall that it is very bound into QMail's process and IO model, but perhaps this should be examined? > I would agree with you. Since we're trying to take standard MTAs > which may have only some optimizations that might be generally > applicable to most situations (including mailing lists), I must > disagree. There's that audience problem again. I actually agree with you in the general case, and am willing to spend time and effort in that direction. 
However I see this as somewhat disjoint from Mailman specifically. -- J C Lawrence ---------(*) Satan, oscillate my metallic sonatas. claw@kanga.nu He lived as a devil, eh? http://www.kanga.nu/~claw/ Evil is a name of a foeman, as I live. From pioppo at ferrara.linux.it Fri Oct 31 16:04:50 2003 From: pioppo at ferrara.linux.it (Simone Piunno) Date: Fri Oct 31 15:52:01 2003 Subject: [Mailman-Developers] Re: being flexible. In-Reply-To: <27645.1067627827@kanga.nu> References: <3F9D08F9.6090209@student.umist.ac.uk> <200310311021.20290.pioppo@ferrara.linux.it> <27645.1067627827@kanga.nu> Message-ID: <200310312204.50548.pioppo@ferrara.linux.it> On Friday 31 October 2003 20:17, J C Lawrence wrote: > Right, but the information lost across the non-SMTP translation is > irretrievable and necessary. Sorry, I don't get it. Right now we're receiving messages over a pipe and as far as I see we're not using any environment or command line parameter (beside the script name) so where's this SMTP magic? Fetchmail would pass exactly the same bytestream over the same pipe, or do I miss something? > > We all know CGI is sub-optimal. We're also planning to stop vending > > archives directly from disk, increasing the CGI load. > > True, however the processing delta can be kept quite small. This is not only a CPU load problem... this is also a lock contention (over the list configuration) problem. > > mod_python or a web runner would perform better. > > The problem there is that they are insufficiently portable. I already explained how this can be solved for a web runner, please comment on that proposal. > > I feel the reverse-proxy configuration (similar to zope) would be the > > better choice. > > Why would it be better? First of all, we absolutely need to avoid having the CGI load, modify and save the list config. There's a big lock contention and it's a PITA to code. This would be true even for a transactional Berkeley DB or similar setup.
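To make the load/modify/save cycle concrete, here is a toy sketch of what a stateless CGI request has to do to flip one setting. It does not use Mailman's real config.pck layout: `make_config`, `cgi_style_update` and the dictionary shape are invented for illustration, and locking is left out entirely.

```python
import os
import pickle
import tempfile
import time


def make_config(members):
    """Build a stand-in list config with one small record per member."""
    return {"members": {"user%d@example.org" % i: {"digest": False}
                        for i in range(members)}}


def cgi_style_update(path, address):
    """Load the whole pickle, change one field, write the whole pickle back.

    A real CGI would also have to hold the list lock for this entire
    cycle, which is where the contention comes from.
    """
    with open(path, "rb") as fp:
        config = pickle.load(fp)
    config["members"][address]["digest"] = True
    with open(path, "wb") as fp:
        pickle.dump(config, fp, pickle.HIGHEST_PROTOCOL)


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "config.pck")
    with open(path, "wb") as fp:
        pickle.dump(make_config(50_000), fp, pickle.HIGHEST_PROTOCOL)

    start = time.perf_counter()
    cgi_style_update(path, "user42@example.org")
    elapsed = time.perf_counter() - start
    print("%d bytes rewritten in %.3f s for a one-field change"
          % (os.path.getsize(path), elapsed))
```

The cost of each request grows with the size of the list rather than the size of the change, which is the argument for a long-running process that keeps the configuration in memory.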
Remember that we also want to configure and administer the spam filter TTW. Right now in a 200 subscribers list I have: [pioppo@liston flug]$ ll *.pck -rw-rw---- 1 mailman mailman 56639 Oct 31 19:59 config.pck -rw-r--r-- 1 mailman mailman 378634 Oct 31 19:59 spam.pck Can we afford to re-load such a beast for each HTTP request on a list with one million users and/or some megabytes of spam data? Then we get cleaner code (we won't have m.Load() and m.Save() scattered all over the map). Finally, we completely avoid the SGID driver. > What are the disadvantages of those benefits? I see none. > What specifically useful or necessary function would it add which is not > being served today? We'll be able to centralize the configuration data all in one place (instead of one .pck per list, which we're pretty much forced into now because the CGI reloads the .pck every time). This is good: - to move toward a user-centric Mailman - to ease system administration (backups, etc.) - to build a site-admin web panel Furthermore, Mailman could run on a different machine, not the one running the web server. > What interesting and valuable deployment cases > would it prevent or make more difficult? On a web server without support for reverse proxy rules, one will need to configure and activate the glue CGI. It shouldn't be more difficult than it is now (we already have a glue CGI). I know, this CGI could talk with the running Mailman using something different from HTTP, so we could do everything I described without a built-in HTTP server. We could use a some-different-protocol server, but do you know a better protocol? > Which would it open to the product that are not currently addressed? Sorry? -- Adde parvum parvo magnus acervus erit -- Ovidio From brad.knowles at skynet.be Fri Oct 31 16:30:16 2003 From: brad.knowles at skynet.be (Brad Knowles) Date: Fri Oct 31 16:33:18 2003 Subject: [Mailman-Developers] Re: being flexible.
In-Reply-To: <200310312204.50548.pioppo@ferrara.linux.it> References: <3F9D08F9.6090209@student.umist.ac.uk> <200310311021.20290.pioppo@ferrara.linux.it> <27645.1067627827@kanga.nu> <200310312204.50548.pioppo@ferrara.linux.it> Message-ID: At 10:04 PM +0100 2003/10/31, Simone Piunno wrote: > Right now we're receiving messages over a pipe and as far as I see we're not > using any environment or command line parameter (beside the script name) so > where's this SMTP magic? You mean the envelope sender address? The IP address of the envelope sender? > Fetchmail would pass exactly the same bytestream over the same pipe, or do I > miss something? Yup. Fetchmail loses exactly the same information, in exactly the same way. As soon as the message is written to a mailbox somewhere, you lose the envelope information. So, retrieving the mailbox via POP3 doesn't work. -- Brad Knowles, "They that can give up essential liberty to obtain a little temporary safety deserve neither liberty nor safety." -Benjamin Franklin, Historical Review of Pennsylvania. GCS/IT d+(-) s:+(++)>: a C++(+++)$ UMBSHI++++$ P+>++ L+ !E-(---) W+++(--) N+ !w--- O- M++ V PS++(+++) PE- Y+(++) PGP>+++ t+(+++) 5++(+++) X++(+++) R+(+++) tv+(+++) b+(++++) DI+(++++) D+(++) G+(++++) e++>++++ h--- r---(+++)* z(+++) From pioppo at ferrara.linux.it Fri Oct 31 17:26:08 2003 From: pioppo at ferrara.linux.it (Simone Piunno) Date: Fri Oct 31 17:12:47 2003 Subject: [Mailman-Developers] Re: being flexible. In-Reply-To: References: <3F9D08F9.6090209@student.umist.ac.uk> <200310312204.50548.pioppo@ferrara.linux.it> Message-ID: <200310312326.08681.pioppo@ferrara.linux.it> On Friday 31 October 2003 22:30, Brad Knowles wrote: > > Right now we're receiving messages over a pipe and as far as I see we're > > not using any environment or command line parameter (beside the script > > name) so where's this SMTP magic? > > You mean the envelope sender address? The IP address of the > envelope sender? 
yes, I mean exactly this, we're not retrieving them, neither by command line options nor by environment variables. Doing it would probably be MTA-specific (i.e. less portable). I want also to point you to the USE_ENVELOPE_SENDER parameter in Defaults.py and to the get_sender method in Mailman/Message.py, where this parameter is only used to decide in which priority order we read "From:" and "Sender:" from the message header. > Yup. Fetchmail loses exactly the same information, in exactly > the same way. As soon as the message is written to a mailbox > somewhere, you lose the envelope information. So, retrieving the > mailbox via POP3 doesn't work. I perfectly know what you mean, the fact is that over the pipe we get exactly a mailbox format (but without the leading "From " line) so according to you Mailman 2.1 doesn't work either! What are you trying to demonstrate? -- Adde parvum parvo magnus acervus erit -- Ovidio From claw at kanga.nu Fri Oct 31 18:08:24 2003 From: claw at kanga.nu (J C Lawrence) Date: Fri Oct 31 18:08:34 2003 Subject: [Mailman-Developers] Re: being flexible. In-Reply-To: Message from Simone Piunno of "Fri, 31 Oct 2003 23:26:08 +0100." <200310312326.08681.pioppo@ferrara.linux.it> References: <3F9D08F9.6090209@student.umist.ac.uk> <200310312204.50548.pioppo@ferrara.linux.it> <200310312326.08681.pioppo@ferrara.linux.it> Message-ID: <1736.1067641704@kanga.nu> On Fri, 31 Oct 2003 23:26:08 +0100 Simone Piunno wrote: > On Friday 31 October 2003 22:30, Brad Knowles wrote: >>> Right now we're receiving messages over a pipe and as far as I see >>> we're not using any environment or command line parameter (beside >>> the script name) so where's this SMTP magic? >> You mean the envelope sender address? The IP address of the envelope >> sender? > yes, I mean exactly this, we're not retrieving them, neither by > command line options nor by environment variables. Ahh, yes. 
I'd forgotten that as I wrap Mailman's wrapper in a larger filter which does take and processes the envelope information. However, there remains a distinct difference if several addresses are conflated to the same POP/IMAP box. In that case you lose the target side of the envelope, and thus can't determine reliably which Mailman-specific address/wrapper/alias to deliver the message to. > Doing it would probably be MTA-specific (i.e. less portable). As of last I checked the Debian Exim packages ship with the no-alias-file-required Mailman configs by default. -- J C Lawrence ---------(*) Satan, oscillate my metallic sonatas. claw@kanga.nu He lived as a devil, eh? http://www.kanga.nu/~claw/ Evil is a name of a foeman, as I live. From Martin at Cleaver.org Tue Oct 21 12:05:29 2003 From: Martin at Cleaver.org (Martin@Cleaver.org) Date: Sun Nov 2 11:43:22 2003 Subject: [Mailman-Developers] RE: [Mailman-Users] Mailman & CPanel? In-Reply-To: <1064277249.5049.34.camel@Anncons4> Message-ID: Can we get a mailing list set up to address and escalate mailman <-> cpanel issues? I am particularly concerned about http://forums.cpanel.net/showthread.php?s=&threadid=15928 but from the mailman-users@python.org list there are definitely a whole host of issues that the mailman-developers would address if we had the correct communication in place. Thanks, Martin -- Martin@Cleaver.org - +1 416 832 7759 Melbourne Business School FT 2004 MBA Exchange Participant to Rotman =-----Original Message----- =From: =mailman-users-bounces+martin.cleaver=bcs.org.uk@python.org =[mailto:mailman-users-bounces+martin.cleaver=bcs.org.uk@python. =org] On Behalf Of Jon Carnes =Sent: Monday, 22 September 2003 8:34 PM =To: Barry Warsaw =Cc: mailman-developers@python.org; mailman-users@python.org; =Chuq Von Rospach =Subject: Re: [Mailman-Users] Mailman & CPanel? 
= = =On Mon, 2003-09-22 at 19:09, Barry Warsaw wrote: => On Mon, 2003-09-22 at 18:49, Chuq Von Rospach wrote: => => > > Does anybody know anything about this? => > => > I've seen people asking about it on ListOwners. I've kept my head => > down => > because I didn't know what you'd said/done, but it sort of =sounds like => > it's either the cPanel folks trying to blame Mailman for its own => > problems, or an ISP trying to blame cPanel and/or mailman. =I know we've => > never explicitly supported cPanel in the mailman projects. =it seems => > like cPanel is the crew to bring this up with, not us. => => Ah, thanks for the info Chuq. I think you've got the right take on => things, except that I'd add this: if the cPanel developers have => specific issues with Mailman that are causing them problems, I'd be => happy to hear about them, and if it makes sense, to address them. I => also wouldn't turn down their money . => => Other than that, you're right, we really don't have any relationship => with them, which is different than saying we're actively not => supporting it anymore (not that we did in the first place :). => => -Barry => = =We've seen a lot of CPanel folks coming here for help. I =basically delete those, since we can never help them. But a =few times folks on the list have expressed that we can't help =them; that could easily be interpreted as "Mailman won't =support CPanel installs". = =One frustrating item that is expressed periodically is that =CPanel won't open their source up. It would be nice if they =would provide some sort of API or if they would release the =source of their "Mailman widget" that controls Mailman - then =we *could* help them. = =I'm very surprised that no one from CPanel has approached =Barry or the developers-list yet. There is a lot broken with =CPanel that could be made much better with some shared ideas. 
=
=Jon Carnes
=
=
=------------------------------------------------------
=Mailman-Users mailing list
=Mailman-Users@python.org
=http://mail.python.org/mailman/listinfo/mailman-users
=Mailman FAQ: http://www.python.org/cgi-bin/faqw-mm.py
=Searchable Archives:
=http://www.mail-archive.com/mailman-users%40python.org/
=
=This message was sent to: Martin.Cleaver@bcs.org.uk
=Unsubscribe or change your options at
=http://mail.python.org/mailman/options/mailman-users/martin.cleaver%40bcs.org.uk
=

From kv at ltk.net.ua Wed Oct 22 09:55:29 2003
From: kv at ltk.net.ua (Vitaliy Karlov)
Date: Sun Nov 2 11:43:32 2003
Subject: [Mailman-Developers] Help with bugs
Message-ID: <20031022135529.GA97191@pepper.ltk.net.ua>

Good day all.

Some days ago I noticed that some posts to the list were not delivered
to the list, but were shunted instead. In logs/error I see this:

===
Oct 22 16:44:21 2003 (38679) SHUNTING: 1066822120.244683+5ac80f2a2c5793309cfbcbed703615ccf31b365b
Oct 22 16:44:21 2003 (38679) Uncaught runner exception: 'ascii' codec can't decode byte 0xce in position 1: ordinal not in range(128)
Oct 22 16:44:21 2003 (38679) Traceback (most recent call last):
  File "/usr/local/mailman/Mailman/Queue/Runner.py", line 110, in _oneloop
    self._onefile(msg, msgdata)
  File "/usr/local/mailman/Mailman/Queue/Runner.py", line 160, in _onefile
    keepqueued = self._dispose(mlist, msg, msgdata)
  File "/usr/local/mailman/Mailman/Queue/ArchRunner.py", line 73, in _dispose
    mlist.ArchiveMail(msg)
  File "/usr/local/mailman/Mailman/Archiver/Archiver.py", line 208, in ArchiveMail
    h.processUnixMailbox(f)
  File "/usr/local/mailman/Mailman/Archiver/pipermail.py", line 560, in processUnixMailbox
    self.add_article(a)
  File "/usr/local/mailman/Mailman/Archiver/pipermail.py", line 592, in add_article
    temp = self.format_article(article)
  File "/usr/local/mailman/Mailman/Archiver/HyperArch.py", line 1225, in format_article
    self.__processbody_URLquote(lines)
  File "/usr/local/mailman/Mailman/Archiver/HyperArch.py", line 1164, in __processbody_URLquote
    text = re.sub('@', _(' at '), text)
  File "/usr/local/lib/python2.3/sre.py", line 143, in sub
    return _compile(pattern, 0).sub(repl, string, count)
UnicodeDecodeError: 'ascii' codec can't decode byte 0xce in position 1: ordinal not in range(128)
===

or this:

===
Oct 22 16:44:20 2003 (38682) SHUNTING: 1066819196.75235+30e4de05fe9d8ee5948c26cd52616945c651f4cb
Oct 22 16:44:20 2003 (38682) Uncaught runner exception:
Oct 22 16:44:20 2003 (38682) Traceback (most recent call last):
  File "/usr/local/mailman/Mailman/Queue/Runner.py", line 110, in _oneloop
    self._onefile(msg, msgdata)
  File "/usr/local/mailman/Mailman/Queue/Runner.py", line 160, in _onefile
    keepqueued = self._dispose(mlist, msg, msgdata)
  File "/usr/local/mailman/Mailman/Queue/IncomingRunner.py", line 130, in _dispose
    more = self._dopipeline(mlist, msg, msgdata, pipeline)
  File "/usr/local/mailman/Mailman/Queue/IncomingRunner.py", line 153, in _dopipeline
    sys.modules[modname].process(mlist, msg, msgdata)
  File "/usr/local/mailman/Mailman/Handlers/CookHeaders.py", line 75, in process
    prefix_subject(mlist, msg, msgdata)
  File "/usr/local/mailman/Mailman/Handlers/CookHeaders.py", line 238, in prefix_subject
    headerbits = decode_header(subject)
  File "/usr/local/mailman/pythonlib/email/Header.py", line 113, in decode_header
    raise HeaderParseError
HeaderParseError
===

Thanks for any help. My knowledge of the Python language is poor.

--
WBR, Vitaliy Karlov [KV1670-RIPE]

From Markus.Zuelch at materna.de Wed Oct 29 05:07:07 2003
From: Markus.Zuelch at materna.de (Markus.Zuelch@materna.de)
Date: Sun Nov 2 11:43:38 2003
Subject: [Mailman-Developers] How do I got the email addresses?
Message-ID:

Hello,

I need a way to store (automatically, on the fly) each incoming new
subscriber email address in an Oracle database.
Also, I need a way to tell Mailman (version 2.1.3) not to collect the
email addresses from its own database, but rather from the Oracle
database BEFORE Mailman sends the email to the subscribed recipients.

Can anyone tell me which modules are responsible for handling the
subscriber and recipient email addresses?

Thank you very much
Markus

From chris at webarchitects.co.uk Thu Oct 30 06:29:49 2003
From: chris at webarchitects.co.uk (Chris Croome)
Date: Sun Nov 2 11:43:42 2003
Subject: Persistent message URIs and a mid redirector, was: Re: [Mailman-Developers] Requirements for a new archiver
In-Reply-To: <5479.1067466799@kanga.nu>
References: <3F9D08F9.6090209@student.umist.ac.uk> <20031027230854.A83988@gewis.win.tue.nl> <1067293730.1785.96.camel@anthem> <17914.1067319714@kanga.nu> <25767.1067446139@kanga.nu> <13832.1067452249@kanga.nu> <5479.1067466799@kanga.nu>
Message-ID: <20031030112949.GA424@webarchitects.co.uk>

Hi

Apologies for butting in without having had the time to properly follow
the thread...

For me, the thing I hate most about the current Mailman web archives is
the lack of persistent URIs: you open an mbox to edit out someone's
phone number that they sent to a public list by mistake, and after you
have rebuilt the archives most message URIs have changed, and as a
result dozens of carefully constructed wiki pages referencing these
emails are broken :-(

If a new archive only resulted in persistent URIs for messages I'd be
happy. I guess people know this classic?
  Cool URIs don't change
  http://www.w3.org/Provider/Style/URI

Also, are people aware of the neat hack that the W3C uses, whereby you
can get to any message in their list archives with a URI like this:

  http://www.w3.org/mid/$MID

In addition, each outgoing message from their list server has this URI
in the header, for example:

  X-Archived-At: http://www.w3.org/mid/20030204091000.GC2284@webarchitects.co.uk

This header is added with a procmail rule:

  http://groups.yahoo.com/group/rss-dev/message/3163

Chris

--
Chris Croome
web design                 http://www.webarchitects.co.uk/
web content management     http://mkdoc.com/

From sdhill at metasystema.net Fri Oct 31 18:35:18 2003
From: sdhill at metasystema.net (Simon Hill)
Date: Sun Nov 2 11:43:46 2003
Subject: [Mailman-Developers] URL change for Reply-To Munging Considered Useful
Message-ID:

Hello,

Please update the URL for my essay "Reply-To Munging Considered Useful"
in the help item for setting reply-to. The new URL is:

  http://www.metasystema.net/essays/reply-to.mhtml

Note that the new URL is at metasystema.net rather than metasystema.org.

There is no hurry, as metasystema.org will redirect to metasystema.net
for another year. However, after that metasystema.org will be retired,
so the sooner the change is made, the fewer users will get a dead link.

I'd like to take the opportunity to thank the developers of Mailman for
their efforts. I admin several lists and Mailman is my mailing list
manager of choice. I'd also like to thank you all for including a link
to my essay in Mailman - and an especial thanks to those of you who
disagree with my position but decided to include it anyway.

Best regards,

Simon Hill

From pchamorro at ingeomin.gov.co Sat Oct 25 01:22:37 2003
From: pchamorro at ingeomin.gov.co (Pablo Chamorro C.)
Date: Sun Nov 2 20:44:51 2003
Subject: [Mailman-Developers] serious web authentication problem
Message-ID:

I just created a new list and subscribed some users, but when a user
changes his password Mailman shows this message:

Bug in Mailman version 2.1.3

We're sorry, we hit a bug!

If you would like to help us identify the problem, please email a copy
of this page to the webmaster for this site with a description of what
happened. Thanks!

Traceback:

Traceback (most recent call last):
  File "/var/mailman/scripts/driver", line 87, in run_main
    main()
  File "/var/mailman/Mailman/Cgi/private.py", line 120, in main
    password, username):
  File "/var/mailman/Mailman/SecurityManager.py", line 219, in WebAuthenticate
    print self.MakeCookie(ac, user)
  File "/var/mailman/Mailman/SecurityManager.py", line 228, in MakeCookie
    raise ValueError
ValueError

I could reproduce this error on another server, with other lists and
users, and with any browser, including lynx. I was trying under RH 9.0.

Please help me or give me some idea... I have been setting up and
tuning Mailman (mhonarc, namazu, translating templates and messages),
but this bug makes me very sad.

Pablo Chamorro C.
--

From pchamorro at ingeomin.gov.co Tue Oct 28 00:22:20 2003
From: pchamorro at ingeomin.gov.co (Pablo Chamorro C.)
Date: Sun Nov 2 20:45:02 2003
Subject: [Mailman-Developers] two bugs in Mailman 2.1.3, one is critical for private maillists
Message-ID:

Dear developers,

I found two bugs; the first one is critical for private mailing lists:

1. When a user changes their password, web authentication against the
private archives or the membership options pages fails! You can
reproduce this bug even with this list, like I did. I tried only with
2.1.3, not with 2.1.2 or others.

2. The email delivery on/off commands don't work! To try 'set delivery
on' I had to set delivery off using the web interface, but it doesn't
work by email either. In both cases Mailman shows "delivery option
set", but the changes are not made.
This bug also occurs with the Mailman versions included with RH 9
(2.1-8 and 2.1.1-4). I used:

  set authenticate

and as a second line:

  set delivery off/on

as I wrote, but 'set show' returns the same.

Please tell me if I'm wrong, and please help me, as I'm implementing
some private mailing lists for my institution. If there is no easy
(early) solution, could you please tell me whether the fixed bug
reported in the release notes for Mailman 2.1.3 related to Postfix
applies to Mailman 2.1.1 or only to Mailman 2.1.2?

For the first one:

------ begin ------
Bug in Mailman version 2.1.3

We're sorry, we hit a bug!

If you would like to help us identify the problem, please email a copy
of this page to the webmaster for this site with a description of what
happened. Thanks!

Traceback:

Traceback (most recent call last):
  File "/usr/local/mailman-2.1/scripts/driver", line 87, in run_main
    main()
  File "/usr/local/mailman-2.1/Mailman/Cgi/options.py", line 206, in main
    password, user):
  File "/usr/local/mailman-2.1/Mailman/SecurityManager.py", line 219, in WebAuthenticate
    print self.MakeCookie(ac, user)
  File "/usr/local/mailman-2.1/Mailman/SecurityManager.py", line 228, in MakeCookie
    raise ValueError
ValueError
...
---- end ----

Thank you very much in advance.

Pablo Chamorro C.
http://www.ingeominas.gov.co
--

From ptsjohol at cc.jyu.fi Thu Oct 30 05:11:26 2003
From: ptsjohol at cc.jyu.fi (Pasi Sjoholm)
Date: Sun Nov 2 20:45:07 2003
Subject: [Mailman-Developers] Re: [Mailman-Users] unsubscribe_policy problem? (fixed, includes a patch)
In-Reply-To:
Message-ID:

Hello again,

nobody replied to me, so I fixed this myself. Here is the patch:

--- Mailman/Cgi/options.py~ Thu Oct 30 12:02:41 2003
+++ Mailman/Cgi/options.py Thu Oct 30 11:44:55 2003
@@ -156,9 +156,22 @@ def main():
     if cgidata.has_key('login-unsub'):
         # Because they can't supply a password for unsubscribing, we'll need
         # to do the confirmation dance.
+
         if mlist.isMember(user):
-            mlist.ConfirmUnsubscription(user, userlang)
-            doc.addError(_('The confirmation email has been sent.'), tag='')
+            # If we're doing admin-approved unsubs, don't worry about the password
+            if mlist.unsubscribe_policy:
+                try:
+                    mlist.Lock()
+                    try:
+                        mlist.DeleteMember(user, 'via the listinfo page', userack=1)
+                    except Errors.MMNeedApproval:
+                        doc.addError(_('Your unsubscription request has been forwarded to the list administrator for approval.'), tag='')
+                    mlist.Save()
+                finally:
+                    mlist.Unlock()
+            else:
+                mlist.ConfirmUnsubscription(user, userlang)
+                doc.addError(_('The confirmation email has been sent.'), tag='')
         else:
             # Not a member
             if mlist.private_roster == 0:

On Thu, 30 Oct 2003, Pasi Sjoholm wrote:

> This is possible when user is not logged in but when logged there will be
> a request to unsubscribe user "x" for list admin. So.. it's a bug =) I
> also tested it on 2.2.3.
>
> --
> Pasi Sjöholm
>
>
> On Thu, 30 Oct 2003, Pasi Sjoholm wrote:
>
> > Hello,
> >
> > I set up a mailman today and now I have a little problem. I have this
> > corporate mailing list and I have set unsubscribe_policy to yes for that
> > list but still users can unsubscribe via web gui after they have clicked
> > the unsubscribe and confirmed that they really want to unsubscribe via
> > email.
> >
> > I'm using Mailman 2.1.2... is this a bug or what?
> >
> > --
> > Pasi Sjöholm
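
For readers who want to follow the decision the patch above makes without
a Mailman install at hand, here is a minimal standalone sketch in modern
Python. It is only an illustration under stated assumptions: FakeList,
NeedApproval, handle_login_unsub and the "approved" set are hypothetical
stand-ins invented for this example, not Mailman's MailList API or its
Errors.MMNeedApproval. The one deliberate difference from the patch is
that the lock is taken before the try block, so the finally clause never
releases a lock that was not acquired.

```python
class NeedApproval(Exception):
    """Hypothetical stand-in for Mailman's Errors.MMNeedApproval."""


class FakeList:
    """Hypothetical stub of a mailing list; NOT Mailman's real MailList API."""

    def __init__(self, members, unsubscribe_policy, approved=()):
        self.members = set(members)
        self.unsubscribe_policy = unsubscribe_policy
        self.approved = set(approved)  # removals the admin already approved
        self.held = []                 # requests waiting for the admin
        self.confirmations = []        # confirmation emails "sent"
        self.locked = False
        self.saves = 0

    def lock(self):
        self.locked = True

    def unlock(self):
        self.locked = False

    def save(self):
        self.saves += 1

    def delete_member(self, user):
        # Assumption for this sketch: with the policy on, a removal is
        # held for the administrator unless it was already approved.
        if self.unsubscribe_policy and user not in self.approved:
            self.held.append(user)
            raise NeedApproval(user)
        self.members.discard(user)

    def confirm_unsubscription(self, user):
        self.confirmations.append(user)


def handle_login_unsub(mlist, user):
    """Return a status string for an unsubscribe request made without a password."""
    if user not in mlist.members:
        return "not a member"
    if not mlist.unsubscribe_policy:
        # No admin approval configured: do the usual confirmation dance.
        mlist.confirm_unsubscription(user)
        return "confirmation email sent"
    mlist.lock()  # acquire first, so 'finally' only releases a held lock
    try:
        try:
            mlist.delete_member(user)
            status = "removed"
        except NeedApproval:
            status = "held for approval"
        mlist.save()
    finally:
        mlist.unlock()
    return status


if __name__ == "__main__":
    corporate = FakeList({"a@example.com"}, unsubscribe_policy=True)
    print(handle_login_unsub(corporate, "a@example.com"))
```

With unsubscribe_policy set, the request ends up in the stub's held queue
and the address stays subscribed, which is the behaviour the original
report expected from the web interface.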