From anderson.antony at hotmail.com Tue Jun 5 10:43:01 2007 From: anderson.antony at hotmail.com (Anderson Antony) Date: Tue, 05 Jun 2007 14:13:01 +0530 Subject: [spambayes-dev] your link has been uploaded Message-ID: An HTML attachment was scrubbed... URL: http://mail.python.org/pipermail/spambayes-dev/attachments/20070605/4b8f9562/attachment.htm From louis.mason at hotmail.com Fri Jun 8 14:28:06 2007 From: louis.mason at hotmail.com (Louis Mason) Date: Fri, 08 Jun 2007 17:58:06 +0530 Subject: [spambayes-dev] Your link has been uploaded Message-ID: NOTE:If we have offended you by sending an email, please read our FAQ's, else for link exchange read below. Dear Webmaster, My name is Louis and I wanted to let you know that we have already placed a link to your site. You can check it here: http://www.smithfibercast.com/resources/water-cooling.htm Your link details are as Follows: Water Cooler - Plumbed in pure chilled filtered water coolers and fountains. Kindly link back to our site with the following details: Title: GRE pipe Desc: Smith Fibercast is a world leader in providing GRE pipe. For nearly 60 years our "time-tested" products have proven their durability URL: http://www.smithfibercast.com Our HTML Code:

GRE pipe - Smith Fibercast is a world leader in providing GRE pipe. For nearly 60 years our "time-tested" products have proven their durability.

Please note that the link to your site will be active for 10 business days, if thereafter we do not detect a link to our site from your webpage, it will be assumed that you are not interested in reciprocal link and to be fair to our other link partners, we shall remove your link. We both can benefit from link exchange as it may send traffic through links and also improve ranking in search engines. Category: gas-piping / pipe-cleaning / water-cooling / piping-accessories / water-treatment /Miscellaneous. How do you find my email address? We found your site through google search and your active links page indicates that you are interested in link exchange business. This is the only reason we are mailing you. Are you into SPAMMING? No way! We are not spammers and are against spamming of any kind. We are sending this mail with sole intention of link exchange for mutual benefit. To see your options on how you can respond to this email, read below. How about my PRIVACY? We are not list sellers. We do not rent out email to anyone. Our sole purpose is to contact webmasters for the purpose of link exchange, nothing else. How many emails will you send to me We will send you three mails. One is link activation email, second is reminder email (within 3-5 days) and finally link de-activation email (within 7-10 days). If you choose to "unsubscribe" first time, you will not receive second and third mail. Are you not blacklisted by Spamcop, Habeas or any other email trust services? No, we believe we are not spamming or invading privacy of anyone. We are sending link exchange email and have disclosed our email frequency to you. Along with that, we are providing every option to move ahead with the link exchange or unsubscription. How can I respond to this email? You may respond to this mail in three ways: 1. You can choose to place link on your website. Please reply to this email to show your willingness for link exchange. 2. 
You may unsubscribe from this project because you do not like the linking website. However, we may approach you if you are interested in other link exchange project. We will deactivate your link at present. Please email back with the subject "Unsubscribe" 3. You may want to UNSUBSCRIBE COMPLETELY. We will never contact you again. You will be unsubscribed within 48 hours and your link will be deactivated. Please email back with the subject "Unsubscribe Completely". Email : Louis.Mason at hotmail.com password : lm~ASDF _________________________________________________________________ Catch the complete World Cup coverage with MSN http://content.msn.co.in/Sports/Cricket/Default.aspx

From skip at pobox.com  Sun Jun 10 17:53:01 2007
From: skip at pobox.com (skip at pobox.com)
Date: Sun, 10 Jun 2007 10:53:01 -0500
Subject: [spambayes-dev] SpamBayes core_server.py and related bits merged to CVS HEAD
Message-ID: <18028.7773.116803.932258@montanaro.dyndns.org>

I just merged the code for a new application, core_server.py, to the
SpamBayes CVS HEAD. When I checked in the changes on HEAD I got a bunch of
messages like this:

    cvs diff: Tag 1.1 refers to a dead (removed) revision in file `core_server.py'.
    cvs diff: No comparison available. Pass `-N' to `cvs diff'?

It looks like the checkin actually worked, but I've never seen that message
before.

This new application, scripts/core_server.py, is fundamentally the same as
the preexisting POP3 proxy, but uses a simple plugin scheme to support
different protocol adapters. The only plugin written so far is
spambayes/XMLRPCPlugin.py, which, as you might guess, allows messages to be
scored using XML-RPC calls. There are two methods, score and score_mime.
The latter is pretty much what we are used to - essentially shoot an email
(or email-like) message over the pipe and get a score back.
The former method (maybe it should be named score_form) accepts a
dictionary representing a form submission, a set of extra tokens generated
by the client (such as "was the submission from an anonymous user?") and a
set of attachments. The last two args can be empty (though because of
XML-RPC constraints they can't be optional).

The first application of this is likely to be the new Roundup-based Python
tracker. I wrote a simple Roundup auditor for that purpose today. I'll be
testing that over the next few days. The second application will likely be
MoinMoin. Marian Neagul is doing a Google Summer of Code project on page
classification which this might fit into nicely. In theory, any web site
can use it, though, as long as it can speak XML-RPC.

BTW, before I merged I created a tag, BEFORE_CORESVR_MERGE. I also created
an AFTER_CORESVR_MERGE tag after the big checkin. Can people give the
existing applications a whirl to make sure I didn't break anything?

For Reimar and Marian (the MoinMoin gurus), I did a very little bit of
performance testing. Roundtrip performance on my laptop (Mac PowerBook G4 -
800MHz) with both the server and client running on the same machine ranged
anywhere from 10-50 bytes/ms. When I added a large payload (a MIME encoded
JPEG file of 9.5MB) performance in terms of bytes/ms shot way up, but as you
would imagine overall time did as well.
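
As a rough illustration of the score call described in this message, the
sketch below marshals such a request with Python's standard xmlrpc.client
module (xmlrpclib in the Python 2 of the time). It is not taken from
XMLRPCPlugin.py: the form field names are made up, and the argument order
(form, extra tokens, attachments) is assumed from the description. It only
shows that the two trailing arguments are always sent, as empty lists when
unused, and that the token "set" has to travel as a list because XML-RPC has
no set type.

```python
import xmlrpc.client


def build_score_request(form, extra_tokens=(), attachments=()):
    """Marshal a call to the `score` method as an XML-RPC request body.

    Assumed argument order: form dict, extra tokens, attachments.
    XML-RPC has no optional arguments and no set type, so the last two
    are always sent, as (possibly empty) lists.
    """
    params = (dict(form), list(extra_tokens), list(attachments))
    return xmlrpc.client.dumps(params, methodname="score")


# Hypothetical form fields; a real client would use its own field names.
form = {"title": "Hello", "body": "Just a test page."}
request_xml = build_score_request(form)

# Decode the request again to see what the server side would receive.
params, method = xmlrpc.client.loads(request_xml)
print(method, params)
```

Against a running server the same three arguments would presumably go
through xmlrpc.client.ServerProxy(url).score(form, [], []), where url
depends on the port configured for the plugin.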
Here are some figures:

    attachment size (bytes)    time        bytes/ms
    9587824                    30.7 sec    312
    975978                      3.7 sec    259
    114794                      0.5 sec    252
    28675                       0.2 sec    142

Thx,

Skip

From skip at pobox.com  Sun Jun 10 19:06:13 2007
From: skip at pobox.com (skip at pobox.com)
Date: Sun, 10 Jun 2007 12:06:13 -0500
Subject: [spambayes-dev] SpamBayes core_server.py and related bits merged to CVS HEAD
In-Reply-To: <18028.7773.116803.932258@montanaro.dyndns.org>
References: <18028.7773.116803.932258@montanaro.dyndns.org>
Message-ID: <18028.12165.660582.554248@montanaro.dyndns.org>

    skip> For Reimar and Marian (the MoinMoin gurus), I did a very little
    skip> bit of performance testing. Roundtrip performance on my laptop
    skip> (Mac PowerBook G4 - 800MHz) with both the server and client
    skip> running on the same machine ranged anywhere from 10-50 bytes/ms.
    skip> When I added a large payload (a MIME encoded JPEG file of 9.5MB)
    skip> performance in terms of bytes/ms shot way up, but as you would
    skip> imagine overall time did as well. Here are some figures:

    skip>     attachment      time        bytes/ms
    skip>     size
    skip>     9587824         30.7 sec    312
    skip>     975978           3.7 sec    259
    skip>     114794           0.5 sec    252
    skip>     28675            0.2 sec    142

I probably should have drawn some inferences from this. First, if you
really try to score 100MB payloads (Reimar & Marian suggested that some
people routinely attach 100MB Word (I think) files to wikis), you're going
to be disappointed. Second, although attachments of that size would be
problematic, since SpamBayes doesn't examine the guts of binary data,
there's probably nothing wrong with trimming the binary file to a reasonable
size (< 1MB?) and including that trimmed version in the score request.

Also, note that I've really done nothing with non-ASCII data to this point.
I suspect people more familiar with that will see a clear path to sanity
fairly easily.
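
To put a number on "disappointed", here is a back-of-the-envelope sketch
using the posted figures. It assumes the best measured rate would hold for a
100MB payload, which is optimistic, and trim_attachment is a hypothetical
helper for the "< 1MB?" idea, not something in SpamBayes. Recomputed rates
for the smaller files come out slightly different from the posted column,
presumably because the times shown are rounded.

```python
# (attachment size in bytes, round-trip time in seconds), copied from the table.
MEASUREMENTS = [(9587824, 30.7), (975978, 3.7), (114794, 0.5), (28675, 0.2)]


def bytes_per_ms(size, seconds):
    """Throughput in bytes per millisecond."""
    return size / (seconds * 1000.0)


def estimated_seconds(size, rate):
    """Time to score `size` bytes at `rate` bytes/ms, assuming the rate holds."""
    return size / rate / 1000.0


def trim_attachment(data, limit=1_000_000):
    """Keep only the first `limit` bytes (hypothetical helper)."""
    return data[:limit]


best_rate = max(bytes_per_ms(size, secs) for size, secs in MEASUREMENTS)
hundred_mb = 100 * 1024 * 1024
full = estimated_seconds(hundred_mb, best_rate)
trimmed = estimated_seconds(len(trim_attachment(bytes(2_000_000))), best_rate)

print(f"best rate: {best_rate:.0f} bytes/ms")
print(f"100MB attachment: about {full / 60:.1f} minutes")
print(f"trimmed to 1MB: about {trimmed:.1f} seconds")
```

At that rate a full 100MB attachment works out to more than five minutes per
score request, while a trimmed one stays in the range of a few seconds.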
Skip

From dave at boost-consulting.com  Tue Jun 12 18:57:13 2007
From: dave at boost-consulting.com (David Abrahams)
Date: Tue, 12 Jun 2007 12:57:13 -0400
Subject: [spambayes-dev] Near-twin ham/spam, Train-to-exhaustion, feature ideas
Message-ID: <87ejkhxg9i.fsf@grogan.peloton>

When I run the tte script, it always stops after 4 or 5 rounds. If it goes
beyond 6 rounds it's a sure sign that I've misclassified something (**).
What I do then is run the script with -v and let it show me the messages
it's training on. In later runs it's always training one or two messages
that are the culprits. I just look for those message IDs in TBird and move
them into the right training sets.

I do this training automatically on my server, so what I'd like to do is
have the script automatically email me a notice identifying problem
messages in my training set. Maybe it should even mark them deleted (I use
IMAP) and restart the process.

Thoughts?

(**) Technically speaking, running for 6 or more rounds doesn't necessarily
identify a misclassification. Sometimes it identifies a correctly-classified
message for which there is an oppositely-classified near-twin. Today, it had
some trouble with a "correctly" classified-as-ham Mailman moderation request
message containing a piece of spam that I had also received directly and
thus classified as spam. So everything was classified "correctly."

What prompted me to look at this situation was that lots of ham started
falling into my "unsure" folder today. So despite the fact that everything
was classified correctly, overall performance was noticeably reduced. I'm
tempted to conclude that TTE running for more than 5 rounds just indicates a
classification that's bad for performance.

I guess my next question is whether this near-twin classification (the
difference being that one of the messages was a moderation request) is
supposed to work well? I guess if I have some other moderation requests in
my spam folder that could really confuse things...
hmm, straightened that out and it still didn't finish training speedily.

--
Dave Abrahams
Boost Consulting
http://www.boost-consulting.com

The Astoria Seminar ==> http://www.astoriaseminar.com

From skip at pobox.com  Thu Jun 14 14:13:35 2007
From: skip at pobox.com (skip at pobox.com)
Date: Thu, 14 Jun 2007 07:13:35 -0500
Subject: [spambayes-dev] SpamBayes core_server.py and related bits merged to CVS HEAD
In-Reply-To: <46712C24.9070807@info.uvt.ro>
References: <18028.7773.116803.932258@montanaro.dyndns.org> <18028.12165.660582.554248@montanaro.dyndns.org> <46712C24.9070807@info.uvt.ro>
Message-ID: <18033.12527.825945.716559@montanaro.dyndns.org>

    Marian> When this is not the case we could do the parsing locally, hash
    Marian> the tokens, build the feature vector and send it to the remote
    Marian> classifier. This way the local application would not disclose
    Marian> sensitive information.

I'm not sure that will work. The SpamBayes classifier works off a token
stream which does nothing to "hide" the tokens. Sure, it creates a set() of
the tokens in the message, but if my phone number, email address and credit
card information are in the message they will be in the token stream as
well.

    Marian> Another important problem is related to using a single
    Marian> classifier for several applications (possibly with a totally
    Marian> different content). IMHO the spam for an application might be
    Marian> totally different than the spam of another, or to be more exact:
    Marian> the ham/spam features might differ. In this case the result of
    Marian> the classification might not be relevant. "My SPAM is not your
    Marian> SPAM" :)

Sure. Each wiki on a single physical server might well want its own spam
classifier. Just set up the configuration bits properly (port numbers
mostly) and fire up multiple classifiers.
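
Marian's suggestion quoted in this message, hashing tokens locally before
they leave the machine, might look roughly like the sketch below. The
whitespace tokenizer is a stand-in (SpamBayes has its own tokenizer) and the
keyed HMAC-SHA256 scheme is an assumption chosen for illustration; nothing
here is an existing SpamBayes interface. The same token always maps to the
same digest, so a remote classifier could still count it, but whether that
hides enough in practice is exactly the open question in this exchange.

```python
import hashlib
import hmac


def hashed_tokens(text, key):
    """Return the set of keyed digests for the tokens in `text`.

    Stand-in tokenizer: lowercase, then split on whitespace.
    """
    tokens = set(text.lower().split())
    return {
        hmac.new(key, token.encode("utf-8"), hashlib.sha256).hexdigest()
        for token in tokens
    }


key = b"per-site secret"  # hypothetical; each site would pick its own
first = hashed_tokens("call me at 555-0100", key)
second = hashed_tokens("Call me tomorrow", key)

# Shared words ("call", "me") produce shared digests; the raw strings
# themselves never appear in what would be sent.
print(len(first), len(first & second))
```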
Skip

From marian at info.uvt.ro  Thu Jun 14 13:53:08 2007
From: marian at info.uvt.ro (Marian Neagul)
Date: Thu, 14 Jun 2007 14:53:08 +0300
Subject: [spambayes-dev] SpamBayes core_server.py and related bits merged to CVS HEAD
In-Reply-To: <18028.12165.660582.554248@montanaro.dyndns.org>
References: <18028.7773.116803.932258@montanaro.dyndns.org> <18028.12165.660582.554248@montanaro.dyndns.org>
Message-ID: <46712C24.9070807@info.uvt.ro>

Hello,

The 100MB files should not be a really big problem because we could
truncate them to their headers (or just a part of them) and send them as
normal features to SB.

We should use a remote SB classifier only when the network and classifier
are trusted. When this is not the case we could do the parsing locally, hash
the tokens, build the feature vector and send it to the remote classifier.
This way the local application would not disclose sensitive information.

Another important problem is related to using a single classifier for
several applications (possibly with a totally different content). IMHO the
spam for an application might be totally different than the spam of another,
or to be more exact: the ham/spam features might differ. In this case the
result of the classification might not be relevant. "My SPAM is not your
SPAM" :)

M.

skip at pobox.com wrote:
>     skip> For Reimar and Marian (the MoinMoin gurus), I did a very little
>     skip> bit of performance testing. Roundtrip performance on my laptop
>     skip> (Mac PowerBook G4 - 800MHz) with both the server and client
>     skip> running on the same machine ranged anywhere from 10-50 bytes/ms.
>     skip> When I added a large payload (a MIME encoded JPEG file of 9.5MB)
>     skip> performance in terms of bytes/ms shot way up, but as you would
>     skip> imagine overall time did as well. Here are some figures:
>
>     skip>     attachment      time        bytes/ms
>     skip>     size
>     skip>     9587824         30.7 sec    312
>     skip>     975978           3.7 sec    259
>     skip>     114794           0.5 sec    252
>     skip>     28675            0.2 sec    142
>
> I probably should have drawn some inferences from this. First, if you
> really try to score 100MB payloads (Reimar & Marian suggested that some
> people routinely attach 100MB Word (I think) files to wikis), you're going
> to be disappointed. Second, although attachments of that size would be
> problematic, since SpamBayes doesn't examine the guts of binary data,
> there's probably nothing wrong with trimming the binary file to a reasonable
> size (< 1MB?) and including that trimmed version in the score request.
>
> Also, note that I've really done nothing with non-ASCII data to this point.
> I suspect people more familiar with that will see a clear path to sanity
> fairly easily.
>
> Skip
>
>

From cf_105_rl206 at hotmail.com  Fri Jun 22 01:46:50 2007
From: cf_105_rl206 at hotmail.com (Jason Eldridge)
Date: Thu, 21 Jun 2007 16:46:50 -0700
Subject: [spambayes-dev] Outlook 2007 Support?
Message-ID:

Hi there,

First I would like to thank you for such a lifesaving product as SpamBayes.
At one point I was getting over 1,000 spam messages a day routed to me via 3
domain names, and SpamBayes addressed 95% of the spam correctly. I have just
rebuilt my PC and plan to use Office 2007. Your website and FAQs do not have
any info (that I could locate) on Outlook 2007 support. Please advise, and
keep up the good work!

Jason
currently posted to CFB Borden

 

-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://mail.python.org/pipermail/spambayes-dev/attachments/20070621/d679d60e/attachment.html

From skip at pobox.com  Mon Jun 25 13:39:06 2007
From: skip at pobox.com (skip at pobox.com)
Date: Mon, 25 Jun 2007 06:39:06 -0500
Subject: [spambayes-dev] Subversion conversion?
Message-ID: <18047.43354.333535.298019@montanaro.dyndns.org>

Quite some time ago someone proposed a switch to Subversion. I expressed
some concerns as the PSF was going through the switch at the time and the
verdict wasn't quite in as far as I was concerned. I've since then gained
enough experience with Subversion to no longer be such a Luddite. Should we
switch now? After the next alpha release? After the next beta release?
Never?

Skip

From skip at pobox.com  Mon Jun 25 16:02:37 2007
From: skip at pobox.com (skip at pobox.com)
Date: Mon, 25 Jun 2007 09:02:37 -0500
Subject: [spambayes-dev] 1.1a4
Message-ID: <18047.51965.621617.920785@montanaro.dyndns.org>

I cut a 1.1a4 release this morning. The main new thing is the core_server
app. I haven't yet updated the website, but will try to get to that a bit
later today. Can someone with Windows please give the zip file a try? I
followed Anthony's "rather eat live worms than use Windows" instructions for
building a zip file.

Skip

From skip at pobox.com  Tue Jun 26 01:02:41 2007
From: skip at pobox.com (skip at pobox.com)
Date: Mon, 25 Jun 2007 18:02:41 -0500
Subject: [spambayes-dev] Website updated for 1.1a4
Message-ID: <18048.18833.587874.720840@montanaro.dyndns.org>

I believe the website has been updated to reflect the 1.1a4 source release I
cut today.
The only sticking point was that I was unable to install the gpg
signature files in htdocs/sigs because of group permission problems. I sent
a note to Tony (he's the owner of that directory).

If someone could add an Outlook installer that would be great. (Even
greater would be if I could create an Outlook installer on my Mac, but I
suppose that's not going to happen anytime soon. ;-)

Skip

From mhammond at skippinet.com.au  Tue Jun 26 04:36:17 2007
From: mhammond at skippinet.com.au (Mark Hammond)
Date: Tue, 26 Jun 2007 12:36:17 +1000
Subject: [spambayes-dev] Subversion conversion?
In-Reply-To: <18047.43354.333535.298019@montanaro.dyndns.org>
Message-ID: <00c401c7b79a$c6427840$0e0a0a0a@enfoldsystems.local>

> Quite some time ago someone proposed a switch to Subversion. I expressed
> some concerns as the PSF was going through the switch at the time and the
> verdict wasn't quite in as far as I was concerned. I've since then gained
> enough experience with Subversion to no longer be such a Luddite. Should
> we switch now? After the next alpha release? After the next beta release?

I'm happy to switch at any time, but would also be happy to stay with CVS
given the low rate of checkins this project currently gets.

Mark

From skip at pobox.com  Tue Jun 26 04:44:38 2007
From: skip at pobox.com (skip at pobox.com)
Date: Mon, 25 Jun 2007 21:44:38 -0500
Subject: [spambayes-dev] Subversion conversion?
In-Reply-To: <00c401c7b79a$c6427840$0e0a0a0a@enfoldsystems.local>
References: <18047.43354.333535.298019@montanaro.dyndns.org> <00c401c7b79a$c6427840$0e0a0a0a@enfoldsystems.local>
Message-ID: <18048.32150.680183.631952@montanaro.dyndns.org>

    >> I've since then gained enough experience with Subversion to no longer
    >> be such a Luddite. Should we switch now? After the next alpha
    >> release? After the next beta release?

    Mark> I'm happy to switch at any time, but would also be happy to stay
    Mark> with CVS given the low rate of checkins this project currently
    Mark> gets.
My main motivation for switching sooner rather than later is that I have
been doing much of my recent work on the train. Using Subversion would (I
believe - correct me if I'm wrong) allow me to check my current diffs
without being online.

Skip

From spambayes-dev at tangomu.com  Tue Jun 26 09:12:24 2007
From: spambayes-dev at tangomu.com (Tony Meyer)
Date: Tue, 26 Jun 2007 19:12:24 +1200
Subject: [spambayes-dev] Subversion conversion?
In-Reply-To: <18048.32150.680183.631952@montanaro.dyndns.org>
References: <18047.43354.333535.298019@montanaro.dyndns.org> <00c401c7b79a$c6427840$0e0a0a0a@enfoldsystems.local> <18048.32150.680183.631952@montanaro.dyndns.org>
Message-ID: <5126ADE9-F0DC-457E-A7F7-BC1BFE62EE15@tangomu.com>

> My main motivation for switching sooner rather than later is that I have
> been doing much of my recent work on the train. Using Subversion would (I
> believe - correct me if I'm wrong) allow me to check my current diffs
> without being online.

+1 to switching, and +1 to doing it now. This reason (diffing offline) is
really the only noticeable difference for me, with the limited way that I
use CVS/SVN, but it's a huge benefit (not just to be able to do it offline,
but the big speed up and removal of the logging in requirement).

Cheers,
Tony

From spambayes-dev at tangomu.com  Tue Jun 26 09:19:19 2007
From: spambayes-dev at tangomu.com (Tony Meyer)
Date: Tue, 26 Jun 2007 19:19:19 +1200
Subject: [spambayes-dev] Website updated for 1.1a4
In-Reply-To: <18048.18833.587874.720840@montanaro.dyndns.org>
References: <18048.18833.587874.720840@montanaro.dyndns.org>
Message-ID: <10595FC0-B07E-482D-9C98-F0CE05078918@tangomu.com>

Thanks for biting the bullet and doing 1.1a4!

> I believe the website has been updated to reflect the 1.1a4 source release
> I cut today. The only sticking point was that I was unable to install the
> gpg signature files in htdocs/sigs because of group permission problems. I
> sent a note to Tony (he's the owner of that directory).
And this after I pointed out the same problem with another directory to
Mark just a few weeks ago :) I believe this is fixed now - ping me again if
it isn't.

> If someone could add an Outlook installer that would be great. (Even
> greater would be if I could create an Outlook installer on my Mac, but I
> suppose that's not going to happen anytime soon. ;-)

Unless Mark beats me to it, I should be able to do this in a day or two
(assuming the changes he made to py2exe removing the requirement for
Outlook 2000 work!). I'll also check the .zip file at the same time, but I'm
pretty sure we never had a problem with ones that Anthony did, so it should
be fine.

Cheers,
Tony

From 767704 at bk.ru  Wed Jun 27 10:26:26 2007
From: 767704 at bk.ru (ЖБИ кольца Новосибирск)
Date: Wed, 27 Jun 2007 12:26:26 +0400
Subject: [spambayes-dev] (no subject)
In-Reply-To: <200706270825.l5R8NOoh018174@hostoprav.ru>
References: <200706270825.l5R8NOoh018174@hostoprav.ru>
Message-ID:

?????? ??? ?????? ?????? ???????? ????????? ?????? ???????? ? ????????????:
+79134651001 (383)2129690
------------------------------------
Mail.Ru - ??????, ????????, ???????!
------------------------------------

From 789034 at bk.ru  Wed Jun 27 10:26:33 2007
From: 789034 at bk.ru (ЖБИ кольца Новосибирск)
Date: Wed, 27 Jun 2007 12:26:33 +0400
Subject: [spambayes-dev] (no subject)
In-Reply-To: <200706270825.l5R8NOoh018174@hostoprav.ru>
References: <200706270825.l5R8NOoh018174@hostoprav.ru>
Message-ID:

?????? ??? ?????? ?????? ???????? ????????? ?????? ???????? ? ????????????:
+79134651001 (383)2129690
------------------------------------
Mail.Ru - ??????, ????????, ???????!
------------------------------------

From mhammond at skippinet.com.au  Fri Jun 29 04:07:00 2007
From: mhammond at skippinet.com.au (Mark Hammond)
Date: Fri, 29 Jun 2007 12:07:00 +1000
Subject: [spambayes-dev] 1.1a4
In-Reply-To: <18047.51965.621617.920785@montanaro.dyndns.org>
Message-ID: <000a01c7b9f2$2e315e40$0200a8c0@enfoldsystems.local>

> I cut a 1.1a4 release this morning. The main new thing is the core_server
> app. I haven't yet updated the website, but will try to get to that a bit
> later today. Can someone with Windows please give the zip file a try? I
> followed Anthony's "rather eat live worms than use Windows" instructions
> for building a zip file.

I've cut both a binary installer and a binary zip file for Windows. Sadly,
I still checked a few things in before cutting it, so I'm not sure it still
qualifies as truly 1.1a4 - but I think it is close enough :)

http://starship.python.net/crew/mhammond/spambayes-1.1a4.exe
http://starship.python.net/crew/mhammond/spambayes-1.1a4-win32.zip

I haven't actually tested the installer yet, but have tested the resulting
binaries. If people here would like to test them, we can move them to the
official site for public consumption. Alternatively, if Tony or anyone else
would like to repackage things, please be my guest!

Note that these packages both include gocr.exe and the gocr.txt which notes
the original location and GPL status of the executable.

Cheers,

Mark