From rsahagun at betterbeverages.com Fri Feb 3 14:54:19 2006
From: rsahagun at betterbeverages.com (Roberto Sahagun)
Date: Fri, 3 Feb 2006 07:54:19 -0600
Subject: [spambayes-dev] NO SUBJECT
Message-ID:

How can I mark as spam all the email which has a blank " " subject? I get
good email from a person, then the same person will send junk, and 90% of
the time the junk will not have a subject in the subject field.

rsahagun at betterbeverages.com

From webgota at webgota.com Sat Feb 18 23:56:13 2006
From: webgota at webgota.com (webgota at webgota.com)
Date: Sat, 18 Feb 2006 19:56:13 -0300 (GMT-03:00)
Subject: [spambayes-dev] contact - business
Message-ID: <40243498.1140303386328.JavaMail.Junior@webgota>

My name is Carlos and I work for webgota.com internet solutions.

Just to make new contacts, I'm sending this e-mail to tell you that I'm
available for all kinds of online and internet freelance and temp jobs
(HTML, PHP, ASP, MySQL, SQL Server, Flash, JAVA, Javascript, VBScript,
Delphi and also Banners, Advertisements, Designs, Logos, Adobe PDF Docs,
etc...).

I'm really good at working against the clock and I'm also on Skype, so if
you are too busy or have too much to do and a very short deadline, please
count on me to help. Skype me anytime (really, anytime) and have one more
skillful partner at your side.

You can see my work at: www.webgota.com (samples in the "show room" section)

Thank you,
A.
Carlos da Silva
www.webgota.com
skype: webgota
webgota at webgota.com
+55 16 39163589
(skype me anytime)

From skip at pobox.com Sun Feb 19 22:55:14 2006
From: skip at pobox.com (skip at pobox.com)
Date: Sun, 19 Feb 2006 15:55:14 -0600
Subject: [spambayes-dev] Remixed your site to XHTML (fwd)
Message-ID: <17400.59714.655533.556825@montanaro.dyndns.org>

Forwarding this along. It was sent to the listed SF admins for the
SpamBayes project. Probably deserves a broader audience.

Skip

-------------- next part --------------
An embedded message was scrubbed...
From: "Target"
Subject: Remixed your site to XHTML
Date: Mon, 20 Feb 2006 20:05:16 -0000
Size: 10201
Url: http://mail.python.org/pipermail/spambayes-dev/attachments/20060219/2c7cb850/attachment.mht

From tameyer at ihug.co.nz Mon Feb 20 07:01:25 2006
From: tameyer at ihug.co.nz (Tony Meyer)
Date: Mon, 20 Feb 2006 19:01:25 +1300
Subject: [spambayes-dev] Remixed your site to XHTML (fwd)
In-Reply-To: <17400.59714.655533.556825@montanaro.dyndns.org>
References: <17400.59714.655533.556825@montanaro.dyndns.org>
Message-ID: <235183EB-AD60-40D6-821A-54E135FC4F81@ihug.co.nz>

> After using spambayes and loving it I decided to contribute seeing
> as I had a couple of hours spare this afternoon. I took the liberty
> of converting your site to XHTML 1.0 Strict standards and sprucing
> it up a bit, site is now tableless and uses strict XHTML and CSS
> styling so is much more search engine friendly. Hope you don't
> think I've taken a liberty just wanted to help in some way, if you
> don't like it please just ignore this mail :).
>
> [...]
> If you would like to use it let me know and I'll email you over a
> zip of the files.

It looks nice enough, I guess. However, I do like the simple
edit-ht-files system that we have at the moment. Is integrating
something like this feasible with that system?

=Tony.Meyer

From skip at pobox.com Mon Feb 20 14:46:54 2006
From: skip at pobox.com (skip at pobox.com)
Date: Mon, 20 Feb 2006 07:46:54 -0600
Subject: [spambayes-dev] Remixed your site to XHTML (fwd)
In-Reply-To: <235183EB-AD60-40D6-821A-54E135FC4F81@ihug.co.nz>
References: <17400.59714.655533.556825@montanaro.dyndns.org>
 <235183EB-AD60-40D6-821A-54E135FC4F81@ihug.co.nz>
Message-ID: <17401.51278.579355.781774@montanaro.dyndns.org>

    Tony> It looks nice enough, I guess. However, I do like the simple
    Tony> edit-ht-files system that we have at the moment. Is integrating
    Tony> something like this feasible with that system?

I don't know. I would hope so. Maybe we should introduce this fellow to
ht2html and see how hard he thinks it would be.

Skip

From o.laudy at fss.uu.nl Fri Feb 24 14:00:52 2006
From: o.laudy at fss.uu.nl (Olav)
Date: Fri, 24 Feb 2006 14:00:52 +0100
Subject: [spambayes-dev] Bayesian classifier that uses Bayes factors
Message-ID: <011301c63942$579d28e0$d0170a0a@fss.soliscom.uu.nl>

Hello all,

I have a simple idea for the implementation of a Bayesian classifier that
uses Bayes factors.

Suppose we have the word "viagra" in the following situation:

The word was found in 10 ham mails, and was not seen in 20 ham mails
(= total 30 ham emails).
The word was found in 50 spam mails, and was not seen in 30 spam mails.

The procedure now is to calculate

    g(w) = 10/(10+20)
    b(w) = 50/(50+30)

and then

    p(w) = b(w)/(b(w)+g(w))

I suggest the following calculation: first add a prior value of 1 to each
cell (so there is no problem with non-observed words), then calculate the
log odds:

    LogOdds = log( (11*31) / (21*51) )

The standard deviation is given by

    stdev = sqrt( 1/11 + 1/21 + 1/51 + 1/31 )

Next is to calculate the Bayes factor that a word is a spam indicator
versus that it is not a spam indicator:

    help = pNorm( 0, LogOdds, stdev )

where pNorm is, in the words of Gary, "the inverse normal function, used to
derive a p-value from a normal-distributed random variable".

The Bayes factor is given by

    BF = help/(1-help)

The interpretation is simple: if the value is larger than 1, the mail is
more likely to be spam. The number can be given a better interpretation,
but for the moment the criterion is: larger than 1 = spam, smaller than
1 = ham.

For Bayes factors, the product rule applies: the total Bayes factor is the
product of the Bayes factors of the individual words in the email to be
classified.

    BF_total = BF(word_1) * BF(word_2) * ... * BF(word_n)

Some values using 1 word:

    H: 10/10    S: 50/50    BF=1
    H: 100/100  S: 500/500  BF=1
    -----------------------------------
    H: 1/2      S: 3/4      BF=1.5
    H: 10/20    S: 30/40    BF=4.3
    -----------------------------------
    H: 3/10     S: 50/10    BF=very small

Any suggestions?

All the best,

Olav Laudy
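
For readers who want to try the numbers, the calculation described in the
message above can be sketched in Python roughly as follows. This is only an
illustration under stated assumptions, not code from SpamBayes: the function
names (normal_cdf, bayes_factor, total_bayes_factor) are made up for this
sketch, and pNorm is read as the normal cumulative distribution function
evaluated at 0 with mean LogOdds and standard deviation stdev, which is one
plausible reading of the formula.

```python
import math


def normal_cdf(x, mean, stdev):
    """Cumulative distribution function of a normal distribution.

    Assumption: this is what the message calls pNorm(x, mean, stdev).
    """
    return 0.5 * (1.0 + math.erf((x - mean) / (stdev * math.sqrt(2.0))))


def bayes_factor(ham_with, ham_without, spam_with, spam_without, prior=1.0):
    """Per-word Bayes factor as outlined in the message (hypothetical helper).

    `prior` is added to every cell of the 2x2 table; with prior=0 any
    zero count makes the calculation undefined.
    """
    a = ham_with + prior       # ham mails containing the word
    b = ham_without + prior    # ham mails without the word
    c = spam_with + prior      # spam mails containing the word
    d = spam_without + prior   # spam mails without the word

    log_odds = math.log((a * d) / (b * c))
    stdev = math.sqrt(1.0 / a + 1.0 / b + 1.0 / c + 1.0 / d)

    # "help" in the message: probability mass below 0.
    p = normal_cdf(0.0, log_odds, stdev)
    if p >= 1.0:
        return math.inf  # avoid dividing by zero for extreme evidence
    return p / (1.0 - p)


def total_bayes_factor(factors):
    """Product rule: multiply the Bayes factors of the individual words."""
    total = 1.0
    for factor in factors:
        total *= factor
    return total


if __name__ == "__main__":
    # The "viagra" example: 10/20 ham, 50/30 spam, prior of 1 per cell.
    bf = bayes_factor(10, 20, 50, 30)
    print("BF =", bf, "-> spam" if bf > 1 else "-> ham")
```

Under this reading, a word with symmetric counts such as 10/10 ham and 50/50
spam gives a Bayes factor of 1, matching the first row of the table in the
message. The 10/20 versus 30/40 row comes out at roughly 4.3 when called with
prior=0, which suggests the listed values may have been computed without the
prior, but that is a guess.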