From skip at pobox.com Mon Jan 4 00:14:33 2010
From: skip at pobox.com (skip at pobox.com)
Date: Sun, 3 Jan 2010 17:14:33 -0600
Subject: [spambayes-dev] Interesting Google Alerts output
Message-ID: <19265.9433.89601.260963@montanaro.dyndns.org>

I've had a Google Alerts search set up for SpamBayes for quite a while.
Recently, I've begun getting output such as this:

>>>>> "ga" == Google Alerts writes:

ga> === Google Blogs Alert for: SpamBayes ===

ga> 71544 | Panthershoes12.ezua.com
ga> By admin
ga> Test and Computations per 14 CFR 21.303 and Aero Parts Mart Dwg. Author:
ga> haypo: Recipients: Rhamphoryncus, ola, gregory.p.smith, haypo, jcea,
ga> pitrou, sserrano: Date: 2008-08-20.14:06:15: SpamBayes Score: 0.0847759
ga> Manufactured by Wiha. ...
ga>
ga> Panthershoes12.ezua.com
ga>
ga> 72872 | Divisionshoes15.25u.com
ga> By admin
ga> Author: ncoghlan: Recipients: barry, erson, non, exarkun, ncoghlan, pitrou:
ga> Date: 2008-09-09.15:29:38: SpamBayes Score: 1.17516e-10: Marked as
ga> misclassified Christopher Bibbo, DO, DPM Marshfield Center 1000 N OAK AVE
ga> MARSHFIELD, ...
ga>
ga> Divisionshoes15.25u.com
ga>

Note that it appears to be some sort of mailing list archive, though look at
the date in the message and the apparent date in the URL. Whenever I click
the links given I get a 403 Forbidden response. Both ezua.com and 25u.com
seem to be owned by ChangeIP.com (also jkub.com, which appears in another
search result not shown above). Their homepages advertise free dynamic DNS.
If this is some sort of spam setup, it's hard for me to see how the spammers
might benefit from it.

Skip

From skip at pobox.com Sat Jan 9 16:53:10 2010
From: skip at pobox.com (skip at pobox.com)
Date: Sat, 9 Jan 2010 09:53:10 -0600
Subject: [spambayes-dev] Option to pass messages through Google Translate?
Message-ID: <19272.42598.21506.131803@montanaro.dyndns.org>

Crazy idea for the day...

I wonder if it would make sense to add an option to pass incoming messages
(or just message bodies) through Google Translate before scoring and
training? This thought occurred to me just now while composing a message to
the moderators of the new pyiran-organizers mailing list. Most of the
messages are probably written in Persian/Farsi. I just translated a message
using Google Translate, telling it to figure out the input language. While
the translation has the usual grammatical problems that make us snigger at
its abilities, it seems that it would certainly work well as SpamBayes input.

Skip
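
A minimal sketch of the idea above, assuming nothing about SpamBayes
internals: the message body is extracted, handed to a translation step, and
only then tokenized. The names body_text and tokens_after_translation are
hypothetical, the translate argument is a caller-supplied callable (no real
Google Translate client is used here), and str.split stands in for the
actual SpamBayes tokenizer.

```python
import email
from email import policy
from typing import Callable, List


def body_text(raw: str) -> str:
    """Return the text/plain body of a raw message, or '' if there is none."""
    msg = email.message_from_string(raw, policy=policy.default)
    body = msg.get_body(preferencelist=("plain",))
    return body.get_content() if body is not None else ""


def tokens_after_translation(
    raw: str,
    translate: Callable[[str], str],        # assumption: supplied by the caller
    tokenize: Callable[[str], List[str]] = str.split,  # stand-in tokenizer
) -> List[str]:
    """Translate the message body, then tokenize the translated text.

    If the translation step fails for any reason, fall back to the
    untranslated body so the message can still be scored or trained on.
    """
    text = body_text(raw)
    try:
        text = translate(text)
    except Exception:
        pass  # keep the original text
    return [token.lower() for token in tokenize(text)]


if __name__ == "__main__":
    raw_message = "From: a@example.com\nSubject: test\n\nsalam donya\n"

    # Toy "translator" used only for demonstration.
    def toy_translate(text: str) -> str:
        return text.replace("salam", "hello").replace("donya", "world")

    print(tokens_after_translation(raw_message, toy_translate))
```

Passing the translator in as a plain callable keeps the option easy to
switch off (use an identity function) and means scoring and training see
the same translated tokens as long as both go through the same step.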