[Spambayes] sharing split database

bill parducci bill at parducci.net
Tue May 20 09:48:40 EDT 2003



Skip Montanaro wrote:
> My "word list" contains a lot of domain names and email addresses like
> 
>     from:name:concertmaster at musi-cal.com
> 
> I imagine that sort of stuff would be sensitive to some people.  I think if
> you want a shared word list you'll have to be selective about what's shared.

although, if shared words were harvested automatically (e.g. 'if 
word/weight not found, insert word...'), the only 'local' information 
contributed would be the key/weight pair. as long as the shared words db 
didn't maintain source info (i don't know why it would... <broken record> 
although an insert date would be nice for freshness management :o) 
</broken record>), the information would be anonymous. also, email 
addresses, which have to be sent in the clear over the wire anyway, are 
unlikely to be sensitive. that just leaves the payload itself, and since 
everything is, by definition, broken up into tokens, the context should be 
pretty much obscured.
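
to make the idea concrete, here is a rough python sketch of the 'insert if not found' store described above. it is only an illustration under my own assumptions: the class name SharedWordStore and its methods offer/lookup/expire are hypothetical and not part of spambayes. each entry holds just the token, its weight and an insert time, with no field for who contributed it.

```python
import time


class SharedWordStore:
    """Hypothetical anonymous shared token store (not a Spambayes API)."""

    def __init__(self):
        # token -> (weight, insert_time); deliberately no contributor/source field
        self._words = {}

    def offer(self, token, weight, now=None):
        """Insert the token only if it is not already present.

        Returns True if the token was inserted, False if it already existed.
        """
        if token in self._words:
            return False
        stamp = time.time() if now is None else now
        self._words[token] = (weight, stamp)
        return True

    def lookup(self, token):
        """Return the stored weight for a token, or None if it is unknown."""
        entry = self._words.get(token)
        return None if entry is None else entry[0]

    def expire(self, max_age, now=None):
        """Drop entries older than max_age seconds; return how many were dropped."""
        now = time.time() if now is None else now
        stale = [t for t, (_, stamp) in self._words.items() if now - stamp > max_age]
        for t in stale:
            del self._words[t]
        return len(stale)
```

a client would call offer() for each local token it wants to share, and the insert time is what expire() uses for freshness management. whether a first-writer-wins weight is the right policy is an open question; the sketch only shows that no source info needs to be kept.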

b



