[Spambayes] sharing split database
bill parducci
bill at parducci.net
Tue May 20 09:48:40 EDT 2003
Skip Montanaro wrote:
> My "word list" contains a lot of domain names and email addresses like
>
> from:name:concertmaster at musi-cal.com
>
> I imagine that sort of stuff would be sensitive to some people. I think if
> you want a shared word list you'll have to be selective about what's shared.
although, if shared words were harvested automatically (e.g. 'if
word/weight not found, insert word...'), the only 'local' information
would be the key/weight pair. as long as the shared words db didn't
maintain source info, the information would be anonymous. (don't know why
it would keep source info... <broken record> although insert date would be
nice for freshness management :o) </broken record>) also, email addresses,
which have to be sent in the clear over the wire anyway, are unlikely to be
sensitive. that just leaves the payload itself, and since, by definition,
everything is broken up into tokens, context should be pretty much obscured.
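
a rough sketch of the 'if word/weight not found, insert word' idea, just to make it concrete. everything here is an assumption for illustration: the sqlite storage, the table name shared_words, and the helper names open_shared_db, harvest and expire are all made up, and none of it is spambayes code. the point is only that the table holds a token, a weight and an insert date, with no column for who contributed the row.

```python
import sqlite3
from datetime import date, timedelta


def open_shared_db(path=":memory:"):
    """Open (or create) the hypothetical shared word store."""
    conn = sqlite3.connect(path)
    # Only token, weight and insert date are stored. There is deliberately
    # no column identifying the contributor, so rows stay anonymous.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS shared_words ("
        " token TEXT PRIMARY KEY,"
        " weight REAL NOT NULL,"
        " inserted TEXT NOT NULL)"
    )
    return conn


def harvest(conn, token, weight, today=None):
    """Insert token/weight only if the token is not already shared.

    Returns True if a new row was added, False if the token already existed.
    """
    today = today or date.today()
    cur = conn.execute(
        "INSERT OR IGNORE INTO shared_words (token, weight, inserted)"
        " VALUES (?, ?, ?)",
        (token, weight, today.isoformat()),
    )
    conn.commit()
    return cur.rowcount == 1


def expire(conn, max_age_days, today=None):
    """Freshness management: drop rows older than max_age_days.

    Returns the number of rows removed.
    """
    today = today or date.today()
    # ISO dates (YYYY-MM-DD) sort correctly as plain strings.
    cutoff = (today - timedelta(days=max_age_days)).isoformat()
    cur = conn.execute(
        "DELETE FROM shared_words WHERE inserted < ?", (cutoff,)
    )
    conn.commit()
    return cur.rowcount
```

a local client would call harvest(conn, token, weight) for each token it knows about, and something on the shared side would run expire(conn, max_age_days) now and then. whether a single 'weight' is the right thing to share, as opposed to raw counts, is left open here.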
b