[spambayes-dev] RE: [Spambayes] Intersection of two databases
Skip Montanaro
skip at pobox.com
Wed May 28 08:48:09 EDT 2003
>> Using that list, I then merged the corresponding entries from the two
>> source databases.
Tim> Skip, how did you do the merge? That is, word w in your database
Tim> had a certain hamcount and spamcount, while word w in Alex's had a
Tim> presumably different pair of counts. Did you add them? Take the
Tim> max? Something else?
I simply added them. I also added the 'saved state' values. This made
intuitive sense to me, though we all know intuition is often wrong. I was
effectively training using both databases, just eliminating the less useful
tokens. (Ignore for the moment that actually training on the complete set
of emails Alex and I have would probably have generated slightly different
results.)
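
For concreteness, here is a minimal sketch of the merge described above. It is not the script actually used. It assumes each database can be viewed as a plain dict mapping a token to a (spamcount, hamcount) pair, that the 'saved state' is a pair of message totals (nspam, nham), and that the list of tokens to keep is the intersection of the two databases. The names merge_counts, merge_state, skip_db and alex_db, and all the counts, are made up for illustration.

```python
def merge_counts(db_a, db_b, keep):
    """Sum the (spamcount, hamcount) pairs of every token in `keep`.

    db_a and db_b are plain dicts (an assumption, not the real storage
    format). A token missing from one database contributes (0, 0).
    """
    merged = {}
    for token in keep:
        spam_a, ham_a = db_a.get(token, (0, 0))
        spam_b, ham_b = db_b.get(token, (0, 0))
        merged[token] = (spam_a + spam_b, ham_a + ham_b)
    return merged


def merge_state(state_a, state_b):
    """Sum two hypothetical (nspam, nham) message totals."""
    return (state_a[0] + state_b[0], state_a[1] + state_b[1])


# Invented example data: token -> (spamcount, hamcount).
skip_db = {"viagra": (40, 0), "python": (1, 55), "meeting": (2, 9)}
alex_db = {"viagra": (25, 1), "python": (0, 30), "invoice": (3, 4)}

# Keep only tokens present in both databases (assumed selection rule).
keep = set(skip_db) & set(alex_db)

merged = merge_counts(skip_db, alex_db, keep)
state = merge_state((120, 300), (80, 150))
```

Summing the message totals along with the token counts keeps the two consistent with each other, which is what makes the result behave roughly like a database trained on both sets of mail with the dropped tokens removed.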
Skip