[spambayes-dev] RE: [Spambayes] Intersection of two databases

Skip Montanaro skip at pobox.com
Wed May 28 08:48:09 EDT 2003


    >> Using that list, I then merged the corresponding entries from the two
    >> source databases.

    Tim> Skip, how did you do the merge?  That is, word w in your database
    Tim> had a certain hamcount and spamcount, while word w in Alex's had a
    Tim> presumably different pair of counts.  Did you add them?  Take the
    Tim> max?  Something else?

I simply added them.  I also added the 'saved state' values.  This made
intuitive sense to me, though we all know intuition is often wrong.  I was
effectively training using both databases, just eliminating the less useful
tokens.  (Ignore for the moment that actually training on the complete set
of emails Alex and I have would probably have generated slightly different
results.)
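
A minimal sketch of the addition described above. It assumes each database
is a plain dict mapping a token to a (spamcount, hamcount) pair, and that
the 'saved state' is an (nspam, nham) pair of message totals. The function
names, the data layout, and the toy numbers are hypothetical, chosen for
illustration; this is not the SpamBayes API or the script actually used.

```python
def merge_counts(db_a, db_b, keep_tokens):
    """Add per-token (spamcount, hamcount) pairs for the tokens in keep_tokens.

    Tokens outside keep_tokens are dropped; a token missing from one
    database contributes (0, 0) from that side.
    """
    merged = {}
    for token in keep_tokens:
        spam_a, ham_a = db_a.get(token, (0, 0))
        spam_b, ham_b = db_b.get(token, (0, 0))
        merged[token] = (spam_a + spam_b, ham_a + ham_b)
    return merged


def merge_state(state_a, state_b):
    """Add the saved message totals, assumed here to be (nspam, nham)."""
    return (state_a[0] + state_b[0], state_a[1] + state_b[1])


# Made-up toy data, for illustration only.
skip_db = {"viagra": (40, 0), "python": (1, 55), "the": (90, 95)}
alex_db = {"viagra": (25, 1), "python": (0, 30), "hello": (3, 4)}
common = ["viagra", "python"]  # the list of tokens chosen to keep

merged = merge_counts(skip_db, alex_db, common)
state = merge_state((100, 120), (60, 80))
```

Adding the message totals along with the token counts keeps the per-token
ratios consistent with the combined corpus size, which is why the merge
behaves like training on both sets of mail restricted to the kept tokens.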

Skip


