[spambayes-dev] problems locating messages with bigrams

Skip Montanaro skip at pobox.com
Tue Jan 6 13:58:21 EST 2004


    >> for t in Classifier()._enhance_wordstream(tokenize(msg)):
    >>    ...
    >> 
    >> uses the current training database to decide which tokens should be
    >> generated,

    Tim> I think you're hallucinating here -- _enhance_wordstream() doesn't
    Tim> make any use of training data.

Hmmm...  I thought _enhance_wordstream() was the thing which "tiled" the
token space.  Why isn't this code in tokenize.py if it doesn't rely on
training data?
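
To make the distinction concrete, here is a minimal sketch of a bigram generator that needs nothing but the token stream. It is not the actual spambayes code: the function name enhance_wordstream, the "bi:" prefix, and the sample tokens are assumptions for illustration only. Any "tiling" step that chooses among overlapping unigrams and bigrams by their strength would presumably need the training data and so would have to happen later, at scoring time.

```python
def enhance_wordstream(wordstream):
    """Yield each token, plus a bigram joining it to the previous token.

    Hypothetical sketch: uses only the incoming stream, no training data.
    """
    last = None
    for token in wordstream:
        yield token
        if last is not None:
            # The "bi:" prefix is an assumed convention for marking bigrams.
            yield "bi:%s %s" % (last, token)
        last = token


# Made-up sample tokens, standing in for the output of a tokenizer.
tokens = ["cheap", "pills", "online"]
enhanced = list(enhance_wordstream(tokens))
```

Written this way, the generator has no dependency on a classifier instance, which is what prompts the question of whether it belongs with the tokenizer.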

Skip


