[spambayes-dev] problems locating messages with bigrams
Skip Montanaro
skip at pobox.com
Tue Jan 6 13:58:21 EST 2004
>> for t in Classifier()._enhance_wordstream(tokenize(msg)):
>>     ...
>>
>> uses the current training database to decide which tokens should be
>> generated,

Tim> I think you're hallucinating here -- _enhance_wordstream() doesn't
Tim> make any use of training data.

Hmmm... I thought _enhance_wordstream() was the thing which "tiled" the
token space. Why isn't this code in tokenize.py if it doesn't rely on
training data?

Skip
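
For reference, here is a minimal sketch of a bigram enhancer that needs no
training data, assuming it does nothing more than pair each token with its
predecessor. The function name, the "bi:" prefix format and the sample
tokens are illustrative assumptions, not copied from classifier.py. Any
"tiling" (choosing among overlapping unigrams and bigrams) would need
trained token probabilities, so it would presumably have to happen in a
later scoring step rather than here.

```python
def enhance_wordstream(wordstream):
    """Yield every token, plus a bigram joining it to the previous token.

    Hypothetical stand-in for a training-independent enhancer: it only
    looks at the token stream itself, never at a classifier database.
    """
    last = None
    for token in wordstream:
        # Always pass the original unigram through unchanged.
        yield token
        # From the second token on, also emit a synthetic bigram token.
        if last is not None:
            yield "bi:%s %s" % (last, token)
        last = token


# Sample tokens standing in for the output of a tokenizer (assumed).
tokens = ["cheap", "pills", "online"]
enhanced = list(enhance_wordstream(tokens))
print(enhanced)
```

Because the generator depends only on its input stream, the same message
always yields the same unigrams and bigrams regardless of what has been
trained.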