[Spambayes-checkins] spambayes/spambayes classifier.py,1.14,1.15
Tim Peters
tim_one at users.sourceforge.net
Tue Dec 16 00:19:10 EST 2003
Update of /cvsroot/spambayes/spambayes/spambayes
In directory sc8-pr-cvs1:/tmp/cvs-serv14339/spambayes
Modified Files:
classifier.py
Log Message:
_enhance_wordstream(): Simplify and speed; repaired docstring; now
delivers the last token in the input stream too. NOT TESTED, though.
Index: classifier.py
===================================================================
RCS file: /cvsroot/spambayes/spambayes/spambayes/classifier.py,v
retrieving revision 1.14
retrieving revision 1.15
diff -C2 -d -r1.14 -r1.15
*** classifier.py 16 Dec 2003 04:59:58 -0000 1.14
--- classifier.py 16 Dec 2003 05:19:08 -0000 1.15
***************
*** 427,434 ****
      def _enhance_wordstream(self, wordstream):
!         """Add bigrams to the wordstream. This wraps the last token
!         to the first one, so a small number of odd tokens might get
!         generated from that, but it shouldn't be significant. Note
!         that these are *token* bigrams, and not *word* bigrams - i.e.
          'synthetic' tokens get bigram'ed, too.
--- 427,435 ----
      def _enhance_wordstream(self, wordstream):
!         """Add bigrams to the wordstream.
!
!         For example, a b c -> a b "a b" c "b c"
!
!         Note that these are *token* bigrams, and not *word* bigrams - i.e.
          'synthetic' tokens get bigram'ed, too.
***************
*** 438,453 ****
          If the experimental "Classifier":"x-use_bigrams" option is
!         removed, this function can be removed, too."""
!         p = None
!         while True:
!             try:
!                 if p:
!                     yield p
!                 q = wordstream.next()
!                 if p:
!                     yield "%s %s" % (p, q)
!                 p = q
!             except StopIteration:
!                 break
      def _wordinfokeys(self):
--- 439,451 ----
          If the experimental "Classifier":"x-use_bigrams" option is
!         removed, this function can be removed, too.
!         """
!
!         last = None
!         for token in wordstream:
!             yield token
!             if last:
!                 yield "%s %s" % (last, token)
!             last = token
      def _wordinfokeys(self):
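
Since the log message flags the change as untested, here is a minimal standalone sketch of the r1.15 generator, assuming Python 3 and lifted out of the class as a plain function. The name enhance_wordstream (no leading underscore, no self) is hypothetical and is not part of classifier.py; the body otherwise mirrors the new hunk above.

```python
def enhance_wordstream(wordstream):
    """Yield each token, then a bigram joining it to the previous token.

    Hypothetical standalone version of the r1.15 method body.
    """
    last = None
    for token in wordstream:
        yield token
        # Same truthiness test as the r1.15 hunk: no bigram is emitted for
        # the first token, nor directly after an empty (falsy) token.
        if last:
            yield "%s %s" % (last, token)
        last = token


if __name__ == "__main__":
    # Reproduces the docstring example: a b c -> a b "a b" c "b c"
    print(list(enhance_wordstream(["a", "b", "c"])))
```

Run as a script, this prints ['a', 'b', 'a b', 'c', 'b c'], which matches the example in the revised docstring. An empty stream yields nothing, and a single-token stream yields just that token.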