[Spambayes-checkins] spambayes tokenizer.py,1.38,1.39
Tim Peters
tim_one@users.sourceforge.net
Thu, 26 Sep 2002 17:08:15 -0700
Update of /cvsroot/spambayes/spambayes
In directory usw-pr-cvs1:/tmp/cvs-serv13081
Modified Files:
tokenizer.py
Log Message:
stylesheet_re: removed the IGNORECASE flag. The text is already lower()ed,
and IGNORECASE makes the engine do extra work.
Index: tokenizer.py
===================================================================
RCS file: /cvsroot/spambayes/spambayes/tokenizer.py,v
retrieving revision 1.38
retrieving revision 1.39
diff -C2 -d -r1.38 -r1.39
*** tokenizer.py 26 Sep 2002 20:26:02 -0000 1.38
--- tokenizer.py 27 Sep 2002 00:08:13 -0000 1.39
***************
*** 584,589 ****
# An equally cheap-ass gimmick to strip style sheets
! stylesheet_re = re.compile(r"<style>.{0,2000}?</style>",
!                            re.IGNORECASE|re.DOTALL)
received_host_re = re.compile(r'from (\S+)\s')
--- 584,588 ----
# An equally cheap-ass gimmick to strip style sheets
! stylesheet_re = re.compile(r"<style>.{0,2000}?</style>", re.DOTALL)
received_host_re = re.compile(r'from (\S+)\s')
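
The reasoning in the log message can be checked with a small sketch: once the input has been lower()ed, the pattern with and without IGNORECASE should strip the same spans. Only the two regular expressions below come from the diff. The names OLD_RE, NEW_RE and strip_stylesheets, the sample text, and the choice to replace each match with a single space are assumptions made for illustration, not the tokenizer's actual code.

```python
import re

# The two patterns from the diff; the variable names are hypothetical.
OLD_RE = re.compile(r"<style>.{0,2000}?</style>", re.IGNORECASE | re.DOTALL)
NEW_RE = re.compile(r"<style>.{0,2000}?</style>", re.DOTALL)


def strip_stylesheets(text, pattern):
    """Replace each matched style sheet with one space (assumed behavior)."""
    return pattern.sub(" ", text)


# Made-up sample with mixed-case tags and a newline inside the style sheet.
sample = "Hello <STYLE>\nP { color: red }\n</Style> world"

# The log message says the text is already lower()ed before matching.
lowered = sample.lower()

old_result = strip_stylesheets(lowered, OLD_RE)
new_result = strip_stylesheets(lowered, NEW_RE)

# On lowercased input both patterns strip the same span.
print(old_result == new_result)

# Without lower(), the case-sensitive pattern would miss the mixed-case tags.
print(NEW_RE.search(sample) is None)
```

The second check shows the dependency this change relies on: dropping IGNORECASE is only safe as long as every caller lowercases the text before applying stylesheet_re.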