[Spambayes-checkins] spambayes/spambayes Options.py, 1.140, 1.141 tokenizer.py, 1.47, 1.48
Mark Hammond
mhammond at users.sourceforge.net
Mon Mar 26 09:57:16 CEST 2007
Update of /cvsroot/spambayes/spambayes/spambayes
In directory sc8-pr-cvs8.sourceforge.net:/tmp/cvs-serv2236/spambayes
Modified Files:
Options.py tokenizer.py
Log Message:
x-crack_images, x-ocr_engine and x-image_size all get upgraded to
non-experimental options - congratulations, options!
Index: Options.py
===================================================================
RCS file: /cvsroot/spambayes/spambayes/spambayes/Options.py,v
retrieving revision 1.140
retrieving revision 1.141
diff -C2 -d -r1.140 -r1.141
*** Options.py 12 Feb 2007 11:24:59 -0000 1.140
--- Options.py 26 Mar 2007 07:57:13 -0000 1.141
***************
*** 120,137 ****
PATH, RESTORE),
! ("x-image_size", _("Generate image size tokens"), False,
! _("""(EXPERIMENTAL) If true, generate tokens based on the sizes of
embedded images."""),
BOOLEAN, RESTORE),
! ("x-crack_images", _("Look inside images for text"), False,
! _("""(EXPERIMENTAL) If true, generate tokens based on the
(hopefully) text content contained in any images in each message.
The current support is minimal, relies on the installation of
! an OCR 'engine' (see x-ocr_engine.)"""),
BOOLEAN, RESTORE),
! ("x-ocr_engine", _("OCR engine to use"), "",
! _("""(EXPERIMENTAL) The name of the OCR engine to use. If empty, all
supported engines will be checked to see if they are installed.
Engines currently supported include ocrad
--- 120,137 ----
PATH, RESTORE),
! ("image_size", _("Generate image size tokens"), False,
! _("""If true, generate tokens based on the sizes of
embedded images."""),
BOOLEAN, RESTORE),
! ("crack_images", _("Look inside images for text"), False,
! _("""If true, generate tokens based on the
(hopefully) text content contained in any images in each message.
The current support is minimal, relies on the installation of
! an OCR 'engine' (see ocr_engine.)"""),
BOOLEAN, RESTORE),
! ("ocr_engine", _("OCR engine to use"), "",
! _("""The name of the OCR engine to use. If empty, all
supported engines will be checked to see if they are installed.
Engines currently supported include ocrad
Index: tokenizer.py
===================================================================
RCS file: /cvsroot/spambayes/spambayes/spambayes/tokenizer.py,v
retrieving revision 1.47
retrieving revision 1.48
diff -C2 -d -r1.47 -r1.48
*** tokenizer.py 12 Feb 2007 11:25:00 -0000 1.47
--- tokenizer.py 26 Mar 2007 07:57:13 -0000 1.48
***************
*** 1616,1620 ****
parts = imageparts(msg)
! if options["Tokenizer", "x-image_size"]:
# Find image/* parts of the body, calculating the log(size) of
# each image.
--- 1616,1620 ----
parts = imageparts(msg)
! if options["Tokenizer", "image_size"]:
# Find image/* parts of the body, calculating the log(size) of
# each image.
***************
*** 1635,1640 ****
yield "image-size:2**%d" % round(log2(total_len))
! if options["Tokenizer", "x-crack_images"]:
! engine_name = options["Tokenizer", 'x-ocr_engine']
from spambayes.ImageStripper import crack_images
text, tokens = crack_images(engine_name, parts)
--- 1635,1640 ----
yield "image-size:2**%d" % round(log2(total_len))
! if options["Tokenizer", "crack_images"]:
! engine_name = options["Tokenizer", 'ocr_engine']
from spambayes.ImageStripper import crack_images
text, tokens = crack_images(engine_name, parts)
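
Because this change renames three options in the Tokenizer section, an existing configuration file that still uses the old x- names may need updating. The following is only a minimal sketch of such a rename, not part of this commit or of SpamBayes itself: the function name migrate_tokenizer_options and the RENAMED table are hypothetical, and it assumes an INI-style file with a [Tokenizer] section and a modern Python with the standard configparser module.

```python
import configparser
import io

# Old experimental names mapped to the promoted names from this commit.
RENAMED = {
    "x-image_size": "image_size",
    "x-crack_images": "crack_images",
    "x-ocr_engine": "ocr_engine",
}


def migrate_tokenizer_options(ini_text):
    """Return ini_text with old x- option names in [Tokenizer] renamed.

    Hypothetical helper for illustration; not a SpamBayes API.
    """
    parser = configparser.ConfigParser(interpolation=None)
    parser.optionxform = str  # keep option names exactly as written
    parser.read_string(ini_text)

    if parser.has_section("Tokenizer"):
        for old, new in RENAMED.items():
            if parser.has_option("Tokenizer", old):
                value = parser.get("Tokenizer", old)
                parser.remove_option("Tokenizer", old)
                # An explicit new-style setting wins over the old one.
                if not parser.has_option("Tokenizer", new):
                    parser.set("Tokenizer", new, value)

    out = io.StringIO()
    parser.write(out)
    return out.getvalue()


if __name__ == "__main__":
    old_config = "[Tokenizer]\nx-crack_images: True\nx-ocr_engine: ocrad\n"
    print(migrate_tokenizer_options(old_config))
```

The sketch drops the old name once it has been copied, and leaves any value already set under the new name untouched, so running it twice has no further effect.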