[Spambayes] Sending larger files

skip at pobox.com
Mon Apr 30 03:11:20 CEST 2007


    Jane> SpamBayes (version 1.0.4 now; 1.0.3 earlier) seems to work OK with
    Jane> my Outlook Express (ver.6) on my Windows XP Dell machine (1 GB
    Jane> memory). Small files (less than 1 MB) seem to transmit OK as well.
    Jane> Anytime I try to send a larger file(s) (especially greater than 1
    Jane> MB) the program bogs down and "processes" the file(s) either very
    Jane> slowly or not at all. In the meantime, "sb_tray.exe" is consuming
    Jane> 98-99% of the CPU.  File(s) greater than 2 MB essentially don't
    Jane> transmit -- i.e., the program just locks up.  Any suggestions?

If I read your note correctly, you're seeing a spike in the time it takes
to *send* a message once the message size reaches about 1MB.  SpamBayes
doesn't filter messages you send, only those you receive.  Nevertheless, in
case I misunderstood what you wrote...

The amount of work SpamBayes has to do when scoring an incoming message is
proportional to the size of the message.  In theory it should slow down
more-or-less uniformly, as this simple table suggests.  To create it, I
concatenated successively more copies of the same message (105,860 bytes) and
fed the resulting mongo message to the SpamBayes command line tool,
sb_bnfilter.py.  The first run takes much longer because sb_bnfilter has to
start up the background server first.  After that it just feeds the incoming
message to the already running server.  As you can see, the time it takes to
process the message increases more-or-less linearly with the overall message
size.  (Also, see the attached plot of time as a function of the number of
message copies.)

    # copies     msg size (bytes)  time
    1            105860            2.88
    2            211720            0.84
    3            317580            0.80
    4            423440            0.87
    5            529300            1.04
    6            635160            1.11
    7            741020            1.07
    8            846880            1.23
    9            952740            1.19
    10          1058600            1.37
    11          1164460            1.43
    12          1270320            1.50
    13          1376180            1.45
    14          1482040            1.63
    15          1587900            1.74
    16          1693760            1.75
    17          1799620            1.81
    18          1905480            1.88
    19          2011340            1.95
    20          2117200            2.01
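
To put a rough number on "more-or-less linearly", here is a minimal sketch
that fits a least-squares line through the timings above.  It leaves out the
first run, since that one includes starting the background server.  The
variable names are mine, and I'm assuming the times are in seconds, which the
table doesn't actually say.

```python
# Least-squares line through the timings in the table above.
# The first run (1 copy, 2.88) is left out because it includes the
# one-time cost of starting the background server.
MESSAGE_BYTES = 105860  # size of one copy of the message

copies = list(range(2, 21))
times = [0.84, 0.80, 0.87, 1.04, 1.11, 1.07, 1.23, 1.19, 1.37, 1.43,
         1.50, 1.45, 1.63, 1.74, 1.75, 1.81, 1.88, 1.95, 2.01]

n = len(copies)
mean_x = sum(copies) / n
mean_y = sum(times) / n

# Sums of squares and cross products about the means.
sxx = sum((x - mean_x) ** 2 for x in copies)
syy = sum((y - mean_y) ** 2 for y in times)
sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(copies, times))

slope = sxy / sxx                    # extra time per additional copy
intercept = mean_y - slope * mean_x  # fixed cost per run
r_squared = (sxy * sxy) / (sxx * syy)

# Same slope expressed per megabyte (10**6 bytes) of message.
slope_per_mb = slope / MESSAGE_BYTES * 1e6

print("time per copy: %.4f" % slope)
print("time per MB:   %.3f" % slope_per_mb)
print("fixed cost:    %.3f" % intercept)
print("r squared:     %.3f" % r_squared)
```

An r squared close to 1 means a straight line describes the timings well.
The slope is the thing to compare against what you see on your own machine.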

Skip

-------------- next part --------------
A non-text attachment was scrubbed...
Name: multispam.png
Type: image/png
Size: 3465 bytes
Desc: not available
Url : http://mail.python.org/pipermail/spambayes/attachments/20070429/9cec0350/attachment.png 

