email module windows and suse

Lev Elbert elbertlev at hotmail.com
Sun Apr 13 20:21:29 EDT 2008


On Apr 13, 3:55 pm, Tim Roberts <t... at probo.com> wrote:
> Lev Elbert <elbert... at hotmail.com> wrote:
>
> >I have to make a custom email module, based on the standard one. The
> >custom module has to be able to work with extremely large mails (1GB
> >+), having memory "footprint" much smaller.
>
> Then you have a design problem right from the start.  It is extremely rare
> to find a mail server today that will transmit email messages larger than a
> few dozen megabytes.  Even on a 100 megabit network, it takes a minute
> and a half for a 1GB message to go from the server to the user's
> workstation.
>
> What are you really trying to do here?  In most cases, you would be better
> off storing your attachments on a web server and transmitting links in the
> email.
>
> >The modified program has to work in SUSE environment, while the
> >development is done under Windows.  I'm not too good with linux and do
> >not know if speedup in Windows translates one-to-one into speedup in
> >SUSE. For example, if the bottleneck is IO, in windows I can spawn a
> >separate thread or 2 to do "read-ahead".
>
> We would need more information on your processing to advise you on this.
> Disk I/O is slow, network I/O is slower.  You can't go any faster than your
> slowest link.
>
> >Are threads available and as effective in SUSE as they are in Windows?
>
> Threads are available in Linux.  There is considerable debate over the
> relative performance improvement.
> --
> Tim Roberts, t... at probo.com
> Providenza & Boekelheide, Inc.

Thank you.

I have a 100 MB mail file. I just ran a very simple experiment.

The message_from_file function boils down to a loop:
1        while True:
2            data = fp.read(block_size)
3            if not data:
4                break
5            feedparser.feed(data)
6
Total time for lines 1-6 is 21 seconds, of which the non-I/O processing
(lines 3-5) takes 20 seconds. This means that no I/O optimization would
help. It also explains why changing block_size from 8K to 1M has almost
no impact on processing time. Multithreading wouldn't help either.
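
For anyone who wants to repeat the measurement, here is a minimal sketch of how the read time and the parse time could be separated. It is not the code I actually ran: the function name time_parse is made up for illustration, and it assumes Python 3 with the standard BytesFeedParser.

```python
import io
import time
from email.feedparser import BytesFeedParser


def time_parse(fp, block_size=8192):
    """Parse a binary file object, timing read() and feed() separately.

    Returns (message, read_seconds, feed_seconds).
    time_parse is a hypothetical helper, not part of the email package.
    """
    parser = BytesFeedParser()
    read_s = 0.0
    feed_s = 0.0
    while True:
        t0 = time.perf_counter()
        data = fp.read(block_size)
        t1 = time.perf_counter()
        read_s += t1 - t0          # time spent in I/O
        if not data:
            break
        parser.feed(data)
        feed_s += time.perf_counter() - t1   # time spent in the parser
    return parser.close(), read_s, feed_s
```

Calling it on a file opened with open(path, "rb") and comparing the two timings for several block_size values should show whether the parser or the disk dominates on a given machine.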

I believe I have to change the Message class (more exactly: derive
another class, which would store pieces on disk).
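
A rough sketch of what such a derived class might look like, assuming Python 3.8 or later. The name DiskBackedMessage is hypothetical, and the code relies on the private _payload attribute of Message, so treat it as an illustration of the idea rather than a finished design. It only covers plain string payloads read back through get_payload; code that reaches into the message internals directly is not handled.

```python
import email
import tempfile
from email.message import Message


class DiskBackedMessage(Message):
    """Hypothetical Message subclass that keeps text payloads in a temp file."""

    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self._spool = None  # temp file holding the payload, if any

    def set_payload(self, payload, charset=None):
        if isinstance(payload, str) and charset is None:
            # newline="" disables newline translation; surrogatepass lets
            # any surrogate code points in the string round-trip unchanged.
            spool = tempfile.TemporaryFile(
                mode="w+", encoding="utf-8",
                errors="surrogatepass", newline="")
            spool.write(payload)
            self._spool = spool
            self._payload = None  # assumption: private attribute of Message
        else:
            self._spool = None
            super().set_payload(payload, charset)

    def get_payload(self, i=None, decode=False):
        if self._spool is None:
            return super().get_payload(i, decode)
        # Load the payload back only for the duration of this call.
        self._spool.seek(0)
        self._payload = self._spool.read()
        try:
            return super().get_payload(i, decode)
        finally:
            self._payload = None
```

It can be plugged in with email.message_from_string(raw, _class=DiskBackedMessage). As far as I can tell, the standard parser still assembles each part's body in memory before calling set_payload, so this would lower memory use after parsing but probably not the peak during parsing. Reducing the peak would likely require changes in the parser as well.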


