How much memory used by a name

placid Bulkan at gmail.com
Wed Feb 14 23:51:04 EST 2007


On Feb 15, 11:08 am, Bruno Desthuilliers
<bdesth.quelquech... at free.quelquepart.fr> wrote:
> Bernard Lebel wrote:
>
> > Diez: thanks, I will try that. However isn't sum() returning an
> > integer that here would represent the number of elements?
>
> Nope, it will return the sum of the lengths of the lines in the list. The
> long way to write it is:
>
> total = 0
> for line in thelist:
>    total += len(line)
>
>
>
> > Bruno: good question. We're talking about text files that can have
> > 300,000 lines, if not more. Currently, the way I have coded the file
> > writing, every line calls for a write() to the file object,
>
> Seems sensible so far...
>
> >  The file is on the network.
>
> Mmm... Let's guess : it's taking too much time ?-)
>
> > This is taking a long time,
>
> (You know what ? I cheated)
>
> > and I'm looking for ways to speed up this
> > process. I thought that keeping the list in memory and dropping to the
> > file at the very end could be a possible approach.
>
> OTOH, if the list grows too big, you may end up swapping (ok, it would
> need a very huge list). A "mixed" solution may be to wrap the file in a
> "buffered" writer that only performs a real write when it's full. This
> would avoid actual I/O on each line while keeping memory usage
> reasonable. Another one would be async I/O, but I don't know if and how
> it could be done in Python (never had to manage such a problem myself).
>
> My 2 cents...
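
To illustrate the "buffered" writer idea quoted above, here is a
minimal sketch. It assumes that the buffering argument of the built-in
open() is enough for this case; the function name write_buffered and
the 1 MB default are only made up for the example, and whether it
helps on a network share would need measuring.

```python
def write_buffered(path, lines, buffer_size=1024 * 1024):
    """Write every string in `lines` to `path` through a large buffer.

    `write_buffered` and `buffer_size` are hypothetical names for this
    sketch. The file object collects the data in memory and only hands
    it to the operating system when the buffer fills up or the file is
    closed, so write() is still called once per line but most calls do
    not trigger real I/O.
    """
    with open(path, "w", buffering=buffer_size) as f:
        for line in lines:
            f.write(line)
```

Called as write_buffered("out.txt", thelist), this keeps the
one-write()-per-line code but lets the file object batch the real
writes.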

What I can suggest is to use threads (for async I/O). One approach
would be to split the script into two parts: your main thread generates
the strings, and a second thread, which has a Queue object, blocks on
the Queue's get method. Once it gets a string, it writes it to the
file. Your main thread keeps generating strings and adding them to the
second thread's Queue. How much of a speed advantage this will provide
I do not know.
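
A rough sketch of that two-thread layout, written for current Python,
where the module is called queue (it was Queue in Python 2). The names
writer, write_lines_in_background, _SENTINEL and the maxsize default
are made up for this example, and error handling is left out, so treat
it as a starting point rather than a finished solution.

```python
import queue
import threading

_SENTINEL = None  # hypothetical marker that tells the writer thread to stop


def writer(path, q):
    """Second thread: block on q.get() and write each string to the file."""
    with open(path, "w") as f:
        while True:
            item = q.get()  # blocks until the main thread puts something
            if item is _SENTINEL:
                break
            f.write(item)


def write_lines_in_background(path, lines, maxsize=10000):
    """Main thread: generate strings and hand them to the writer thread."""
    # A bounded queue (an assumption of this sketch) keeps memory use
    # limited: put() blocks when the writer falls too far behind.
    q = queue.Queue(maxsize=maxsize)
    t = threading.Thread(target=writer, args=(path, q))
    t.start()
    for line in lines:
        q.put(line)
    q.put(_SENTINEL)  # no more data
    t.join()          # wait until everything has been written
```

The main thread only pays for q.put(), so slow writes to the network
file overlap with generating the next strings. If the writer thread
died (for example because the file could not be opened), put() on a
full queue would block forever, so real code would need to handle that.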

Email me if you require assistance. I would be more than happy to
help.

Cheers



