[Tutor] If you don't close file when writing, do bytes stay in memory?
Dave Angel
davea at ieee.org
Sat Oct 10 13:02:08 CEST 2009
xbmuncher wrote:
> Which piece of code will conserve more memory?
> I think that code #2 will because I close the file more often, thus freeing
> more memory by closing it.
> Am I right in this thinking... or does it not save me any more bytes in
> memory by closing the file often?
> Sure I realize that in my example it doesn't save much if it does... but I'm
> dealing with writing large files.. so every byte freed in memory counts.
> Thanks.
>
> CODE #1:
> def getData(): return '12345' #5 bytes
> f = open('file.ext', 'wb')
> for i in range(2000):
>     f.write(getData())
>
> f.close()
>
>
> CODE #2:
> def getData(): return '12345' #5 bytes
> f = open('file.ext', 'wb')
> for i in range(2000):
>     f.write(getData())
>     if i == 5:
>         f.close()
>         f = open('file.ext', 'ab')
>         i = 1
>     i = i + 1
>
> f.close()
>
>
You don't save a noticeable amount of memory usage by closing and
immediately reopening the file. The amount that the system buffers
probably wouldn't depend on file size, in any case. When dealing with
large files, the thing to watch is how much of the data you've got in
your own lists and dictionaries, not how much the file subsystem and OS
are using.
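
As a rough illustration of that point, here is a minimal sketch in Python 3 syntax (the thread itself is about 2.x, so treat the syntax as an assumption). The names get_data and write_streaming are hypothetical. The idea is only that each chunk is written as soon as it is produced, so your own code never holds more than one chunk at a time, whatever the final file size.

```python
import os
import tempfile

def get_data():
    # Hypothetical stand-in for the real data source: 5 bytes per call.
    return b'12345'

def write_streaming(path, count):
    # Write each chunk as it is produced; nothing accumulates in a list.
    with open(path, 'wb') as f:
        for _ in range(count):
            f.write(get_data())

# Use a temporary directory so the sketch does not touch real files.
path = os.path.join(tempfile.mkdtemp(), 'file.ext')
write_streaming(path, 2000)
size = os.path.getsize(path)
print(size)  # 2000 chunks of 5 bytes each
```

The with statement closes the file once, at the end, which is all that is needed here.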
But you have other issues in your code.
1) you don't say what version of Python you're using. So I'll assume
it's version 2.x. If so, then range is unnecessarily using a lot of
memory. It builds a list of ints, when an iterator would do just as
well. Use xrange(). (In Python 3.x, xrange() was renamed to range().)
This may not matter for small values, but as the number gets bigger, so
does the amount of wasted memory.
2) By using the same variable for the for loop as for your "should I
close" counter, you're defeating the logic. The for statement reassigns
i at the start of every iteration, so your own changes to i are
discarded. As it stands, it'll only do the close() once. Either rename
one of these, or do the simpler test:
    if i%5 == 0:
        f.close()
        f = open....
3) Close and re-open has three other effects. One, it's slow. Two,
append-mode isn't guaranteed by the C standard to always position at the
end (!). And three, it flushes the data. That can be a very useful
result, in case the computer crashes while spending a long time updating
a file.
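
To make point 1 concrete, here is a small sketch in Python 3 syntax (an assumption, since the code in question looks like 2.x). In Python 3, range() already behaves like the old xrange(), so the list that Python 2's range() would build is imitated with list(range(n)). The variable names are only illustrative.

```python
import sys

n = 2000
lazy = range(n)          # Python 3: small object that yields ints on demand
eager = list(range(n))   # roughly what Python 2's range() builds: a full list

# getsizeof() counts only the container itself, not the int objects
# it refers to, so the real gap is larger than these numbers show.
lazy_size = sys.getsizeof(lazy)
eager_size = sys.getsizeof(eager)
print(lazy_size, eager_size)
```

The lazy object stays the same size as n grows, while the list grows with it.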
I'd suggest sometimes doing a flush() call on the file, if you know
you'll be spending a long time updating it. But I wouldn't bother
closing it.
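
A minimal sketch of that suggestion, again in Python 3 syntax and with hypothetical names (get_data, FLUSH_EVERY). It keeps the file open for the whole loop and uses a modulus test on the loop variable to flush now and then. The interval of 500 is arbitrary and is an assumption, not a recommendation.

```python
import os
import tempfile

def get_data():
    # Hypothetical stand-in for the real data source: 5 bytes per call.
    return b'12345'

FLUSH_EVERY = 500  # arbitrary interval, chosen only for illustration

path = os.path.join(tempfile.mkdtemp(), 'file.ext')
flushes = 0
with open(path, 'wb') as f:
    for i in range(2000):
        f.write(get_data())
        # The loop variable is only read here, never reassigned.
        if (i + 1) % FLUSH_EVERY == 0:
            f.flush()      # hand Python's buffered data to the OS
            flushes += 1

size = os.path.getsize(path)
print(flushes, size)
```

Note that flush() only passes the data from Python's buffer to the operating system. If the data must actually reach the disk, os.fsync(f.fileno()) after the flush is the usual extra step, at a further cost in speed.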
DaveA