[Tutor] If you don't close file when writing, do bytes stay in memory?

Xbox Muncher xboxmuncher at gmail.com
Sat Oct 10 16:34:55 CEST 2009


Oh yeah, it's Python 2.6.

On Sat, Oct 10, 2009 at 10:32 AM, Xbox Muncher <xboxmuncher at gmail.com> wrote:

> What does flush do technically?
> "Flush the internal buffer, like stdio's fflush(). This may be a no-op on
> some file-like objects."
>
> I thought closing the file after writing about 500MB of data to it was
> smart because I assumed Python stores that data in memory, or keeps some
> record of it, and only frees that memory when I close the file.
> When I write to a file in 'wb' mode 500 bytes at a time, I see the file
> size change as I add more data. It may not grow in exact 500-byte steps
> as my code logic would suggest, but it keeps getting bigger with each
> iteration.
>
> Seeing this, I know that the data is definitely being written pretty
> immediately to the file and not being held in memory for very long. Or is
> it...? Does it still keep it in this "internal buffer" if I don't close the
> file? If it does, then flush() is exactly what I need to free the internal
> buffer, which is what I was trying to do when I closed the file anyway.
>
> However, from your replies I take it that Python doesn't store this data in
> an internal buffer and DOES immediately dispose of the data into the file
> itself (of course it still exists in the variables I put it in). So closing
> the file doesn't free up any more memory.
>
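For what it's worth, the "internal buffer" can be observed directly. This is only a minimal sketch, assuming CPython, a scratch file from tempfile, and an explicit 4096-byte buffer passed as the third argument to open(); none of those details come from the original code. It writes 5 bytes, then compares the on-disk size before and after flush().

```python
import os
import tempfile

# Hypothetical scratch file, used only for this demonstration.
fd, path = tempfile.mkstemp()
os.close(fd)

# Third positional argument is the buffer size (an assumption for the demo).
f = open(path, 'wb', 4096)
f.write(b'12345')  # 5 bytes, far smaller than the buffer

# The bytes are still sitting in the file object's buffer at this point.
size_before_flush = os.path.getsize(path)

f.flush()  # hand the buffered bytes to the operating system

size_after_flush = os.path.getsize(path)

f.close()
os.remove(path)

print(size_before_flush)
print(size_after_flush)
```

The buffer has a fixed size, so it does not grow with the file. That is why the file on disk gets bigger in uneven steps during a long write. Note that flush() only passes the data to the OS; os.fsync(f.fileno()) is the call that asks the OS to commit it to disk.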
>
> On Sat, Oct 10, 2009 at 7:02 AM, Dave Angel <davea at ieee.org> wrote:
>
>> xbmuncher wrote:
>>
>>> Which piece of code will conserve more memory?
>>> I think that code #2 will, because I close the file more often, thus
>>> freeing more memory by closing it.
>>> Am I right in this thinking, or does it not save me any more bytes in
>>> memory by closing the file often?
>>> Sure, I realize that in my example it doesn't save much if it does, but
>>> I'm dealing with writing large files, so every byte freed in memory counts.
>>> Thanks.
>>>
>>> CODE #1:
>>> def getData(): return '12345' #5 bytes
>>> f = open('file.ext', 'wb')
>>> for i in range(2000):
>>>    f.write(getData())
>>>
>>> f.close()
>>>
>>>
>>> CODE #2:
>>> def getData(): return '12345' #5 bytes
>>> f = open('file.ext', 'wb')
>>> for i in range(2000):
>>>    f.write(getData())
>>>    if i == 5:
>>>        f.close()
>>>        f = open('file.ext', 'ab')
>>>        i = 1
>>>    i = i + 1
>>>
>>> f.close()
>>>
>>>
>>>
>> You don't save a noticeable amount of memory usage by closing and
>> immediately reopening the file.  The amount that the system buffers probably
>> wouldn't depend on file size, in any case.  When dealing with large files,
>> the thing to watch is how much of the data you've got in your own lists and
>> dictionaries, not how much the file subsystem and OS are using.
>>
>> But you have other issues in your code.
>>
>> 1) You don't say what version of Python you're using, so I'll assume it's
>> version 2.x.  If so, then range is unnecessarily using a lot of memory.  It
>> builds a list of ints, when an iterator would do just as well.  Use
>> xrange().  (In Python 3.x, xrange() was renamed to range().)
>> This may not matter for small values, but as the number gets bigger, so
>> would the amount of wastage.
>>
>> 2) By using the same local for the for loop as for your "should I close"
>> counter, you're defeating the logic.  As it stands, it'll only do the
>> close() once.  Either rename one of these, or do the simpler test, of
>>     if i%5 == 0:
>>          f.close()
>>          f = open....
>>
>> 3) Close and re-open has three other effects.  One, it's slow.  Two,
>> append-mode isn't guaranteed by the C standard to always position at the end
>> (!).  And three, it flushes the data.  That can be a very useful result, in
>> case the computer crashes while spending a long time updating a file.
>>
>> I'd suggest sometimes doing a flush() call on the file, if you know you'll
>> be spending a long time updating it.  But I wouldn't bother closing it.
>>
>> DaveA
>>
>>
>>
>
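Putting the suggestions above together, here is a rough sketch of what the loop might look like with an occasional flush() instead of close/re-open. The function name, the flush_every parameter, and the scratch file are all made up for illustration. It is written so it also runs on Python 3; on 2.x, xrange() would be the better choice for large counts, as noted above.

```python
import os
import tempfile


def get_data():
    return b'12345'  # 5 bytes, same payload as the original example


def write_with_periodic_flush(path, iterations=2000, flush_every=500):
    """Write get_data() repeatedly, flushing every flush_every writes."""
    with open(path, 'wb') as f:
        for i in range(iterations):
            f.write(get_data())
            # i stays the loop variable only; the modulo test decides
            # when to flush, so no second counter is needed.
            if (i + 1) % flush_every == 0:
                f.flush()
    # The with block has closed the file (and flushed the remainder).
    return os.path.getsize(path)


# Hypothetical scratch file, used only for this demonstration.
fd, demo_path = tempfile.mkstemp()
os.close(fd)
total_bytes = write_with_periodic_flush(demo_path)
os.remove(demo_path)
print(total_bytes)
```

The flush interval here is arbitrary. A larger value means fewer system calls, a smaller one means less unwritten data at risk if the machine crashes mid-update.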