Find an Item in a Sorted List

Christopher A. Craig com-nospam at ccraig.org
Wed Feb 27 12:43:22 EST 2002


hankc at nospam.com (HankC) writes:

> Thanks all, for the help.  I have one more question regarding resource
> management.  My altered version of the above code:
> 
> dict = {}
> datafile = open(..., "r")
> datalines = datafile.readlines()
> for line in datalines:
>    name = line.split(",")[0]
>    dict[name] = line[:-1] # strip trailing \n
> ###
> datafile.close()
> del(datalines)
> 
> At the ### mark, it looks like I have two versions of the data
> resource - one list and one dictionary.  My questions:

If you have a sufficiently recent Python (2.1 or later) you can use

datafile = open(..., "r")
for line in datafile.xreadlines():
  .
  .
  .

This does the same thing, but without reading the whole file into
memory at once.  xreadlines() returns a kind of iterator: an object
where each successive access by the for loop returns the next line.
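
A minimal sketch of the lazy approach, written for present-day Python,
where iterating over the file object itself yields one line at a time
(xreadlines() was later deprecated in favor of this).  The helper name
build_index and the sample data are made up for illustration, and
io.StringIO stands in for the real file:

```python
import io

def build_index(lines):
    """Map the first comma-separated field of each line to the whole line."""
    index = {}
    for line in lines:              # pulls one line at a time from the iterable
        line = line.rstrip("\n")    # strip the trailing newline, if any
        name = line.split(",")[0]
        index[name] = line
    return index

# StringIO stands in for open(...); both yield lines lazily when iterated.
datafile = io.StringIO("alice,1\nbob,2\n")
try:
    index = build_index(datafile)
finally:
    datafile.close()
```

Only the dictionary and the current line are alive at any point, so
there is never a second full copy of the data.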

> - does the del(datalines) call free the memory (not necessarily
> immediately) for the list resource?

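Yes, immediately, as long as nothing else still refers to the list.
del removes the name datalines, and the list is freed as soon as its
last reference goes away.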

> - does the original version of the function, by combining readlines()
> with the for statement, negate the need for freeing the list?  In
> other words, will the datafile.close() statement in itself do the
> trick?

Yes and no.

The list returned by readlines() is held in an unnamed temporary,
which is freed when the loop is done with it (I believe that is at the
end of the for loop, but I'm not sure), so there is nothing for you to
del.  The datafile.close() call, however, has no effect on that
temporary; it only closes the file.
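
One way to check this yourself is sketched below.  It assumes CPython
and a modern Python (weakref.finalize), and the names TrackedList and
read_all are invented for the experiment; a plain list cannot carry a
finalizer, hence the subclass:

```python
import weakref

class TrackedList(list):
    """list subclass, only so that a finalizer can be attached."""

log = []

def read_all():
    # Stands in for datafile.readlines(): returns a fresh list of lines.
    lines = TrackedList(["a\n", "b\n"])
    # Record the moment the list object is actually destroyed.
    weakref.finalize(lines, log.append, "freed")
    return lines

for line in read_all():             # the list has no name of its own here
    log.append("loop " + line.strip())

log.append("after loop")
```

Under CPython the "freed" entry shows up in log before "after loop",
without any del and without closing anything.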

> I appreciate the patience!  Things like garbage collection and for
> loops iterating through lists take a little getting used to.

Python only uses garbage collection for objects caught in reference
cycles; most things are reclaimed by reference counting.  When a name
is deleted or goes out of scope, the reference count of the object it
referred to is reduced, and when the count reaches zero the object is
freed immediately.
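
A small sketch of the counting, assuming CPython (sys.getrefcount is
implementation-specific, and the absolute numbers it reports include
the temporary reference made by the call itself, so only the
differences matter).  The class Record is invented so that a weak
reference can be used as a probe:

```python
import sys
import weakref

class Record(list):
    """list subclass, only so that a weak reference can be taken."""

data = Record(["x", "y"])
probe = weakref.ref(data)          # a weak reference does not add to the count

before = sys.getrefcount(data)
alias = data                       # a second name for the same object
after_alias = sys.getrefcount(data)

del alias                          # count drops by one, object survives
still_alive = probe() is not None

del data                           # last reference gone: freed at once
freed = probe() is None
```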

The problem with this, and the reason that Python does garbage
collection, is that something like

a = []
a.append(a)
del(a)

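has increased the reference count on the list twice (once for the
name 'a', once for the list's reference to itself) but only decreased
it once, so you now have an unreachable object that reference counting
alone will never free.  Garbage collection will eventually catch it
(if it is turned on).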
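
The same cycle can be watched with the gc module.  This is a sketch
assuming CPython; the class Node is made up so that a weak reference
can observe the list, and automatic collection is switched off for
the duration so the timing is predictable:

```python
import gc
import weakref

class Node(list):
    """list subclass so we can watch it with a weak reference."""

gc.disable()              # keep the collector from running on its own here
a = Node()
a.append(a)               # the list now holds a reference to itself
probe = weakref.ref(a)

del a                     # the name is gone, but the self-reference remains
alive_after_del = probe() is not None

gc.collect()              # ask the cycle collector to run now
alive_after_collect = probe() is not None
gc.enable()
```

After the del the object is still alive even though nothing can reach
it; only the explicit gc.collect() call reclaims it.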

-- 
Christopher A. Craig <com-nospam at ccraig.org>
"Microprocessors cost $700 -- far too much for a tiny slice of refined,
 impurity-laced beach sand." Steve Gibson



