Tree structure consuming lot of memory

mayank gupta mooniitk at gmail.com
Mon Jul 6 16:30:28 EDT 2009


I wrote a small script that creates about 1,000,000 nodes with some
attributes and watched the memory usage on my Linux machine (using the 'top'
command). I then divided the total by the number of nodes to get an average
per node. I know this is not the most accurate method, but it is good enough
for a rough estimate.
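
For a tighter estimate than 'top', something like the following could be
used. This is only a sketch: it assumes a Python 3 interpreter where the
standard tracemalloc module is available, the helper name bytes_per_node is
made up for illustration, and the constructor arguments mirror the attributes
of my Node class. It counts only memory allocated through Python's allocator,
so the figures will not match what 'top' reports.

```python
import tracemalloc


class Node:
    def __init__(self, coordinates, index, sibNum, branchNum):
        self.coordinates = coordinates
        self.index = index
        self.sibNum = sibNum
        self.branchNum = branchNum


def bytes_per_node(count, dimension):
    """Average traced bytes per node (hypothetical helper, rough estimate)."""
    tracemalloc.start()
    before, _ = tracemalloc.get_traced_memory()
    # Each node gets its own two lists of length `dimension`.
    nodes = [
        Node([0.0] * dimension, [0] * dimension, i, i)
        for i in range(count)
    ]
    after, _ = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    # The list holding the nodes is included, which adds a few bytes per node.
    return (after - before) / len(nodes)


if __name__ == "__main__":
    for dim in (2, 4):
        print(dim, round(bytes_per_node(20000, dim)))
```

Running it for two values of "dimension" gives a per-node figure that can be
compared directly, without the interpreter's baseline memory mixed in.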

The Node class in my original code looks like this:

class Node:
     def __init__(self, coordinates, index, sibNum, branchNum):
             self.coordinates = coordinates
             self.index = index
             self.sibNum = sibNum
             self.branchNum = branchNum

# Here 'coordinates' and 'index' are lists of length "dimension", where
# "dimension" is a user input.
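
For comparison, here is a sketch of the same class with __slots__, as
suggested earlier in the thread. The names SlottedNode and shallow_size are
made up for illustration, and shallow_size only adds the instance and its
__dict__ (if any); it does not follow the lists or the other attribute values.

```python
import sys


class Node:
    """Plain class: every instance carries its own __dict__."""

    def __init__(self, coordinates, index, sibNum, branchNum):
        self.coordinates = coordinates
        self.index = index
        self.sibNum = sibNum
        self.branchNum = branchNum


class SlottedNode:
    """Same attributes, but stored in fixed slots instead of a __dict__."""

    __slots__ = ("coordinates", "index", "sibNum", "branchNum")

    def __init__(self, coordinates, index, sibNum, branchNum):
        self.coordinates = coordinates
        self.index = index
        self.sibNum = sibNum
        self.branchNum = branchNum


def shallow_size(obj):
    """Bytes for the instance plus its __dict__, if it has one."""
    size = sys.getsizeof(obj)
    if hasattr(obj, "__dict__"):
        size += sys.getsizeof(obj.__dict__)
    return size


if __name__ == "__main__":
    plain = Node([0.0, 0.0], [0, 0], 1, 2)
    slotted = SlottedNode([0.0, 0.0], [0, 0], 1, 2)
    print("plain:  ", shallow_size(plain))
    print("slotted:", shallow_size(slotted))
```

The exact numbers depend on the Python version and on 32-bit versus 64-bit
builds, so I have not listed any here.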

The most surprising result of the memory analysis was that the memory usage
barely depended on "dimension". It varied a little, but there was no
significant change in memory usage even when "dimension" was doubled.
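
One rough check that might help here is to look at how much the lists
themselves grow with "dimension". The sketch below assumes CPython, where a
list stores one pointer per element; list_overhead is a made-up helper name,
and it measures only the list object, not the elements it refers to.

```python
import sys


def list_overhead(dimension):
    """Shallow size in bytes of one list holding `dimension` references."""
    return sys.getsizeof([0] * dimension)


if __name__ == "__main__":
    for dim in (2, 4, 8):
        print(dim, list_overhead(dim))
```

If the growth per extra element turns out to be only a pointer or so, then
doubling a small "dimension" would add little compared with the fixed
per-node cost, which may be part of what I am seeing.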

-- Any clues?

Thank you for all your suggestions so far.

Regards.




On Tue, Jul 7, 2009 at 1:28 AM, Antoine Pitrou <solipsis at pitrou.net> wrote:

> mayank gupta <mooniitk <at> gmail.com> writes:
> >
> > After a little analysis, I found out that in general it uses about
> > 1.4 kb of memory for each node!!
>
> How did you measure memory use? Python objects are not very compact, but
> 1.4KB per object seems a bit too much (I would expect closer to 150-200
> bytes/object in 32-bit mode, or 300-400 bytes/object in 64-bit mode).
>
> One of the solutions is to use __slots__ as already suggested. Another,
> which will have similar benefits, is to use a namedtuple. Both suppress the
> instance dictionary (`instance`.__dict__), which is a major contributor to
> memory consumption. Illustration (64-bit mode, by the way):
>
> >>> import sys
> >>> from collections import namedtuple
>
> # First a normal class
> >>> class Node(object): pass
> ...
> >>> o = Node()
> >>> o.value = 1
> >>> o.children = ()
> >>>
> >>> sys.getsizeof(o)
> 64
> >>> sys.getsizeof(o.__dict__)
> 280
> # The object seems to take a mere 64 bytes, but the attribute dictionary
> # adds a whopping 280 bytes and bumps actual size to 344 bytes!
>
> # Now a namedtuple (a tuple subclass with property accessors for the
> # various tuple items)
> >>> Node = namedtuple("Node", "value children")
> >>>
> >>> o = Node(value=1, children=())
> >>> sys.getsizeof(o)
> 72
> >>> sys.getsizeof(o.__dict__)
> Traceback (most recent call last):
>  File "<stdin>", line 1, in <module>
> AttributeError: 'Node' object has no attribute '__dict__'
>
> # The object doesn't have a __dict__, so 72 bytes is its real total size.



-- 
I luv to walk in rain bcoz no one can see me crying