Why do custom objects take so much memory?

Hrvoje Niksic hniksic at xemacs.org
Tue Dec 18 15:13:14 EST 2007


jsanshef <jsanpedro at gmail.com> writes:

> That means every object is around 223 bytes in size!!!! That's too
> much considering it only contains a string with a maximum size of 7
> chars.

The list itself consumes 4 MB because it stores 1 million PyObject
pointers.  It possibly consumes more due to overallocation, but let's
ignore that.
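You can see the pointer-array cost directly with sys.getsizeof
(available from Python 2.6 on); note that on a 64-bit build each slot
is 8 bytes rather than the 4 assumed above, so the numbers come out
larger:

```python
import sys

# A list stores one pointer per element plus a small fixed header.
# [None] * n allocates exactly n slots (no overallocation here).
lst = [None] * (1024 * 1024)
print(sys.getsizeof(lst))  # header + one pointer per element
```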

Each object takes 36 bytes itself: 4 bytes refcount + 4 bytes type ptr
+ 4 bytes dict ptr + 4 bytes weakptr + 12 bytes gc overhead.  That's
not counting malloc overhead, which should be low since objects aren't
malloced individually.  Each object also requires a dict, which
consumes an additional 52 bytes of memory (40 bytes for the dict
struct plus 12 for gc).  That's 88 bytes per object, not counting
malloc overhead.
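A rough way to check the instance-plus-dict accounting yourself (the
class here is a hypothetical stand-in, since the original code wasn't
posted; sys.getsizeof needs Python 2.6+ and reports larger figures on
64-bit builds and newer interpreters than the 32-bit numbers above):

```python
import sys

# Hypothetical stand-in for the original poster's class.
class MyClass(object):
    def __init__(self, s):
        self.mystring = s

obj = MyClass("abcdef")
print(sys.getsizeof(obj))           # the bare instance struct
print(sys.getsizeof(obj.__dict__))  # plus its per-instance dict
```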

Then there's string allocation: your average string is 6 chars long;
add to that one additional char for the terminating zero.  The string
struct takes up 20 bytes + string length, rounded up to the nearest
alignment boundary.  For your average case, that's 27 bytes, rounded
(I assume) to 28.
You also allocate 1024*1024 integers which are never freed (they're
kept on a free list), and each of which takes up at least 12 bytes.
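The fixed header vs. per-character split is easy to observe; keep in
mind the string layout has changed across Python versions, so the
20-byte header above is specific to the Python 2 str of the day:

```python
import sys

# Fixed per-string header vs. marginal cost per character.
print(sys.getsizeof(""))        # the header alone
print(sys.getsizeof("abcdef"))  # header + 6 characters
```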

All that adds up to 128 bytes per object, dispersed over several
different object types.  It doesn't surprise me that Python is eating
200+ MB of memory.

> So, what's exactly going on behind the scenes? Why is using custom
> objects SO expensive? What other ways of creating structures can be
> used (cheaper in memory usage)?
>
> Thanks a lot in advance!

Use a new-style class and set __slots__:

class MyClass(object):
    __slots__ = ('mystring',)
    def __init__(self, s):
        self.mystring = s

That brings memory consumption down to ~80MB, by cutting down the
size of each instance and removing its per-instance dict.
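A quick sketch of the saving, comparing a plain class against a
slotted one (getsizeof ignores the string's own size either way, so
this isolates the per-object overhead):

```python
import sys

class WithDict(object):
    def __init__(self, s):
        self.mystring = s

class WithSlots(object):
    __slots__ = ('mystring',)
    def __init__(self, s):
        self.mystring = s

a = WithDict("abcdef")
b = WithSlots("abcdef")

# The slotted instance stores the attribute directly in the object
# struct, so there is no per-instance dict at all.
total_dict = sys.getsizeof(a) + sys.getsizeof(a.__dict__)
total_slots = sys.getsizeof(b)
print(total_dict, total_slots)
```

The slotted instance also loses __weakref__ support unless you list
it in __slots__ explicitly, which is part of where the saving comes
from.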
