Why do custom objects take so much memory?

Hrvoje Niksic hniksic at xemacs.org
Wed Dec 19 03:04:01 EST 2007


Steven D'Aprano <steven at REMOVE.THIS.cybersource.com.au> writes:

> On Tue, 18 Dec 2007 21:13:14 +0100, Hrvoje Niksic wrote:
>
>> Each object takes 36 bytes itself: 4 bytes refcount + 4 bytes type ptr +
>> 4 bytes dict ptr + 4 bytes weakptr + 12 bytes gc overhead.  That's not
>> counting malloc overhead, which should be low since objects aren't
>> malloced individually.  Each object requires a dict, which consumes
>> additional 52 bytes of memory (40 bytes for the dict struct plus 12 for
>> gc).  That's 88 bytes per object, not counting malloc overhead.
>
> And let's not forget that if you're running on a 64-bit system, you
> can double the size of every pointer.

And the size of Py_ssize_t's and longs, and the padding added around
ints placed between two pointers.  Also note the price of 8-byte
struct alignment.
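A quick way to check the pointer width on a given build is the struct
module.  This sketch is not from the original thread; it just confirms
whether every PyObject* field costs 4 or 8 bytes on your interpreter:

```python
import struct
import sys

# Pointer width in bytes: 4 on 32-bit builds, 8 on 64-bit builds,
# so every pointer field in an object header doubles on 64 bits.
print(struct.calcsize("P"))

# sys.maxsize (Python 2.6+) is another quick 32-bit vs. 64-bit check:
# it exceeds 2**32 only on 64-bit builds.
print(sys.maxsize > 2 ** 32)
```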

> Is there a canonical list of how much memory Python objects take up?
> Or a canonical algorithm?
>
> Or failing either of those, a good heuristic?

For built-in types, you need to look at the code of each individual
object.  For user types, you can approximate by calculations such as
the above.
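On Python 2.6 and later (so, after the version under discussion here),
sys.getsizeof makes this kind of estimate directly.  The exact figures
vary by platform and interpreter version, so treat this as a sketch
rather than canonical numbers:

```python
import sys

class Point(object):
    """A minimal user-defined class with a per-instance __dict__."""
    def __init__(self, x, y):
        self.x = x
        self.y = y

p = Point(1, 2)

# Shallow size of the instance struct itself (refcount, type pointer,
# dict pointer, weakref pointer) -- the attribute dict is not included.
print(sys.getsizeof(p))

# The per-instance __dict__ is a separate allocation on top of that.
print(sys.getsizeof(p.__dict__))
```

Note that sys.getsizeof reports only the shallow size of each object;
summing the instance and its __dict__ still ignores malloc overhead
and the values the dict refers to.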

>> Then there's string allocation: your average string is 6 chars
>> long; add to that one additional char for the terminating zero.
>
> Are you sure about that? If Python strings are zero terminated, how
> does Python deal with this?
>
>>>> 'a\0string'[1]
> '\x00'

Python strings are zero-terminated so that the pointer to a string's
data can be passed directly to various C APIs (this is standard
practice; C++ strings do it too).  Python doesn't rely on zero
termination to calculate string length, so len('a\0string') will do
the right thing, but the string will internally store 'a\0string\0'.
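This is easy to verify: the length is stored in the string object
rather than derived from the terminator, so an embedded NUL behaves
like any other character (a quick check, not from the original post):

```python
s = 'a\0string'

# The stored length counts the embedded NUL as an ordinary character.
print(len(s))       # 8

# Indexing and slicing past the NUL work fine; nothing is truncated.
print(repr(s[1]))   # '\x00'
print(s[2:])        # 'string'
```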
