Memory Allocation?

Dima Dorfman dima at trit.invalid
Mon Feb 7 20:01:13 EST 2005


On 2005-02-07, Chris S. <chrisks at NOSPAM.udel.edu> wrote:
> Is it possible to determine how much memory is allocated by an arbitrary 
> Python object?

The hardest part about answering this question is figuring out what
you want to count as being allocated for a particular object. That
varies widely depending on why you need this information, which is why
there isn't a "getobsize" or similar call.

For example, consider a dict. The expression

  x = {'alpha': 1, 'beta': 2}

allocates a dict object, a hash table to hold the values [1], and the
strings and integers [2]. So how much memory is used by x? The dict
object structure and hash table obviously belong to x. How about the
contents? Those can be shared with other objects. If you count them,
the answer is misleading because deleting x won't free up that much
memory; but if you don't count them, then your answer isn't very
useful because the bulk of the object is probably the contents.
Another possibility is to count how much memory would be released if
the object were to go away, but this requires inspecting the reference
count of every object that can be reached from the subject, and it
might still be wrong if there are cycles.
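
To make the sharing problem concrete, here is a minimal sketch,
assuming CPython (where sys.getrefcount is available). The function
name and the sample data are made up for illustration. Deleting the
dict drops only one reference to its contents; the second container
keeps them alive, so none of that memory is freed.

```python
import sys


def refcount_drop_when_container_deleted():
    """Return (before, after) reference counts of a shared object."""
    shared = ['contents', 'that', 'two', 'containers', 'refer', 'to']
    x = {'alpha': shared}
    y = [shared]  # a second container holding the same object

    before = sys.getrefcount(shared)
    del x  # frees the dict itself, but not the shared list
    after = sys.getrefcount(shared)

    assert y[0] is shared  # still alive and reachable through y
    return before, after


before, after = refcount_drop_when_container_deleted()
print(before - after)
```

The absolute counts vary between interpreter versions, so only the
difference between the two is meaningful here.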

I expect that if you want to know the size, you can decide on the
semantics you want for your application. Only you know which parts of
the object really count toward its size (e.g., the names of attributes
on your object probably don't count). But the answer won't be the same
for my application, so a generic "getobsize" doesn't help.

Once you know what you want, it's pretty easy to get an estimate.
You'll have to make some assumptions about Python internals, and the
result won't be portable across versions. Write a function that
dispatches on the type of its argument. For simple objects, return
their basic size; for containers, return the sum of the basic size,
auxiliary storage size, and sizes of the contents (call your function
on each of them). For any object, the basic size is
type(x).__basicsize__. The size of the auxiliary storage depends on
the object and probably on the length of its contents; e.g., a list
allocates 4 bytes per element, and a dict 12 bytes per slot [3]. Many
containers overallocate, so you have to account for that too.
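
A minimal sketch of such a function, under stated assumptions: the
name estimate_size and the two constants are hypothetical, the
per-item costs are the CPython 2.4 numbers quoted above, overallocation
is ignored, and only list and dict are treated as containers. The
_seen set is one possible choice of semantics (count each reachable
object once, which also keeps cycles from recursing forever); your
application may want something else.

```python
# Auxiliary-storage costs quoted in the post for CPython 2.4. Treat them
# as assumptions: they change with interpreter version and pointer size.
LIST_BYTES_PER_ELEMENT = 4
DICT_BYTES_PER_SLOT = 12


def estimate_size(obj, _seen=None):
    """Rough size of obj in bytes: basic size + aux storage + contents."""
    if _seen is None:
        _seen = set()
    if id(obj) in _seen:
        # Already counted (shared object or cycle); do not count it twice.
        return 0
    _seen.add(id(obj))

    # Every object has at least the basic size of its type.
    size = type(obj).__basicsize__

    if isinstance(obj, dict):
        # One slot per entry; overallocation of the table is ignored.
        size += DICT_BYTES_PER_SLOT * len(obj)
        for key, value in obj.items():
            size += estimate_size(key, _seen)
            size += estimate_size(value, _seen)
    elif isinstance(obj, list):
        # One pointer per element; overallocation is ignored.
        size += LIST_BYTES_PER_ELEMENT * len(obj)
        for item in obj:
            size += estimate_size(item, _seen)

    return size


x = {'alpha': 1, 'beta': 2}
print(estimate_size(x))
```

Other types fall through to their basic size, which undercounts
variable-length objects such as strings; add a branch for each type
you care about.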

The only thing the interpreter could really help with is a way to ask
an object how much auxiliary memory it has allocated. Only the type's
own implementation knows the answer, so only the builtin types could
provide it. Having that would save you from having to know, for
example, that a dict allocates 12 bytes per slot. Nothing else needs
the interpreter's support, and pure Python works just as well in your
module as in the standard library (that said, if you do write it, I'm
sure others would find it useful--you aren't the first one with a
desire for this kind of information).
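
For what it's worth, CPython releases newer than the 2.4 discussed
here (2.6 and up) do let you ask: objects have a __sizeof__ method, and
sys.getsizeof wraps it. A minimal sketch, assuming such an interpreter;
the helper name aux_size is hypothetical, and the subtraction is only
meaningful for containers like list and dict, not for arbitrary
objects.

```python
import struct

POINTER_SIZE = struct.calcsize('P')  # bytes per pointer on this build


def aux_size(container):
    """Bytes a list or dict reports beyond its type's basic size.

    Relies on __sizeof__, which CPython 2.6+ provides; meant for lists
    and dicts only, not for arbitrary objects.
    """
    return container.__sizeof__() - type(container).__basicsize__


small = aux_size([])
large = aux_size([None] * 1000)
print(small, large)
```

This reports the object's own storage only; the contents still have to
be walked separately, under whatever semantics you picked.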

(The above is about CPython. Jython probably has its own set of issues.)

Dima.

[1] Small dicts don't need this extra allocation. We can ignore that
    for now.

[2] Assuming, for the moment, that this isn't a constant expression.
    When it's a constant, the integers are preloaded as constants in
    the code object. Right now, that's an unnecessary complication.

[3] Numbers for CPython 2.4 on a 32-bit platform.


