Python object overhead?

Jean-Paul Calderone exarkun at divmod.com
Sat Mar 24 17:03:33 EDT 2007


On 24 Mar 2007 13:52:46 -0700, 7stud <bbxx789_05ss at yahoo.com> wrote:
>On Mar 24, 2:19 pm, Jean-Paul Calderone <exar... at divmod.com> wrote:
>> Only one list is created.  It is used to define a C array where attributes
>> will be stored.  Each instance still has that C array, but it has much less
>> overhead than a Python list or dictionary.
>>
>
>It's all C underneath, right?  So a dictionary is what?  Two parallel
>C arrays: one array with the names and one with the values?
>
>When you do this:
>
>__slots__ = ["m", "n"]
>
>there still has to be a mapping between the names "m" and "n" and
>their values.  So aren't two parallel C arrays required again?
>

Not two *per instance* though.  The per-class overhead is roughly irrelevant
since the number of classes is generally small compared to the number of
instances.  If you have a circumstance where this isn't true, then it may
be worth considering per-class overhead rather than focusong on per-instance
overhead as I think most people in this thread have done.

>> Whether this reduction in overhead actually results in a useful or measurable
>> performance improvement is a separate question, of course.
>>
>
>I suppose a dictionary will be required to grow in size, so at
>periodic intervals new, bigger arrays will need to be created, the
>contents copied over, and the old arrays destroyed(similar to a
><vector> in C++).  And of course, the C code that accomplishes all
>that creating, copying and destroying will take up some memory.
>
>Is that what you are talking about when you say "overhead"?
>

I mostly had the Python object overhead and the overallocation overhead
in mind.  The code for resizing dicts is around whether you are using it
or not, after all.  You might want to take a look at the how dicts and
lists and __slots__ are implemented if you're interested in the details.

Jean-Paul



More information about the Python-list mailing list