Dictionaries as records

John Roth johnroth at ameritech.net
Wed Dec 19 00:52:31 EST 2001


"Bill Wilkinson" <bwilk_97 at yahoo.com> wrote in message
news:QKQT7.27703$t07.3920324 at twister.midsouth.rr.com...
> I have been happily using a list of dictionaries to hold table data for
> years. For the first time, this method is proving less than efficient
> because of the amount of memory overhead the dictionaries produce. I have
> a file with 200K records and 16 fields. This file is parsed and each row
> is put into a dictionary and the dictionary is added to a list. The raw
> file is only about 50MB.
>
> I was shocked to see that my memory use jumped to 500MB! When I delete
> the list the memory is returned to the system, so I know that the memory
> is being used in the dictionaries.

Your records seem to average around 250 bytes (50MB over 200K records),
so that's about 16 characters per field. The header overhead for each
object is roughly that size again (it might be larger; I haven't looked
at the headers recently).
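
One way to check that estimate on a current CPython is to ask the
interpreter directly. This is only a sketch: the 16-character value is
made up to match the average above, and the exact numbers depend on the
Python version and platform.

```python
import sys

# Hypothetical 16-character field value, matching the estimated average.
field = "x" * 16

# sys.getsizeof reports the size of the whole object, header included.
total = sys.getsizeof(field)
overhead = total - len(field)

print(f"payload: {len(field)} bytes")
print(f"object:  {total} bytes")
print(f"header:  {overhead} bytes")
```

Multiply the header figure by 16 fields and 200K records to see how much
of the total goes to headers rather than data.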

> What strikes me as odd is that I can create a list of 200K dictionaries
> with test data (a copy of the same record over and over) and the amount
> of memory used is only half.

The problem with this is that you may be reusing the same objects in your
test. That's really easy to do unless you take precautions to ensure that
you get unique objects. If that turns out to be the case, then you've got
a solid 250MB of overhead for the dictionaries themselves. This actually
seems reasonable: your 50MB of data would expand to around 100MB when the
object headers are added, and you would get about the same amount again
for the keys unless you made sure that you reused the same key objects.
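
Here is a rough sketch of the difference, written for a current Python.
HEADER, ROW, make_record and distinct_key_objects are names I made up for
the illustration, and sys.intern is just one way of getting shared key
objects; I'm assuming the keys are parsed from text for every row, which
may not match what your code does.

```python
import sys

# Hypothetical tab-separated header and data row with 16 fields each.
HEADER = "\t".join(f"field{i:02d}" for i in range(16))
ROW = "\t".join(f"value{i:011d}" for i in range(16))

def make_record(share_keys):
    """Build one record dict, parsing the key names from text each time."""
    keys = HEADER.split("\t")  # new string objects on every call
    if share_keys:
        # Map each key to a single shared string object.
        keys = [sys.intern(k) for k in keys]
    return dict(zip(keys, ROW.split("\t")))

def distinct_key_objects(records):
    """Count how many separate key objects the records hold."""
    return len({id(k) for rec in records for k in rec})

unshared = [make_record(False) for _ in range(1000)]
shared = [make_record(True) for _ in range(1000)]
# Copying one record over and over reuses its key and value objects.
copies = [dict(unshared[0]) for _ in range(1000)]

print(distinct_key_objects(unshared))
print(distinct_key_objects(shared))
print(distinct_key_objects(copies))
```

The copied test data behaves like the shared case, which would explain
why the test uses less memory than the real file.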

> Having read many of the articles on this newsgroup about how
> dictionaries are sized, I am aware of some of the memory issues involved
> with using a great number of dictionaries as I am.
>
> Can someone who has faced this issue and found a workaround please fill
> me in. I know one can use a list of lists or a list of tuples, but had
> rather stick to the dictionaries because of some library issues.
>
> Thanks in advance,
>
> Bill

John Roth