[Tutor] Why do sets use 6 times as much memory as lists?

eryksun eryksun at gmail.com
Wed May 8 03:30:21 CEST 2013


On Tue, May 7, 2013 at 2:09 PM, Bjorn Madsen
<bjorn.h.madsen at googlemail.com> wrote:
>
> "Being curious" I actually expected something like this:
>>>> L={x:None for x in range(10000)}
>>>> sys.getsizeof(L)
> 196660
>
> That is why I wondered why 6 times is a lot, given that a dict can do the
> same at 3/4 of the mem-footprint. I was just surprised by the overhead,
> as I assumed sets were more lightweight than dictionaries.

Minus the basic object size, for the same table size a set should be
about 2/3 the size of a dict: each set entry stores a (hash, key)
pair, while each dict entry stores a (hash, key, value) triple.
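You can compare the footprints directly with sys.getsizeof. A rough
sketch (the exact byte counts depend on the Python version, build, and
pointer width, so only the relative pattern is meaningful):

```python
import sys

n = 10000
lst = list(range(n))
st = set(lst)
dct = {x: None for x in lst}

# A list stores one pointer per element; a set or dict also stores
# each entry's cached hash and keeps its hash table sparse, so both
# are considerably larger than the list.
for name, obj in [("list", lst), ("set", st), ("dict", dct)]:
    print(f"{name:5} {sys.getsizeof(obj):>8} bytes")
```

Note that getsizeof only counts the container itself, not the objects
it refers to.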

I suppose you're using Python 3.3. To reduce memory usage, the dict
type in 3.3 allows splitting tables to share keys/hashes among
instances:

http://www.python.org/dev/peps/pep-0412

That's not directly related to the size difference here, but at the
same time it was decided to also cut the initial growth rate to save
memory. In previous versions when the table is 2/3 full it gets
quadrupled in size, which slows to doubling after 50,000 active keys.
In 3.3 a dict always uses doubling.
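You can watch the resizes happen by sampling sys.getsizeof after each
insert. This is just an observational sketch: the thresholds and
growth factor differ between versions, so only the stair-step pattern
(flat, then a jump at a resize) is meaningful:

```python
import sys

# Record every point at which the dict's allocated size changes.
d = {}
last = sys.getsizeof(d)
jumps = []
for i in range(1000):
    d[i] = None
    size = sys.getsizeof(d)
    if size != last:
        # (number of keys at the jump, new allocated size)
        jumps.append((i + 1, size))
        last = size
print(jumps)
```

The ratio between successive jump sizes reveals the growth factor in
use on your interpreter.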

So in 3.2 a fresh dict or set with 10,000 entries will have a table
sized 32768, while in 3.3 the table is sized 16384 for a dict and
32768 for a set. Keep in mind that as keys are added and removed, the
table sizes will diverge, even for the same number of active keys.
Dummy keys left in the table count toward the table utilization, but
they get purged by a resize.
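The dummy-key behavior is easy to demonstrate in CPython: deleting
elements leaves dummies in place rather than shrinking the table, so
the reported size is unchanged until a later insert forces a resize.
A small sketch (assuming CPython's getsizeof semantics):

```python
import sys

# Discarding elements never shrinks a set's table; it only leaves
# dummy entries behind, so getsizeof reports the same allocation.
s = set(range(10000))
before = sys.getsizeof(s)
for x in range(9900):
    s.discard(x)
after = sys.getsizeof(s)
print(before, after, len(s))
```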

