[issue33205] GROWTH_RATE prevents dict shrinking

Inada Naoki report at bugs.python.org
Wed Jan 26 03:54:49 EST 2022


Inada Naoki <songofacandy at gmail.com> added the comment:

We do not have *fill* since Python 3.6.
There is a `dk_nentries` instead. But when `insertion_resize()` is called, `dk_nentries` is equal to `USABLE_FRACTION(dk_size)` (dk_size is `1 << dk_log2_size` for now). So it is different from *fill* in the old dict.

I chose `dk_used*3` as GROWTH_RATE because it reserves more spaces when there are dummies than when there is no dummy, as I described in the same comment:

> In case of dict growing without deletion, dk_size is doubled for each resize as current behavior.
> When there are deletion, dk_size is growing aggressively than Python 3.3 (used*2 -> used*3).  And it allows dict shrinking after massive deletions.

For example, when current dk_size == 16 and USABLE_FRACTION(dk_size) == 10, new dk_size is:

* used = 10 (dummy=0) -> 32 (31.25%)
* used = 9 (dummy=1)  -> 32 (28.125%)
(snip)
* used = 6 (dummy=4)  -> 32 (18.75%)
* used = 5 (dummy=5)  -> 16 (31.25%)
* used = 4 (dummy=6)  -> 16 (25%)
(snip)
* used = 2 (dummy=8)  -> 8 (25%)

As you can see, dict is more sparse when there is dummy than when there is no dummy, except used=5/dummy=5 case.

There may be a small room for improvement, especially for `used=5/dummy=5` case. But I am not sure it is worth enough to use more complex GROWTH_RATE than used*3.
Any good idea?

----------

_______________________________________
Python tracker <report at bugs.python.org>
<https://bugs.python.org/issue33205>
_______________________________________


More information about the Python-bugs-list mailing list