frozendict: an experiment

Mon Jul 20 16:07:07 EDT 2020

I just finished to improve the performance of frozendict creation. The
result is very promising.

The speedup is about 30% for small dicts (8 items). For large dicts (1k
items) is about 38% for dicts with only integers as keys and values, and
45% for dicts with only strings.

There's also a little speedup in iteration.

Here's the code:

https://github.com/Marco-Sulla/cpython/commit/8d6c7f727a55d9d922c8a3a755fcb6c68ed26197

The benchmark output:

Name: `constructor(d)`;         Size:    8; Keys:      int; Type:
dict; Time: 2.956e-07
Name: `constructor(d)`;         Size:    8; Keys:      int; Type:
frozendict; Time: 2.100e-07
////////////////////////////////////////////////////////////////////////////////
Name: `constructor(d)`;         Size:    8; Keys:      str; Type:
dict; Time: 2.922e-07
Name: `constructor(d)`;         Size:    8; Keys:      str; Type:
frozendict; Time: 2.095e-07
////////////////////////////////////////////////////////////////////////////////
Name: `constructor(d)`;         Size: 1000; Keys:      int; Type:
dict; Time: 2.401e-05
Name: `constructor(d)`;         Size: 1000; Keys:      int; Type:
frozendict; Time: 1.309e-05
////////////////////////////////////////////////////////////////////////////////
Name: `constructor(d)`;         Size: 1000; Keys:      str; Type:
dict; Time: 1.822e-05
Name: `constructor(d)`;         Size: 1000; Keys:      str; Type:
frozendict; Time: 1.115e-05

You can find the tests and the benchmark code here:
https://github.com/Marco-Sulla/cpython/tree/master/frozendict/test

Here is the entire output of the benchmark:
https://github.com/Marco-Sulla/cpython/blob/master/frozendict/test/bench.txt

Notice that d[key] and the constructor(pairs_sequence) is a little slower.
I didn't touch them, so probably this is because I changed frozendict to a
split dict. So I suppose there's a lot of room for other improvements.

Equality check can be improved too, since the object address can be checked
first.

I think the major speedup is due the fact I implemented the solution with a
split dict, as suggested by Inada. Even if some of the improvements could
be applied to dict too, combined dicts could be more efficient and easy to
use for sorting, insertion and deletion.