[issue32846] Deletion of large sets of strings is extra slow

Serhiy Storchaka report at bugs.python.org
Thu Feb 15 09:25:51 EST 2018


Serhiy Storchaka <storchaka+cpython at gmail.com> added the comment:

For lists and dicts the time of deletion is virtually linear.

1 int   5.2   9.8
2 int   5.8   9.7
4 int   5.6   9.8
8 int   5.8   10.0
16 int   5.5   9.7
32 int   5.4   9.6
64 int   5.6   9.0
128 int   5.6   8.7
1 str   7.6   13.3
2 str   8.0   13.8
4 str   7.4   14.0
8 str   7.5   14.3
16 str   7.6   14.4
32 str   8.0   14.7
64 str   7.9   14.3
100 str   8.1   14.0

1 int   60.4   10.5
2 int   61.0   11.1
4 int   61.1   10.4
8 int   60.1   11.1
16 int   61.1   10.1
32 int   61.5   9.9
64 int   60.7   9.6
128 int   60.8   9.4
1 str   204.9   15.4
2 str   247.4   15.8
4 str   275.3   16.1
8 str   299.3   14.8
16 str   312.8   14.8
32 str   329.2   14.2
64 str   344.5   14.1
100 str   386.2   14.4

(third and forth columns are time in nanoseconds per item, for creation and deletion)

But when shuffle the input data before creating a collection, the deletion time becomes superlinear (and much slower).

1 int   17.4   38.9
2 int   19.1   41.3
4 int   24.4   46.3
8 int   28.0   49.5
16 int   28.1   53.4
32 int   29.8   67.2
64 int   35.2   90.6
128 int   42.6   143.1
1 str   21.1   56.3
2 str   24.3   58.2
4 str   27.1   63.6
8 str   27.3   73.9
16 str   29.4   99.2
32 str   34.3   144.8
64 str   41.8   229.5
100 str   46.3   338.4

In former examples (list, ordered dict) objects are iterated and deleted in order of creation. But in the set of strings items are deleted in non-predicable random order. I suppose the non-linear effect is related to memory management.

----------
nosy: +lemburg, tim.peters, twouters

_______________________________________
Python tracker <report at bugs.python.org>
<https://bugs.python.org/issue32846>
_______________________________________


More information about the Python-bugs-list mailing list