[New-bugs-announce] [issue30907] speed up comparisons to self for built-in containers

Wouter Bolsterlee report at bugs.python.org
Wed Jul 12 09:15:31 EDT 2017


New submission from Wouter Bolsterlee:

when comparing instances of the built-in container types (dict, list, and others) python assumes that "identity implies equality". this is documented (and assumed) behaviour:


"In enforcing reflexivity of elements, the comparison of collections assumes that for a collection element x, x == x is always true. Based on that assumption, element identity is compared first, and element comparison is performed only for distinct elements."

https://docs.python.org/3/reference/expressions.html#value-comparisons

because of this, various operations such as rich comparison of lists can be sped up by checking for identity first, and only falling back to (possibly) much slower rich comparisons on container elements when two elements are not identical (i.e. id(a) == id(b)).

because identity implies equality, comparing a container instance to itself is guaranteed to be true.

currently, comparing a list to itself takes O(n) time, which can be measured easily:

>>> x = list(range(1000))
>>> %timeit x == x
2.95 µs ± 19.2 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)

>>> y = list(range(100000))
>>> %timeit y == y
293 µs ± 3.35 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)

the second example is 100 times as slow.

i will shortly submit a few pull request to make "compare to self" run in O(1) time instead for a few of the built-in containers.

----------
components: Interpreter Core
messages: 298213
nosy: wbolster
priority: normal
severity: normal
status: open
title: speed up comparisons to self for built-in containers
versions: Python 3.7

_______________________________________
Python tracker <report at bugs.python.org>
<http://bugs.python.org/issue30907>
_______________________________________


More information about the New-bugs-announce mailing list