[Python-ideas] set.add() return value

Ralf W. Grosse-Kunstleve rwgk at yahoo.com
Fri Feb 13 07:46:10 CET 2009


> It's true that the results you found aren't consistent with O(1),
> but as I understand it, Python dicts are O(1) amortized ("on average
> over the long term"). Sometimes dicts resize, which is not a constant
> time operation, and sometimes the dict has to walk a short linked list,
> which depends on the proportion of hashes that lead to a collisions.

Thanks for the insight. I didn't know such a process is considered
O(1). I agree it is fair because in practice it seems to work very
well, but the "collisions" part can turn it into O(N) as demonstrated
(in a crude way) by the script below. Therefore the O(1) classification
is a bit misleading.

My other script was simplified. I did look at more data points. The
curve is amazingly flat but not a constant function.

It is frustrating that simple requests like wanting a useful True or
False, that costs nothing, instead of a useless None, tend to digress
like this ...


import os

class normal(object):
  def __init__(O, value):
    O.value = value
  def __hash__(O):
    return O.value.__hash__()

class bad(object):
  def __init__(O, value):
    O.value = value
  def __hash__(O):
    return 0

for N in [1000, 10000]:
  t0 = os.times()[0]
  s = set([normal(i) for i in xrange(N)])
  print os.times()[0]-t0
  assert len(s) == N
  t0 = os.times()[0]
  s = set([bad(i) for i in xrange(N)])
  print os.times()[0]-t0
  assert len(s) == N



More information about the Python-ideas mailing list