[Python-Dev] Retrieve an arbitrary element from a set without removing it

geremy condra debatem1 at gmail.com
Fri Nov 6 06:04:30 CET 2009


On Thu, Nov 5, 2009 at 11:21 PM, John Arbash Meinel
<john.arbash.meinel at gmail.com> wrote:
> geremy condra wrote:
>> On Thu, Nov 5, 2009 at 4:09 PM, Alexander Belopolsky
>> <alexander.belopolsky at gmail.com> wrote:
>>> On Thu, Nov 5, 2009 at 3:43 PM, Chris Bergstresser <chris at subtlety.com> wrote:
>>>> .. and "x = iter(s).next()" raises a StopIteration
>>>> exception.
>>> And that's why the documented recipe should probably recommend
>>> next(iter(s), default) instead.  Especially because iter(s).next() is
>>> not even valid code in 3.0.
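For concreteness, here is a minimal sketch of that recipe; the second
argument to next() is whatever sentinel the caller wants back when the
set is empty:

```python
s = set(range(100))

# Py3-safe spelling: next() is a builtin, and the second argument
# is returned instead of raising StopIteration on an empty set.
x = next(iter(s), None)
print(x in s)  # True for any non-empty set

# The old iter(s).next() spelling breaks on Python 3, where the
# method was renamed __next__(); next(it, default) works on both.
empty = set()
print(next(iter(empty), "fallback"))  # prints: fallback
```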
>>
>> This seems reasonably legible to you? Strikes me as coding by
>> incantation. Also, while I've heard people say that the naive
>> approach is slower, I'm not getting that result. Here's my test:
>>
>>
>>>>> smrt = timeit.Timer("next(iter(s))", "s=set(range(100))")
>>>>> smrt.repeat(10)
>> [1.2845709323883057, 0.60247397422790527, 0.59621405601501465,
>> 0.59133195877075195, 0.58387589454650879, 0.56839084625244141,
>> 0.56839680671691895, 0.56877803802490234, 0.56905913352966309,
>> 0.56846404075622559]
>>
>>>>> naive = timeit.Timer("x=s.pop();s.add(x)", "s=set(range(100))")
>>>>> naive.repeat(10)
>> [0.93139314651489258, 0.53566789627075195, 0.53674602508544922,
>> 0.53608798980712891, 0.53634309768676758, 0.53557991981506348,
>> 0.53578495979309082, 0.53666114807128906, 0.53576493263244629,
>> 0.53491711616516113]
>>
>>
>> Perhaps my test is flawed in some way?
>>
>> Geremy Condra
>
> Well, it also uses a fairly trivial set. 'set(range(100))' is going to
> generally have 0 collisions and everything will hash to a unique bucket.
>  As such, popping an item out of the hash is a single "val = table[int &
> mask]; table[int & mask] = _dummy", and then looking it up again
> requires 2 table lookups (one finds the _dummy, the next finds that the
> chain is broken and can rewrite the _dummy.)
>
> However, if a set is more full, or has more collisions, or ... then
> pop() and add() become relatively more expensive.
>
> Surprising to me, is that "next(iter(s))" was actually slower than
> .pop() + .add() for 100 node set in my testing:
>
> $ alias TIMEIT="python -m timeit -s 's = set(range(100))'"
> $ TIMEIT "x = next(iter(s))"
> 1000000 loops, best of 3: 0.263 usec per loop
>
> $ TIMEIT "x = s.pop(); s.add(x)"
> 1000000 loops, best of 3: 0.217 usec per loop
>
> though both are much slower than the fastest we've found:
> $ TIMEIT "for x in s: break"
> 10000000 loops, best of 3: 0.0943 usec per loop
>
>
> now, what about a set with *lots* of collisions. Create 100 keys that
> all hash to the same bucket:
> $ alias TIMEIT="python -m timeit -s 's = set([x*1024*1024 for x in range(100)])'"
> $ TIMEIT "x = next(iter(s))"
> 1000000 loops, best of 3: 0.257 usec per loop
>
> $ TIMEIT "x = s.pop(); s.add(x)"
> 1000000 loops, best of 3: 0.218 usec per loop
>
> I tried a few different ways, and I got the same results, until I did:
>
> $ python -m timeit -s "s = set(range(100000, 1000100))" "next(iter(s))"
> 1000 loops, best of 3: 255 usec per loop
>
> ^^^^ Now something seems terribly wrong here. next(iter(s)) suddenly
> jumps up from being < 0.3 us, to being more than 200us. Or ~1000x slower.
>
> I realize the above has 900k keys, which is big. But 'next(iter(s))'
> should only touch 1, and, in fact, should always return the *first*
> entry. My best guess is just that the *first* entry in the internal set
> table is no longer close to offset 0, which means that 'next(iter(s))'
> has to evaluate a bunch of table slots before it finds a non-empty entry.
>
>
> Anyway, none of the proposals will really ever be faster than:
>  for x in s: break
>
> It is a bit ugly of a construct, but you don't have an attribute lookup,
> etc. As long as you don't do:
>  for x in s: pass
>
> Then it stays nice and fast.
>
> John
> =:->

Thanks for the info. Doing a bit more digging, it appears that taking
the attribute lookup out of the picture puts pop/add more or less on
par with the for x in s: break solution, with the former behaving more
predictably and the latter averaging marginally faster times. Either
way, the loss in readability isn't worth the minor performance
improvement to me.
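For anyone who wants to rerun the comparison, a small timeit harness
along these lines reproduces it in one place (absolute numbers will of
course vary by machine and interpreter):

```python
import timeit

SETUP = "s = set(range(100))"
candidates = {
    "next(iter(s))": "x = next(iter(s))",
    "pop()/add()":   "x = s.pop(); s.add(x)",
    "for/break":     "for x in s: break",
}

for name, stmt in candidates.items():
    # best-of-3 is the conventional way to read timeit results
    best = min(timeit.repeat(stmt, SETUP, repeat=3, number=100000))
    print("%-15s %.4f sec per 100k calls" % (name, best))
```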

Given that a set.pick method would incur half the lookup cost that
pop/add does, I think it's reasonable to say that even the fastest
proposed implementation won't differ all that greatly from the least
efficient proposal, and that it's therefore pointless to consider
set.pick from an optimisation standpoint. Aesthetics, of course, are
another matter altogether.
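A pure-Python stand-in for the proposed method (set.pick never made it
into the language, so the name here is hypothetical) is essentially the
for/break idiom wrapped in a function:

```python
def pick(s):
    """Return an arbitrary element of s without removing it.

    Stand-in for the proposed (never adopted) set.pick(); raises
    KeyError on an empty set, mirroring set.pop().
    """
    for x in s:
        return x
    raise KeyError("pick from an empty set")

s = set(range(100))
x = pick(s)
assert x in s and len(s) == 100  # the element stays in the set
```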

Geremy Condra

