Retrieving an object from a set

Steven D'Aprano steve+comp.lang.python at pearwood.info
Sat Jan 26 01:25:12 EST 2013


Arnaud Delobelle wrote:

> Dear Pythoneers,
> 
> I've got a seemingly simple problem, but for which I cannot find a
> simple solution.
> 
> I have a set of objects (say S) containing an object which is equal to
> a given object (say x). So
> 
>     x in S
> 
> is true.  So there is an object y in S which is equal to x.  My
> problem is how to retrieve y, without going through the whole set.

Why do you care? Since x == y, what benefit do you get from extracting the
actual object y?

I'm not necessarily saying that you *should not* care, only that it is a
very rare occurrence. The only thing I can think of is interning objects,
which is best done with a dict, not a set:


CACHE = {}

def intern(obj):
    return CACHE.setdefault(obj, obj)


which you could then use like this:

py> s = "hello world"
py> intern(s)
'hello world'
py> t = 'hello world'
py> t is s
False
py> intern(t) is s
True
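
The same dict trick also answers the original question in constant time,
provided you are free to keep a dict alongside (or instead of) the set. A
minimal sketch, where the names `canonical` and `remember` are made up for
illustration and are not part of any library:

```python
# Hypothetical helper names; a sketch under stated assumptions, not a library API.
canonical = {}

def remember(obj):
    """Store obj as its own canonical representative and return the stored one."""
    return canonical.setdefault(obj, obj)

# Build the tuples at run time so that x and y are distinct but equal objects.
y = tuple([1, 2, "hello world"])
x = tuple([1, 2, "hello world"])

remember(y)
found = canonical[x]  # average-case O(1) lookup by equality; gives back y
```

Because each key maps to itself, looking up any equal object hands back the
one that was stored first.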


However, there aren't very many cases where doing this is actually helpful.
Under normal circumstances, object equality is much more important than
identity, and if you find that identity is important to you, then you
probably should rethink your code.

So... now that I've told you why you shouldn't do it, here's how you can do
it anyway:

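def retrieve(S, x):
    """Return the object in set S which is equal to x.

    Raise KeyError if S contains no such object.
    """
    s = set(S)  # make a copy of S
    s.discard(x)  # drop the element equal to x from the copy, if present
    t = S.difference(s)  # only the dropped element (from S) can remain
    if t:
        return t.pop()
    raise KeyError('not found')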


S = set(range(10))
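# Build the tuples at run time: equal tuple literals in one script may be
# merged into a single object by the compiler, which would defeat the test.
y = tuple([1, 2, "hello world"])
x = tuple([1, 2, "hello world"])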
assert x is not y and x == y
S.add(y)
z = retrieve(S, x)
assert z is y


By the way, since this makes a copy of the set, it is O(n). The
straightforward approach:

for element in S:
    if x == element:
        x = element
        break

is also O(n), but with less overhead, since it makes no copy and stops at
the first match. On the other hand, the retrieve function above does most of
its work in C, while the straightforward loop is pure Python, so it's
difficult to say which will be faster. I suggest you time them and see.
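
One way to run that comparison is with the standard timeit module. This is
only a sketch: the name `retrieve_loop`, the set size of 10000 and the
repeat count of 100 are arbitrary choices of mine, and the result will vary
with the set's size, where the match happens to fall in iteration order, and
the Python version.

```python
import timeit

def retrieve(S, x):
    """Set-difference approach: copies S, so always touches every element."""
    s = set(S)
    s.discard(x)
    t = S.difference(s)
    if t:
        return t.pop()
    raise KeyError('not found')

def retrieve_loop(S, x):
    """Plain loop (hypothetical name): stops at the first equal element."""
    for element in S:
        if x == element:
            return element
    raise KeyError('not found')

# Arbitrary test data: 10000 ints plus one tuple to look up.
S = set(range(10000))
y = tuple([1, 2, "hello world"])  # built at run time so x is not y
x = tuple([1, 2, "hello world"])
S.add(y)

for func in (retrieve, retrieve_loop):
    seconds = timeit.timeit(lambda: func(S, x), number=100)
    print(func.__name__, seconds)
```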




-- 
Steven



