Py2.3: Feedback on Sets

Raymond Hettinger vze4rx4y at verizon.net
Mon Aug 18 03:35:25 EDT 2003


> [Beni Cherniavsky]
> >> `isdisjoint` sounds like a good idea since it can be implemented
> >> more effeciently than ``not a & b``.
>
> [Raymond Hettinger]
> > True enough.  It is easy to putogether an early-out algorithm using
> > itertools.  So, it can be done a C speed.  The question is whether
> > there are sufficient use cases to warrant expanding an already fat API.

[Tim Peters]
> Early-out algorithms consume time to determine when the early-out condition
> obtains, and so can be a net loss (via consuming more time testing for
> early-out than is saved by getting out early).  See the long comment block
> before _siftup in heapq.py for a classic case where early-out usually loses
> big.  For isdisjoint() it's muddier (depends on whether your data is such
> that most sets fed to it are or aren't disjoint; if they're usually
> disjoint, code testing for early-out is a waste of time).

In this case, early-out is sometimes faster and is never slower
than bool(a&b).  This reason is that they both do exactly the
same steps until encountering the first common element; no extra
work is required for the early out test.  Instead, the for-loop just bails-out:

    def isdisjoint(self, other):
        """Return True if the sets do not share any common elements"""
        if len(self) <= len(other):
            little, big = self, other
        else:
            little, big = other, self
        for elem in ifilter(big._data.has_key, little):
            return False
        return True


Raymond Hettinger






More information about the Python-list mailing list