Py2.3: Feedback on Sets
Raymond Hettinger
vze4rx4y at verizon.net
Mon Aug 18 03:35:25 EDT 2003
> [Beni Cherniavsky]
> >> `isdisjoint` sounds like a good idea since it can be implemented
> >> more effeciently than ``not a & b``.
>
> [Raymond Hettinger]
> > True enough. It is easy to putogether an early-out algorithm using
> > itertools. So, it can be done a C speed. The question is whether
> > there are sufficient use cases to warrant expanding an already fat API.
[Tim Peters]
> Early-out algorithms consume time to determine when the early-out condition
> obtains, and so can be a net loss (via consuming more time testing for
> early-out than is saved by getting out early). See the long comment block
> before _siftup in heapq.py for a classic case where early-out usually loses
> big. For isdisjoint() it's muddier (depends on whether your data is such
> that most sets fed to it are or aren't disjoint; if they're usually
> disjoint, code testing for early-out is a waste of time).
In this case, early-out is sometimes faster and is never slower
than bool(a&b). This reason is that they both do exactly the
same steps until encountering the first common element; no extra
work is required for the early out test. Instead, the for-loop just bails-out:
def isdisjoint(self, other):
"""Return True if the sets do not share any common elements"""
if len(self) <= len(other):
little, big = self, other
else:
little, big = other, self
for elem in ifilter(big._data.has_key, little):
return False
return True
Raymond Hettinger
More information about the Python-list
mailing list