Performance of list vs. set equality operations

Patrick Maupin pmaupin at gmail.com
Wed Apr 7 21:53:57 EDT 2010


On Apr 7, 8:41 pm, Steven D'Aprano
<ste... at REMOVE.THIS.cybersource.com.au> wrote:
> On Wed, 07 Apr 2010 10:55:10 -0700, Raymond Hettinger wrote:
> > [Gustavo Nare]
> >> In other words: The more different elements two collections have, the
> >> faster it is to compare them as sets. And as a consequence, the more
> >> equivalent elements two collections have, the faster it is to compare
> >> them as lists.
>
> >> Is this correct?
>
> > If two collections are equal, then comparing them as a set is always
> > slower than comparing them as a list.  Both have to call __eq__ for
> > every element, but sets have to search for each element while lists can
> > just iterate over consecutive pointers.
>
> > If the two collections have unequal sizes, then both ways immediately
> > return unequal.
>
> Perhaps I'm misinterpreting what you are saying, but I can't confirm that
> behaviour, at least not for subclasses of list:
>
> >>> class MyList(list):
>
> ...     def __len__(self):
> ...             return self.n
> ...>>> L1 = MyList(range(10))
> >>> L2 = MyList(range(10))
> >>> L1.n = 9
> >>> L2.n = 10
> >>> L1 == L2
> True
> >>> len(L1) == len(L2)
>
> False
>
> --
> Steven

I think what he is saying is that the list __eq__ method will look at
the list lengths first.  This may or may not be considered a subtle
bug for the edge case you are showing.

If I do the following:

>>> L1 = range(10000000)
>>> L2 = range(10000000)
>>> L3 = range(10000001)
>>> L1 == L2
True
>>> L1 == L3
False
>>>

I don't even need to run timeit -- the "True" takes awhile to print
out, while the "False" prints out immediately.

Regards,
Pat



More information about the Python-list mailing list