[Python-ideas] Changing `Sequence.__contains__`

Mark Lawrence breamoreboy at yahoo.co.uk
Mon Jul 21 04:40:49 CEST 2014


On 21/07/2014 03:18, Andrew Barnert wrote:
> On Sunday, July 20, 2014 7:06 PM, Mark Lawrence <breamoreboy at yahoo.co.uk> wrote:
>
>>> On 20/07/2014 23:06, Ram Rachum wrote:
>>>   Why does the default `Sequence.__contains__` iterate through the items
>>>   rather than use `.index`, which may sometimes be more efficient?
>>>
>>>   I suggest an implementation like this:
>>>
>>>        def __contains__(self, i):
>>>            try: self.index(i)
>>>            except ValueError: return False
>>>            else: return True
>>>   What do you think?
>>>
>>
>> I don't see how that can be more efficient than the naive
>>
>> def __contains__(self, i):
>>       for elem in self:
>>           if elem == i:
>>               return True
>>       return False
>>
>> What am I missing?
>
>
> Consider a blist.sortedlist (http://stutzbachenterprises.com/blist/sortedlist.html), or any other such data structure built on a tree, skip list, etc.
>
> The index method is O(log N), so Ram's __contains__ is also O(log N). But naively iterating is obviously O(N). (In fact, it could be worse—if you don't implement a custom __iter__, and your indexing is O(log N), then the naive __contains__ will be O(N log N)…)
>
> Needless to say, blist.sortedlist implements a custom O(log N) __contains__, and so does (hopefully) every other such library on PyPI. But Ram's proposal would mean they no longer have to do so; they'll get O(log N) __contains__ for free just by implementing index.
>
> Of course that only removes one method. For example, they still have to implement a custom count method or they'll get O(N) performance from the default version. If you look at the code for any of these types, __contains__ is a tiny percentage of the implementation. So, it's not a huge win. But it's a small one.
>

What has blist.sortedlist, which IIRC is one of the data structures that 
has been rejected as forming part of the standard library, got to do 
with the default sequence.__contains__ ?

-- 
My fellow Pythonistas, ask not what our language can do for you, ask 
what you can do for our language.

Mark Lawrence

---
This email is free from viruses and malware because avast! Antivirus protection is active.
http://www.avast.com




More information about the Python-ideas mailing list