Performance of list vs. set equality operations

Thu Apr 8 19:35:15 EDT 2010

On Thu, 08 Apr 2010 04:07:53 -0300, Steven D'Aprano  
<steven at remove.this.cybersource.com.au> wrote:
> On Wed, 07 Apr 2010 20:14:23 -0700, Raymond Hettinger wrote:
>> [Raymond Hettinger]

>>> > If the two collections have unequal sizes, then both ways immediately
>>> > return unequal.
>>
>> [Steven D'Aprano]
>>> Perhaps I'm misinterpreting what you are saying, but I can't confirm
>>> that behaviour, at least not for subclasses of list:
>>
>> For doubters, see list_richcompare() in
>> http://svn.python.org/view/python/trunk/Objects/listobject.c?revision=78522&view=markup
>
> So what happens in my example with a subclass that (falsely) reports a
> different length even when the lists are the same?
>
> I can guess that perhaps Py_SIZE does not call the subclass __len__
> method, and therefore is not fooled by it lying. Is that the case?

Yes. Py_SIZE is a generic macro; it just returns the ob_size field from the  
object structure. No method is called at all, so an overridden __len__ is  
never consulted.
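
A quick way to see this from Python is sketched below. It assumes CPython, where list equality goes through list_richcompare(); the LyingList name is made up for illustration, and other implementations may behave differently.

```python
class LyingList(list):
    """Hypothetical list subclass whose __len__ misreports the size."""

    def __len__(self):
        # Always claim to be empty, whatever is actually stored.
        return 0


a = LyingList([1, 2, 3])
b = [1, 2, 3]

# len() goes through the overridden method, so it sees the lie.
reported_length = len(a)

# On CPython the comparison reads the stored size at the C level and
# never calls LyingList.__len__, so the lists still compare equal.
lists_equal = (a == b)

print(reported_length, lists_equal)
```

On CPython the comparison should not be fooled by the overridden __len__, even though len(a) reports zero.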

Another example: when the print statement determines that sys.stdout is a  
`file` instance, it bypasses the sys.stdout.write() method and calls  
fwrite() directly at the C level. This happens even for a subclass of  
file, so overriding write() in Python code has no effect.

The CPython source contains lots of shortcuts like that. Perhaps the  
checks should be stricter in some cases, but I imagine it's not so easy to  
fix: lots of code was written in the pre-2.2 era, assuming that internal  
types were not subclassable.

-- 
Gabriel Genellina