New-style classes, iter and PySequence_Check

Bengt Richter bokr at oz.net
Thu Apr 14 21:37:16 EDT 2005


On Thu, 14 Apr 2005 15:18:20 -0600, Steven Bethard <steven.bethard at gmail.com> wrote:

>Tuure Laurinolli wrote:
>> Someone pasted the original version of the following code snippet on 
>> #python today. I started investigating why the new-style class didn't 
>> work as expected, and found that at least some instances of new-style 
>> classes apparently don't return true for PyInstance_Check, which causes 
>> a problem in PySequence_Check, since it will only do an attribute lookup 
>>  for instances.
>> 
>> Things probably shouldn't be this way. Should I go to python-dev with this?
>> 
>> Demonstration snippet:
>
>For anyone who's curious, here's what the code actually does:
>
>py> args={'a':0}
>py> class Args(object):
>...     def __getattr__(self,attr):
>...         print "__getattr__:", attr
>...         return getattr(args,attr)
>...
>py> class ClassicArgs:
>...     def __getattr__(self, attr):
>...         print "__getattr__:", attr
>...         return getattr(args, attr)
>...
>py> c = ClassicArgs()
>py> i = c.__iter__()
>__getattr__: __iter__
>py> print i
><dictionary-keyiterator object at 0x0115D920>
>py> i = iter(c)
>__getattr__: __iter__
>py> print i
><dictionary-keyiterator object at 0x01163CA0>
>py> a = Args()
>py> i = a.__iter__()
>__getattr__: __iter__
>py> print i
><dictionary-keyiterator object at 0x01163D20>
>py> i = iter(a)
>Traceback (most recent call last):
>   File "<interactive input>", line 1, in ?
>   File "D:\Steve\My Programming\pythonmodules\sitecustomize.py", line 37, in iter
>     return orig(*args)
>TypeError: iteration over non-sequence
>
I think this is a known consequence of how several builtin functions, iter among them,
work with new-style classes. I'm not sure, but it looks like an expedient optimization,
practicality beating purity.
IOW, I think maybe iter(a) skips the instance logic altogether and goes right to
type(a).__iter__(a) instead of looking for something on the instance shadowing __iter__.
And I suspect iter(a) does its own internal mro chase for __iter__, bypassing even a
__getattribute__ in a metaclass of Args, which is how it appears if you try to monitor it that way. E.g.,

 >>> class Show(object):
 ...     class __metaclass__(type):
 ...         def __getattribute__(cls, attr):
 ...             print 'Show.__getattribute__:', attr
 ...             return type.__getattribute__(cls, attr)
 ...     def __getattribute__(self, attr):
 ...         print 'self.__getattribute__:', attr
 ...         return object.__getattribute__(self, attr)
 ...
 ...
 >>> show = Show()
 >>> Show.__module__
 Show.__getattribute__: __module__
 '__main__'
 >>> show.__module__
 self.__getattribute__: __module__
 '__main__'
 >>> show.__iter__
 self.__getattribute__: __iter__
 Traceback (most recent call last):
   File "<stdin>", line 1, in ?
   File "<stdin>", line 8, in __getattribute__
 AttributeError: 'Show' object has no attribute '__iter__'
 >>> Show.__iter__
 Show.__getattribute__: __iter__
 Traceback (most recent call last):
   File "<stdin>", line 1, in ?
   File "<stdin>", line 5, in __getattribute__
 AttributeError: type object 'Show' has no attribute '__iter__'

But no interception this way:
 >>> iter(show)
 Traceback (most recent call last):
   File "<stdin>", line 1, in ?
 TypeError: iteration over non-sequence


Naturally __getattr__ gets bypassed too, since no instance attribute lookup is even attempted.
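
The same effect can be seen without any __getattr__ at all, by hanging __iter__ directly
on an instance. This is only a minimal sketch: the class name Plain is made up for
illustration, and the code is written so that it also runs under Python 3, where every
class behaves like a new-style class.

```python
class Plain(object):
    pass

p = Plain()
# Attach __iter__ to the instance only; the class itself does not define it.
p.__iter__ = lambda: iter([1, 2, 3])

# Explicit attribute access goes through the instance and finds the lambda.
explicit = list(p.__iter__())

# iter() looks on type(p) instead, so the instance attribute is never consulted.
try:
    iter(p)
    implicit_ok = True
except TypeError:
    implicit_ok = False
```

Here explicit ends up as [1, 2, 3] while implicit_ok is False, which matches the
traceback shown for iter(show) above.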

If we actually supply an __iter__ method for iter(a) to find as type(a).__iter__, we can see:

 >>> class Args(object):
 ...     def __getattr__(self, attr):
 ...         print "__getattr__:", attr
 ...         return getattr(args, attr)
 ...     def __iter__(self):
 ...         print "__iter__"
 ...         return iter('Some kind of iterator')
 ...
 >>> a = Args()
 >>> a.__iter__
 <bound method Args.__iter__ of <__main__.Args object at 0x02EF8BEC>>
 >>> a.__iter__()
 __iter__
 <iterator object at 0x02EF888C>
 >>> iter(a)
 __iter__
 <iterator object at 0x02EF8CAC>
 >>>
 >>> type(a).__iter__
 <unbound method Args.__iter__>
 >>> type(a).__iter__(a)
 __iter__
 <iterator object at 0x02EF8CAC>

Now if we get rid of the __iter__ method, __getattr__ can come into play again:

 >>> del Args.__iter__
 >>> a.__iter__
 __getattr__: __iter__
 <method-wrapper object at 0x02EF8C0C>
 >>> a.__iter__()
 __getattr__: __iter__
 <dictionary-keyiterator object at 0x02EF8CE0>
 >>> type(a).__iter__
 Traceback (most recent call last):
   File "<stdin>", line 1, in ?
 AttributeError: type object 'Args' has no attribute '__iter__'
 >>> iter(a)
 Traceback (most recent call last):
   File "<stdin>", line 1, in ?
 TypeError: iteration over non-sequence
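
For the original snippet, one possible workaround is to keep __getattr__ for ordinary
attribute access and spell out the special methods on the class, where iter() can find
them. A sketch only, reusing the args dict from the quoted code and assuming that
delegating __iter__ to the dict is what was wanted:

```python
args = {'a': 0}

class Args(object):
    def __getattr__(self, attr):
        # Still handles explicit lookups such as a.get or a.keys.
        return getattr(args, attr)

    def __iter__(self):
        # Defined on the class, so iter(a) finds it as type(a).__iter__.
        return iter(args)

a = Args()
keys = list(iter(a))      # uses Args.__iter__
value = a.get('a')        # falls through to __getattr__, then dict.get
```

Any other special method that should be forwarded (__len__, __getitem__ and so on) would
presumably need the same explicit treatment.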

It's as if iter(a) treats type(a).__iter__ as a data descriptor delivering a bound
method, rather than as an ordinary method. A data descriptor always trumps instance
attribute lookup, so iter(a) can skip the instance entirely and optimize on that
assumption. I haven't walked the relevant iter() code.
That would be too easy ;-)
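
Short of reading the C source, the kind of lookup guessed at here can at least be
imitated in pure Python. The helper name lookup_special is hypothetical (nothing in the
standard library is called that), and this is a rough model of the behaviour seen in the
transcripts, not a description of what the interpreter actually does internally:

```python
def lookup_special(obj, name):
    """Find name on type(obj) only, never on the instance itself."""
    cls = type(obj)
    for klass in cls.__mro__:
        # Read each class __dict__ directly, so neither __getattr__ nor
        # __getattribute__ (on the class or a metaclass) is ever triggered.
        if name in klass.__dict__:
            attr = klass.__dict__[name]
            if hasattr(type(attr), '__get__'):
                # Bind via the descriptor protocol, as for a normal method.
                return attr.__get__(obj, cls)
            return attr
    raise TypeError('%r object has no special method %r'
                    % (cls.__name__, name))


class WithIter(object):
    def __iter__(self):
        return iter('ab')


class Without(object):
    def __getattr__(self, attr):
        # Would happily supply __iter__, but is never asked.
        return getattr({}, attr)


found = list(lookup_special(WithIter(), '__iter__')())

try:
    lookup_special(Without(), '__iter__')
    missed = False
except TypeError:
    missed = True
```

With these two classes, found is ['a', 'b'] and missed is True, mirroring the way iter(a)
succeeds only when __iter__ lives on the type.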

Regards,
Bengt Richter


