[Python-Dev] type categories

Guido van Rossum guido@python.org
Tue, 13 Aug 2002 17:15:58 -0400


> While I was driving to work today, I had a thought about the
> iterator/iterable discussion of a few weeks ago.  My impression is
> that that discussion was inconclusive, but a few general principles
> emerged from it:
> 
> 	1) Some types are iterators -- that is, they support calls
> 	   to next() and raise StopIteration when they have no more
> 	   information to give.
> 
> 	2) Some types are iterables -- that is, they support calls
> 	   to __iter__() that yield an iterator as the result.
> 
> 	3) Every iterator is also an iterable, because iterators are
> 	   required to implement __iter__() as well as next().
> 
> 	4) The way to determine whether an object is an iterator
> 	   is to call its next() method and see what happens.
> 
> 	5) The way to determine whether an object is an iterable
> 	   is to call its __iter__() method and see what happens.
> 
> I'm uneasy about (4) because if an object is an iterator, calling its
> next() method is destructive.  The implication is that you had better
> not use this method to test if an object is an iterator until you are
> ready to take irrevocable action based on that test.  On the other
> hand, calling __iter__() is safe, which means that you can test
> nondestructively whether an object is an iterable, which includes
> all iterators.

Alex Martelli introduced the "Look Before You Leap" (LBYL) syndrome
for your uneasiness with (4) (and (5), I might add -- I don't know
that __iter__ is always safe).  He contrasts it with a different
attitude, which might be summarized as "It's easier to ask forgiveness
than permission."  In many cases, there is no reason for LBYL
syndrome, and it can actually cause subtle bugs.  For example, a LBYL
programmer could write

  if not os.path.exists(fn):
    print "File doesn't exist:", fn
    return
  fp = open(fn)
  ...use fp...

A "forgiveness" programmer would write this as follows instead:

  try:
    fp = open(fn)
  except IOError, msg:
    print "Can't open", fn, ":", msg
    return
  ...use fp...

The latter is safer; there are many reasons why the open() call in the
first version could fail despite the exists() test succeeding,
including insufficient permissions, lack of operating resources, a
hard file lock, or another process that removed the file in the mean
time.

While it's not an absolute rule, I tend to dislike interface/protocol
checking as an example of LBYL syndrome.  I prefer to write this:

  def f(x):
    print x[0]

rather than this:

  def f(x):
    if not hasattr(x, "__getitem__"):
      raise TypeError, "%r doesn't support __getitem__" % x
    print x[0]

Admittedly this is an extreme example that looks rather silly, but
similar type checks are common in Python code written by people coming
from languages with stronger typing (and a bit of paranoia).

The exception is when you need to do something different based on the
type of an object and you can't add a method for what you want to do.
But that is relatively rare.

> Here is what I realized this morning.  It may be obvious to you,
> but it wasn't to me (until after I realized it, of course):
> 
>      ``iterator'' and ``iterable'' are just two of many type
>      categories that exist in Python.
> 
> Some other categories:
> 
>      callable
>      sequence
>      generator
>      class
>      instance
>      type
>      number
>      integer
>      floating-point number
>      complex number
>      mutable
>      tuple
>      mapping
>      method
>      built-in

You missed the two that are most commonly needed in practice: string
and file. :-)  I believe that the notion of an informal or "lore" (as
Jim Fulton likes to call it) protocol first became apparent when we
started to use the idea of a "file-like object" as a valid value for
sys.stdout.

> As far as I know, there is no uniform method of determining into which
> category or categories a particular object falls.  Of course, there
> are non-uniform ways of doing so, but in general, those ways are, um,
> nonuniform.  Therefore, if you want to check whether an object is in
> one of these categories, you haven't necessarily learned much about
> how to check if it is in a different one of these categories.
> 
> So what I wonder is this:  Has there been much thought about making
> these type categories more explicitly part of the type system?

I think this has been answered by other respondents.

Interestingly enough, Jim Fulton asked me to critique the Interface
package as it exists in Zope 3, from the perspective of adding
(something like) it to Python 2.3.

This is a descendant of the "scarecrow" proposal,
http://www.foretec.com/python/workshops/1998-11/dd-fulton.html (see
also http://www.zope.org/Members/jim/PythonInterfaces/Summary).

The Zope3 implementation can be viewed here:
http://cvs.zope.org/Zope3/lib/python/Interface/

--Guido van Rossum (home page: http://www.python.org/~guido/)