Thoughts on Guido's ITC audio interview

Sun Jun 26 08:11:37 EDT 2005

On Sat, 25 Jun 2005 23:49:40 -0600, John Roth wrote:

>>> What's being ignored is that type information is useful for other
>>> things than compile-time checking. The major case in point is the way
>>> IDEs such as IntelliJ and Eclipse use type information to do
>>> refactoring, code completion and eventually numerous other things. A
>>> Java programmer using IntelliJ or Eclipse can eliminate the advantage
>>> that Python used to have, and possibly even pull ahead.
>>
>> I haven't used IntelliJ or Eclipse, so I guess I'll have to take your
>> word for how wonderful they are.
> 
> You might want to look at something outside of Python. The world
> changes, and if you keep your eyes closed, you might not notice it
> changing until it rolls over you.

If and when I have personal experience with either of these IDEs, I'll
be sure to let you know if I disagree with you. But until then, since I
have no reason to doubt what you say, and life is too short to check
everything in the world, I will accept your opinion on IntelliJ and
Eclipse. 

>> [snip]
>>> I'll throw out one very simple example in the string library. The
>>> index() and find() methods, and their variants, suffer from a bad case
>>> of non-obviousness inherited from the C library.
>>
>> It must be a very bad case of non-obviousness indeed, because it isn't
>> obvious to me at all what particular bit of non-obviousness it is that
>> you are referring to.
> 
> The result of find() cannot be used cleanly in an if statement; you need
> to compare it to -1. This is not obvious to a novice, and is a fertile
> source of mistakes. It's an irregularity that has to be checked for.

Oh, that? It was obvious to me from the moment I understood that strings
were indexed from 0, not 1. How else could it work?

find() doesn't return a truth-value, it returns an index. You can't
sensibly use that index where Python expects a truth-value, but then you
also can't use it where Python expects a dict or a tuple or a function. I
realised that in about 5 seconds, was disappointed that Python used
0-based indexing rather than 1-based, and got over it. I never once even
tried passing the index returned by find() to if.
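
To make the point concrete, here is a minimal sketch of why the index from
find() is no use as a truth-value. It is written in Python 3 syntax, and the
sample string and variable names are made up for illustration:

```python
s = "spam and eggs"

at_start = s.find("spam")   # match at index 0
missing = s.find("ham")     # no match, so the sentinel -1

# A bare truth test gets both edge cases backwards:
# index 0 is falsy, and -1 is truthy.
naive_start = bool(at_start)     # False, although "spam" was found
naive_missing = bool(missing)    # True, although "ham" was not found

# Comparing against -1 gives the intended answers.
found_start = at_start != -1     # True
found_missing = missing != -1    # False
```

The comparison with -1 is the only thing the caller has to remember; the
index itself behaves like any other int.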

Yes, a lot of novices make that mistake. Shame on them. I've made plenty
of stupid mistakes in my time, and no doubt I will continue to do so. When
I make a stupid mistake, I am ashamed at MY stupid mistake, and don't
blame the language for my failures.

(I have written p = S.find(substr); if p >= -1: in error.)

[snip]
>> Am I the only one who has reservations about having to build a list of
>> all the matches before doing anything with them?
> 
> You don't have to build a list of all the matches. First, that's what
> the maxfind parameter is all about, second, as I said below, it could be
> implemented as an iterator. I'd expect that to happen if it was going to
> be used in a for statement.

Firstly, I've already noted that sometimes you can't predict beforehand
how many matches you need, so maxfind doesn't always help.

And as for it being an iterator, I quote your specification for what
findall should do:

"findall returns a list (possibly a tuple) of the beginning index of each
substring which matches sub. If there are no matches, an empty list is
returned."

Whether it is implemented as an iterator or not is irrelevant: it still
has to package up all those yields and stick them in a list before
returning that list.

I believe there is a case for turning find() into an iterator. There might
even be a good case to make for a function findall. But I think it is a
terrible idea to replace find() with a function which "returns a list of
the beginning index of each substring which matches".

You are welcome to change the specifications of findall() and turn it into
an iterator which returns each match one at a time instead of all at once,
but then the name is misleading, wouldn't you agree?
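
For what it's worth, an iterator version might look something like the
sketch below. The name iter_find and its signature are my own invention, not
anything in the string library or in the proposal, and yielding overlapping
matches is an assumption on my part. It is written as a generator in
Python 3 syntax:

```python
def iter_find(s, sub, start=0):
    """Yield the index of each occurrence of sub in s, one at a time.

    Hypothetical helper: nothing is collected into a list, so the caller
    can stop as soon as it has seen enough matches.
    """
    while True:
        i = s.find(sub, start)
        if i == -1:
            return          # no more matches: the generator simply ends
        yield i
        start = i + 1       # assumption: overlapping matches are wanted


# Stop at the first match past index 4, without finding the rest.
first_late = None
for pos in iter_find("abracadabra", "a"):
    if pos > 4:
        first_late = pos
        break
```

Note that the caller never has to say in advance how many matches it wants,
which is the case maxfind can't cover.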

>>> This version works intuitively with if statements (an empty list is
>>> false), goes directly into for statements, can be implemented as an
>>> iterator, and is only less efficient by a constant (creation of the
>>> list) if the maxfind parameter is used. It never throws an exception
>>> (unless the parameter list has the wrong types). It's an example of a
>>> method that can be used cleanly with a type inferencer, while the
>>> index and find methods cannot.
>>
>> Er, how does that last one work? How can findall be used cleanly? Why
>> can't find?
> 
> findall() has a definite type: a list of integers, specifically a list
> of integers that are legitimate indexes into a specific string.

Not according to your specification. It can also return an empty list.
There are no legitimate indexes in an empty list. If your IDE or compiler
or other tool can deal with that special case, why can't it deal with the
special case of find returning -1?

In any case, there is no such type as "legitimate indexes", and there
can't be for arbitrary strings. In general, you don't know the length of
either the substring or the main string until runtime. Since you don't
know both lengths, you can't tell which ints are legitimate indexes and
which are not.

For arbitrary strings, length unknown until runtime, the best you can do
is allow any index > -1. And if you are going to do that, then it is not
really such a big deal to allow any index >= -1 instead.

Yes, an index of -1 has a different meaning than 0 or 1 or 2, but in the
findall case, a return result of [] has a different meaning than a return
result of [0, 1, 2]. You are just exchanging one special case for another
special case.
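
A rough sketch of what I mean, in Python 3 syntax. The findall below is only
my guess at an eager implementation of the quoted specification; it is not
an existing string method, and the sample text is made up:

```python
def findall(s, sub):
    """Hypothetical eager findall: a list of the start index of each match."""
    result = []
    i = s.find(sub)
    while i != -1:
        result.append(i)
        i = s.find(sub, i + 1)
    return result


text = "spam and eggs"

# With find(), the caller has to test for the sentinel value -1.
p = text.find("ham")
tail_via_find = text[p:] if p != -1 else None

# With findall(), the caller has to test for the empty list instead.
hits = findall(text, "ham")
tail_via_findall = text[hits[0]:] if hits else None
```

Either way there is one extra test before the result can be used as an
index. The test has moved, but it hasn't gone away.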

> The result of find() does not have this property: it can be an integer
> that is an index into the string, or it can be a -1.

Both of which are ints. What you are talking about isn't really TYPE
checking, but more like VALUE checking.
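
A quick check, in Python 3 syntax with a made-up sample string, shows that
the two results differ in value only:

```python
s = "spam and eggs"

hit = s.find("and")    # an index into s
miss = s.find("ham")   # the "not found" sentinel

# Both results are plain ints; nothing in the type tells them apart.
same_type = type(hit) is type(miss) is int
```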

> Index can either return an index into the string, or
> it can throw an exception. Both of these are complex result types that
> hinder further type inference.

Why? index doesn't return an exception, it raises it. The return value
is always a non-negative int. Any function can in principle raise an
exception. If "can raise an exception" is enough to confuse the
type-checking tools, then you have some serious trouble.
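
Here is a small sketch of the difference between returning and raising, in
Python 3 syntax with a made-up sample string:

```python
s = "spam and eggs"

# On success, index() returns a non-negative int, the same as find().
pos = s.index("and")

# On failure nothing is returned at all: control leaves via the exception.
try:
    bad = s.index("ham")
except ValueError:
    bad = None
    raised = True
else:
    raised = False
```

The assignment to bad inside the try never completes when the substring is
missing, so no "exception value" ever reaches the caller's variable.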

-- 
Steven.