Thoughts on Guido's ITC audio interview

Sun Jun 26 01:30:59 EDT 2005

On Sat, 25 Jun 2005 19:31:20 -0600, John Roth wrote:

> 
> "Dave Benjamin" <ramen at lackingtalent.com> wrote in message 
> news:slrndbrse5.71d.ramen at lackingtalent.com...
>> Guido gave a good, long interview, available at IT Conversations, as was
>> recently announced by Dr. Dobb's Python-URL! The audio clips are available
> 
> [snip]
> 
>>  - Java: the usual static vs. dynamic, static analysis vs. unit testing
>>    arguments. Guido says that there's such a small amount of problems that
>>    can be caught at compile time, and even the smallest amount of unit
>>    testing would also catch these problems. "The blindness of that
>>    [static] position... escapes me."
> 
> Three years ago, this was a viable argument. Two years ago, it was
> beginning to show difficulties. Today anyone who makes it can be
> accused of having their head in the sand [1].
> 
> What's being ignored is that type information is useful for other things
> than compile-time checking. The major case in point is the way IDEs
> such as IntelliJ and Eclipse use type information to do refactoring, code
> completion and eventually numerous other things. A Java programmer
> using IntelliJ or Eclipse can eliminate the advantage that Python
> used to have, and possibly even pull ahead.

I haven't used IntelliJ or Eclipse, so I guess I'll have to take your word
for how wonderful they are.

[snip]
> I'll throw out one very simple example in the string library. The
> index() and find() methods, and their variants, suffer from a bad case
> of non-obviousness inherited from the C library. 

It must be a very bad case of non-obviousness indeed, because it isn't
obvious to me at all what particular bit of non-obviousness it is that
you are referring to.

> A very simple
> replacement would fix this.
> 
> --------------------------------
> 
> findall([maxfind,] sub, [start, [end]])
> 
> findall returns a list (possibly a tuple) of the beginning index of each
> substring which matches sub. If there are no matches, an empty list is
> returned. At most maxfind indexes are returned. start and end have the
> same meaning as in the existing find and index methods.

Am I the only one who has reservations about having to build a list of all
the matches before doing anything with them?

s = "a" * 10000000  # 10 MB of data to search
L = s.findall("a")  # lots of matches means L is very large

or imagine a more complex case where I have to programmatically change my
search string as I go. I might not know how many matches I need until I
have found them and can analyse the text around the match.

Seems to me that rather than having to find all matches regardless of what
you actually want, it would be better to turn find into a generator. Then
findall becomes list(find) and you still have the flexibility of finding
matches one at a time.
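
To make that concrete, here is a rough sketch of what I have in mind. The name iter_find is made up for illustration (no such string method exists), and I'm assuming non-overlapping matches, which the findall proposal doesn't actually specify:

```python
def iter_find(s, sub, start=0, end=None):
    """Lazily yield the index of each non-overlapping match of sub in s[start:end].

    Hypothetical helper built on the existing str.find; not a real str method.
    """
    if end is None:
        end = len(s)
    # Step past the whole match; step by 1 for an empty sub so we always advance.
    step = max(len(sub), 1)
    pos = s.find(sub, start, end)
    while pos != -1:
        yield pos
        pos = s.find(sub, pos + step, end)


def findall(s, sub, start=0, end=None):
    """The list-building version is just the generator, consumed eagerly."""
    return list(iter_find(s, sub, start, end))
```

With this, "for i in iter_find(s, sub)" lets the caller stop as soon as it has seen enough, and no list is ever built unless somebody explicitly asks for one.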

> This version works intuitively with if statements (an empty list is
> false), goes directly into for statements, can be implemented as an
> iterator, and is only less efficient by a constant (creation of the
> list) if the maxfind parameter is used. It never throws an exception
> (unless the parameter list has the wrong types). It's an example of a
> method that can be used cleanly with a type inferencer, while the index
> and find methods cannot.

Er, how does that last one work? How can findall be used cleanly? Why
can't find?

-- 
Steven.