Thoughts on Guido's ITC audio interview

Sun Jun 26 01:49:40 EDT 2005

"Steven D'Aprano" <steve at REMOVETHIScyber.com.au> wrote in message 
news:pan.2005.06.26.05.30.58.959708 at REMOVETHIScyber.com.au...
> On Sat, 25 Jun 2005 19:31:20 -0600, John Roth wrote:
>
>>
>> "Dave Benjamin" <ramen at lackingtalent.com> wrote in message
>> news:slrndbrse5.71d.ramen at lackingtalent.com...
>>> Guido gave a good, long interview, available at IT Conversations, as was
>>> recently announced by Dr. Dobb's Python-URL! The audio clips are available
>>
>> [snip]
>>
>>>  - Java: the usual static vs. dynamic, static analysis vs. unit testing
>>>    arguments. Guido says that there's such a small amount of problems that
>>>    can be caught at compile time, and even the smallest amount of unit
>>>    testing would also catch these problems. "The blindness of that [static]
>>>    position... escapes me."
>>
>> Three years ago, this was a viable argument. Two years ago, it was
>> beginning to show difficulties. Today anyone who makes it can be
>> accused of having their head in the sand [1].
>>
>> What's being ignored is that type information is useful for other things
>> than compile-time checking. The major case in point is the way IDEs
>> such as IntelliJ and Eclipse use type information to do refactoring, code
>> completion and eventually numerous other things. A Java programmer
>> using IntelliJ or Eclipse can eliminate the advantage that Python
>> used to have, and possibly even pull ahead.
>
> I haven't used IntelliJ or Eclipse, so I guess I'll have to take your word
> for how wonderful they are.

You might want to look at something outside of Python. The world
changes, and if you keep your eyes closed, you might not notice
it changing until it rolls over you.

> [snip]
>> I'll throw out one very simple example in the string library. The
>> index() and find() methods, and their variants, suffer from a bad case
>> of non-obviousness inherited from the C library.
>
> It must be a very bad case of non-obviousness indeed, because it isn't
> obvious to me at all what particular bit of non-obviousness it is that
> you are referring to.

The result of find() cannot be used cleanly in an if statement; you
need to compare it to -1. This is not obvious to a novice, and is
a fertile source of mistakes. It's an irregularity that has to be checked
for.
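
As a minimal sketch of the pitfall (the two helper names are made up
for illustration, they are not part of the string library):

```python
def naive_contains(s, sub):
    # Looks natural, but tests the truthiness of an index:
    # a match at position 0 is false, and a miss (-1) is true.
    if s.find(sub):
        return True
    return False


def correct_contains(s, sub):
    # The explicit comparison against -1 that find() requires.
    return s.find(sub) != -1


text = "hello world"
print(naive_contains(text, "hello"), correct_contains(text, "hello"))
print(naive_contains(text, "xyz"), correct_contains(text, "xyz"))
```

The naive version gets both cases backwards: it reports a miss for a
match at the start of the string and a hit for a substring that is not
there at all.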

>
>> A very simple
>> replacement would fix this.
>>
>> --------------------------------
>>
>> findall([maxfind,] sub, [start, [end]])
>>
>> findall returns a list (possibly a tuple) of the beginning index of each
>> substring which matches sub. If there are no matches, an empty list is
>> returned. At most maxfind indexes are returned. start and end have the
>> same meaning as in the existing find and index methods.
>
> Am I the only one who has reservations about having to build a list of all
> the matches before doing anything with them?

You don't have to build a list of all the matches. First, that's what the
maxfind parameter is for; second, as I said below, it could be
implemented as an iterator. I'd expect that to happen if it were going
to be used in a for statement.
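
A rough sketch of what I have in mind, written as free functions rather
than string methods. The names iter_findall and findall, the keyword
defaults, and the choice to report non-overlapping matches are all
assumptions for illustration; the proposal itself doesn't pin them down.

```python
def iter_findall(s, sub, maxfind=None, start=0, end=None):
    """Lazily yield the start index of each match of sub in s[start:end]."""
    if end is None:
        end = len(s)
    # Assumption: matches do not overlap. Step at least 1 so that an
    # empty sub cannot loop forever.
    step = max(len(sub), 1)
    found = 0
    pos = start
    while maxfind is None or found < maxfind:
        i = s.find(sub, pos, end)
        if i == -1:
            return
        yield i
        found += 1
        pos = i + step


def findall(s, sub, maxfind=None, start=0, end=None):
    """Eager variant: the same matches collected into a list."""
    return list(iter_findall(s, sub, maxfind, start, end))


if findall("abcabcabc", "abc"):      # an empty list would be false
    for i in iter_findall("abcabcabc", "abc", maxfind=2):
        print(i)
```

With the iterator form nothing is searched until the caller asks for the
next match, so the 10 MB example only does as much work as the loop
actually consumes.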

>
> s = "a" * 10000000  # 10 MB of data to search
> L = s.findall("a")  # lots of matches means L is very large
>
> or imagine a more complex case where I have to programmatically change my
> search string as I go. I might not know how many matches I need until I
> have found them and can analyse the text around the match.
>
> Seems to me that rather than having to find all matches regardless of what
> you actually want, it would be better to turn find into a generator. Then
> findall becomes list(find) and you still have the flexibility of finding
> matches one at a time.

See below. I covered it.

>> This version works intuitively with if statements (an empty list is
>> false), goes directly into for statements, can be implemented as an
>> iterator, and is only less efficient by a constant (creation of the
>> list) if the maxfind parameter is used. It never throws an exception
>> (unless the parameter list has the wrong types). It's an example of a
>> method that can be used cleanly with a type inferencer, while the index
>> and find methods cannot.
>
> Er, how does that last one work? How can findall be used cleanly? Why
> can't find?

findall() has a definite type: a list of integers, specifically a list of
integers that are legitimate indexes into a specific string. The result of
find() does not have this property: it can be an integer that is an index
into the string, or it can be a -1. Index can either return an index into
the string, or it can throw an exception. Both of these are complex result
types that hinder further type inference.
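
To make the two-shaped results concrete, here is a small sketch using
only the existing methods (the helper names are hypothetical):

```python
def char_at_find(s, sub):
    # Forgets the -1 check. Because -1 is itself a valid subscript,
    # a miss silently returns the last character of s.
    return s[s.find(sub)]


def index_raises(s, sub):
    # index() either returns a position or raises ValueError, so every
    # caller has to be prepared for both outcomes.
    try:
        s.index(sub)
    except ValueError:
        return True
    return False


print(char_at_find("hello", "l"))   # a real match
print(char_at_find("hello", "z"))   # a miss, but no error is raised
print(index_raises("hello", "z"))
```

In both cases the caller gets "an index, or else something different",
and nothing in the result itself says which one it is.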

John Roth

>
>
> -- 
> Steven.
>