One Python 2.1 idea

Sun Dec 24 23:28:04 EST 2000

In article <925tak01tpu at news1.newsguy.com>,
  "Alex Martelli" <aleaxit at yahoo.com> wrote:
> I don't think a 10% difference matters substantially, either
> way!  _This_ may be the crux of our disagreement.  Sure,
> if the infrastructure can be enhanced to get a few percents
> here and there, why not.  But that's not going to change my
> programming style, nor make much of a difference to the
> effectiveness of the applications I write.

I agree. And if we were discussing an across-the-board
10% difference between Python and language X, or any
one change that had ill repercussions, the discussion
would end there.

But we're not.

Performance improvement very often is a matter of 10%
here, 10% there, and 10% in three other constructs, and
then the two sections of your code that were consuming
95% of execution time, each of which exhibits these five
problems, run twice as fast. And two times IS a
significant improvement. It's not any ONE 10% that
matters. It is their cumulative effect. Cumulative both
over time, as these improvements are introduced, and
cumulative over large programs where such improvements
often are synergistic.
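
As a rough worked example of how such gains compound, here is a
minimal sketch. It assumes each improvement trims 10% off whatever
time remains in the hot code, and that the hot code is 95% of the run.
The function name and the model itself are illustrative assumptions,
not measurements.

```python
def compound_speedup(per_step_saving, steps, hot_fraction):
    """Overall speedup when `steps` independent savings each trim
    `per_step_saving` off the remaining time spent in the hot code."""
    hot_time = hot_fraction * (1.0 - per_step_saving) ** steps
    cold_time = 1.0 - hot_fraction  # untouched by the improvements
    return 1.0 / (hot_time + cold_time)


# Show how the overall factor grows as more 10% savings stack up.
for steps in (1, 5, 7):
    print(steps, round(compound_speedup(0.10, steps, 0.95), 2))
```

Under these assumptions no single step is noticeable on its own, yet
five of them together give a factor well above 1.5, and a couple more
bring it close to 2.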

Turpin:
>> One BIG benefit to abstractions like JOIN and list
>> comprehension is that performance improves without the
>> application programmers doing anything. The next release
>> of the programming environment makes everything run

Martelli:
> But why shouldn't said 'next release' be able to optimize
> the 'lower-abstraction' approach, too?  As long as the
> language allows and encourages the coder to express
> the code's intent exactly and precisely, optimizations are
> feasible anyway.  In other words, I do NOT consider
> potential optimizations a 'BIG' benefit of higher-level
> abstractions over lower-level ones.

The benefit of abstraction is that it shifts work from
the programmer to the language implementation. The programmer
says LESS about how to produce a result, but still precisely
specifies what result is needed. Because the implementation
is less constrained, it potentially can choose a faster
execution path. Again, consider the JOIN example. The SQL
compiler has a broad range of choices of how to implement
JOIN and which keys to use. It can make use of significant
data to inform this choice (sometimes including the sizes of
tables and distribution of key data in the target database).
By using this abstract construct, the application programmer
not only saves the trouble of specifying a particular
implementation route, but likely gets a more efficient route
than he had the knowledge to specify. By going to more detail,
the application programmer can defeat the smart SQL compiler.
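
To make the JOIN point concrete, here is a small sketch using the
sqlite3 module from the standard library. The tables, columns and
sample rows are made up for illustration. The first query states only
what is wanted; the second version fixes a nested-loop strategy by
hand, which leaves the engine nothing to choose.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE authors (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE posts (id INTEGER PRIMARY KEY, author_id INTEGER, title TEXT);
    INSERT INTO authors VALUES (1, 'Ann'), (2, 'Bob');
    INSERT INTO posts VALUES (10, 1, 'abstraction'), (11, 2, 'buffers'),
                             (12, 1, 'sequences');
""")

# Declarative: say WHAT is wanted; the engine picks the join strategy.
declarative = conn.execute(
    "SELECT a.name, p.title FROM authors a "
    "JOIN posts p ON p.author_id = a.id ORDER BY p.id"
).fetchall()

# Hand-coded: the programmer commits to a nested loop in advance.
authors = conn.execute("SELECT id, name FROM authors").fetchall()
posts = conn.execute(
    "SELECT id, author_id, title FROM posts ORDER BY id"
).fetchall()
hand_coded = [
    (name, title)
    for (_post_id, author_id, title) in posts
    for (a_id, name) in authors
    if a_id == author_id
]

print(declarative == hand_coded)
```

Both routes produce the same rows. Only the first leaves the
implementation free to pick a different plan as the tables grow or
gain indexes.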

Of course, this all depends. Maybe the compiler is dumb,
and the application programmer is pushed to using specific
techniques for the sake of performance. Maybe the compiler
is very smart, and even if the programmer hand codes a JOIN,
it sees that the end result is just a JOIN, that the
intermediate results are not used, and it does the fast
thing anyway. Languages have developed differently in how
they mix responsibilities between application programmer
and language implementation.

I am pushing two principles in this regard. (1) Languages
should provide a powerful set of abstractions as simply
as possible. (2) Application programmers and language
implementors should adhere to a separation of
responsibilities. Application programmers should use the
abstractions that best solve their problem, functionally,
assuming that these are implemented well. Language
implementors should strive to make sure that the
implementation for an abstraction performs well in all
the ways it is used. This route provides significant gains
in productivity. But it only works if BOTH groups take up
their mantle.

> When I compare working in Python with working in
> Haskell, say, or perl, or, I _think_, ML dialects (not
> enough practical real-world experience to be sure!),
> it does not seem to me that I'm working at a higher
> level of abstraction; the abstraction-level seems quite
> comparable.  (Actually, this goes for C++, as well,
> most of the time -- but, admittedly, lower-level issues
> _do_ keep interfering often enough in that case).

Perl provides abstractions of similar power, but has
greater language complexity. In some sense, the practical
power of a language is its level of abstraction divided
by its innate complexity. That denominator puts Perl
much lower on the totem pole than Python.

C++ does NOT provide the same level of abstraction as
Python. It does not have sequences, dictionaries, or
memory management as built-in parts of the language.
Raw pointers are exposed. Classes are not first class
objects. Instances are statically composed. There is
no run-time compilation of code. All this is friction
and detail I am happy to leave behind. I do not have
enough familiarity with the other two to speak. To me,
one of the strong attractions of Python is the notion
of power I just described: it has a very powerful and
practical set of abstractions, while remaining simple.

> But if Python had such a 'listbuffer' object, and
> used it internally for list-comprehensions, I'd
> _also_ like to see it exposed for direct programmer
> use... why not? .. Say I want to build up and return
> a list of Fibonacci numbers, however many are needed
> to get up to the first one that is larger than an
> argument N.  The simplest approach:
>
> def listFib(N):
>     if N<1: return [1]
>     result = [1, 1]
>     next = 1
>     while next <= N:
>         next = result[-1]+result[-2]
>         result.append(next)
>     return result
>
> .. if listbuffer objects existed, I might get some
> substantial performance gain through them, without
> (IMHO) affecting readability much, if at all:
>
> def listFib(N):
>     if N<1: return [1]
>     result = listBuffer.listBuffer(1, 1)
>     next = 1
>     while next <= N:
>         next = result[-1]+result[-2]
>         result.append(next)
>     return list(result)

I think this approach is misguided. A listBuffer, as
far as I can tell, is functionally identical to a
sequence. We've cluttered the language with another
abstraction, whose only purpose is to help the compiler
decide an implementation. Now, everyone learning Python
has to learn about listBuffers, in order to read your
code. This same mistake has been made already with
xrange(). Given enough time, we'll have to teach
programmers about a dozen different things that all
behave like Python sequences. Blech!

In my mind, the ideal is very different. The Python
implementation should have a dozen or more ways of
implementing sequences, depending on how they are
constructed, what kinds of objects are put into them,
the contexts in which they are indexed, whether they
are used for iteration, etc. The vast majority of the
time, these things can be determined at compile time.
There shouldn't be a different language abstraction
for sequences that are not stored. xrange() is an
abomination. The compiler should simply use this
implementation when it fits.
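
As a sketch of what "one abstraction, several implementations" could
look like, the class below is an unstored sequence that still answers
to the ordinary sequence protocol. The name LazyRange and the helper
total() are hypothetical. In the ideal described above the
implementation would pick such a representation itself rather than
the programmer naming it.

```python
from collections.abc import Sequence


class LazyRange(Sequence):
    """Hypothetical unstored sequence 0..stop-1; items are computed on demand."""

    def __init__(self, stop):
        self._stop = max(0, stop)

    def __len__(self):
        return self._stop

    def __getitem__(self, index):
        if isinstance(index, slice):
            # Materialize only the requested slice.
            return [self[i] for i in range(*index.indices(self._stop))]
        if index < 0:
            index += self._stop
        if not 0 <= index < self._stop:
            raise IndexError("LazyRange index out of range")
        return index


def total(seq):
    # Client code written once, against the sequence protocol only.
    return sum(item for item in seq)


print(total(LazyRange(5)) == total([0, 1, 2, 3, 4]))
```

The caller of total() cannot tell, and need not care, whether the
items were ever stored.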

ONLY ONE SEQUENCE ABSTRACTION SHOULD BE EXPOSED TO THE
APPLICATION PROGRAMMER. It would be a mistake -- a BIG
mistake -- to continue the process of cluttering up
the language with new abstractions, merely to give
implementation directives to the compiler. I vote we
reverse this process, by deprecating xrange(). IF we
want implementation directives, they should be
syntactically distinct annotations, appended to the
relevant code:

    stopWords = {}     #PYDIR: nodeletes

At least that way, programmers reading and debugging
the code can (in theory) ignore the annotation. (The
compiler can also ignore the directive, if it is
wrongheaded.) In my opinion, you get far more bang for the
buck if the compiler makes these optimizations
automatically.
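
Because such a directive is just a comment, ordinary tools and readers
skip it, while an interested tool can still pick it out. A minimal
sketch using the standard tokenize module follows. The #PYDIR: marker
comes from the example above; the find_directives() helper is
hypothetical, and no existing interpreter is claimed to honor such
directives.

```python
import io
import tokenize

DIRECTIVE_PREFIX = "#PYDIR:"  # marker taken from the example above


def find_directives(source):
    """Return (line_number, directive) pairs for every #PYDIR: comment."""
    found = []
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        if tok.type == tokenize.COMMENT and tok.string.startswith(DIRECTIVE_PREFIX):
            text = tok.string[len(DIRECTIVE_PREFIX):].strip()
            found.append((tok.start[0], text))
    return found


source = "stopWords = {}     #PYDIR: nodeletes\n"
print(find_directives(source))
```

The same source still runs unchanged on an interpreter that knows
nothing about the directive, which is the point of keeping it
syntactically distinct.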

(Apologetic epilogue: This mind dump is occurring after
Christmas Eve celebrations. I hope everyone reads it
generously, and that I am not too embarrassed when I
read it tomorrow.)

Russell
