One Python 2.1 idea

Mon Dec 25 11:25:41 EST 2000

I hope to write more about this in another post, but for
now, just a few short comments.

In article <92722s0l1f at news1.newsguy.com>,
  "Alex Martelli" <aleaxit at yahoo.com> wrote:
> C++ offers lots of very-low-level constructs AS WELL
> AS pretty high-level ones, and, as I said, the low-level
> parts keep coming up then and again even when one is
> trying to work more abstractly ..

To me, this is a BIG reason for preferring Python to
C++. Yes, you can implement any abstraction in C++.
Of course. But they are ADDITIONS to the language. You
cannot decipher a C++ program without also knowing the
lower level abstractions. Adding higher level
abstractions makes the language larger, more complex,
harder to maintain, and less productive. Once again,
I'm going to push the notion that the practical power
of a language is its level of abstraction divided by
its complexity. The reason a lower level language like
C++ will never provide the productivity boost of a
higher level language like Python is precisely that
it has to become more complex as it builds up to the
same level of abstraction.

> RDBMS enjoy a crucial difference: the low-level
> information on how best to optimize (e.g.) a join
> is in full possession of the RDBMS; it does not so
> much depend on what SQL queries I'm using now
> or in the future, but on the physical arrangement of
> tables, indices, &c, which the DB 'knows'. ..

This is even more true for optimizing compilers. They
will recognize access patterns that you don't, and
in contexts where they are not at all obvious. If you
think you can do nearly as good a job at this as
even a SIMPLE data and control flow analysis, I think
you are over-estimating your abilities, and under-
estimating the benefits of automatic analysis.

> I _don't_ like programs which try to read my mind
> and "do what's best for me whether I want it or not".

Let's be clear that we are talking only about FUNCTIONALLY
IDENTICAL implementations of language abstractions. Are
you really arguing that Python programmers should be
fully cognizant of how each abstraction is implemented?
When you write Python, you keep present in your mind the
memory management strategy for sequences, the lookup,
insertion, slice, and deletion algorithms for strings,
the find, replace, and add algorithms for dictionaries,
the pre-allocation strategy for constant methods and
data members, how regular expressions are compiled, and
the garbage collection strategy for everything? And more,
you keep track of how this changes from point release to
point release? If so, I think C++ might be a better
language for you!

> .. IF I have a semantic constraint about there being
> no deletes on this object, it's a CRUCIAL part of my
> abstraction for it -- I want to state it clearly and
> readably, and let compiler and human readers make of
> my statement what they wish.

But this is NOT what we are talking about. IF it is
important functionally to some list object that there are
no deletes from it, if deleting from it is an application
error, then you SHOULD create a new type, and build in
this constraint, EVEN IF it imposes a performance penalty
to ensure this constraint. But that is NOT the issue. The
issue is the much more common case where it just so
happens that there are no deletes from a list, in the
code as it appears now, and this can be used for
performance optimization. It is NOT part of the
application's semantics, but just a happenstance of
current code.
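
To make the first case concrete, here is a minimal sketch of a type
that builds the "no deletes" constraint in. The name NoDeleteList and
its method set are my own illustration, not an existing Python type,
and it is written for a current Python 3 interpreter. It pays for the
constraint with an extra layer of indirection on every call, which is
the penalty I said is worth paying when the constraint is really part
of the application.

```python
class NoDeleteList:
    """Hypothetical list-like type that forbids deletion (illustration only)."""

    def __init__(self, items=()):
        self._items = list(items)

    def append(self, item):
        self._items.append(item)

    def __getitem__(self, index):
        return self._items[index]

    def __setitem__(self, index, value):
        # Slice assignment could shrink the list, so allow plain indices only.
        if isinstance(index, slice):
            raise TypeError("slice assignment is not supported")
        self._items[index] = value

    def __len__(self):
        return len(self._items)

    def __iter__(self):
        return iter(self._items)

    def __delitem__(self, index):
        # The constraint is part of the type, so a delete is an error.
        raise TypeError("deleting from a NoDeleteList is an application error")


log = NoDeleteList(["start"])
log.append("step 1")
try:
    del log[0]
    delete_rejected = False
except TypeError:
    delete_rejected = True
```

Here the delete is rejected and the contents are left alone. A plain
list that merely happens never to see a delete needs none of this.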

> .. As the program designer, I *know* whether I intend
> deletes to be 'forbidden' on the object, or if it just
> happens accidentally that there are none YET; ..

And in the far more common case where there are none YET,
that fact can and should be used to improve performance,
until it is no longer easily determined. And there are
MANY patterns like this. Are all insertions at the end?
Are all accesses sequential? Etc. These are all patterns
that can be used for optimization. Note the push for a
special construct for iteration. None of these are
intended as explicit semantics of the application. You
now write:

     for i in m.keys():
         ..

Tomorrow, you fix a bug or add functionality, and write:

     tempL = m.keys()
     for i in tempL:
         ..
     for i in tempL:
         ..

Both can be optimized. In neither case is "list is used
only for iteration" or "list is used for two iterations"
ever a part of what you are trying to achieve in the
application. And if you introduce tempL in the second
version only for performance's sake, that means (a) you're
making assumptions about how byte code is generated that
may fail in version 3.1, (b) you're having to think about
things at a level different from the innate Python
abstractions, and your productivity is thereby lessened,
and (c) Python 2.1 has failed to provide you adequate
performance. My own preference is to think about things
in terms of Python abstractions, write clear and readable
code, and assume that the underlying abstractions are
implemented efficiently. The last is a bit of a
counter-factual, which is why we are having this
discussion.
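
As a rough sketch of why the two versions are functionally identical,
the code below runs both shapes of the loop and compares the results.
The function names and the sample dictionary are made up for
illustration. It assumes a current Python 3 interpreter, where keys()
returns a view rather than building a new list on each call as it does
in Python 2, so the explicit list() copy stands in for the old
behavior.

```python
m = {"a": 1, "b": 2, "c": 3}


def direct(mapping):
    """Call keys() afresh for each loop, as in the first version."""
    total = 0
    for k in mapping.keys():
        total += mapping[k]
    names = []
    for k in mapping.keys():
        names.append(k.upper())
    return total, names


def hoisted(mapping):
    """Fetch the keys once into a temporary, as in the tempL version."""
    temp_keys = list(mapping.keys())  # explicit copy, like a Python 2 keys() list
    total = 0
    for k in temp_keys:
        total += mapping[k]
    names = []
    for k in temp_keys:
        names.append(k.upper())
    return total, names


# Both shapes compute the same thing; only the cost can differ.
same_result = direct(m) == hoisted(m)
```

Nothing in the result depends on which shape was written, so the
choice between them is purely a matter of implementation cost.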

Russell
