Needless copying in iterations?

Steve Holden steve at holdenweb.com
Sun Sep 16 19:23:11 EDT 2007


Steven D'Aprano wrote:
> On Sun, 16 Sep 2007 11:49:15 -0400, Steve Holden wrote:
> 
>>> It seems to me that the "consenting adults" philosophy of Python
>>> doesn't extend to the compiler, and perhaps it should. Maybe Python
>>> could optimize common cases, and if developers wanted to do wacky
>>> things, let them turn optimization off on a module-by-module basis.
>>>
>>> Or even function-by-function. Wouldn't it be nice to have decorators
>>> that could optimize the functions they decorated?
>>>
>> No, it would be disastrous, 
> 
> Disastrous, as in thousands dead, panic in the street, martial law 
> declared? I'm sure your program is important, but is it really *that* 
> important? :-)
> 
To me, naturally. I don't expect you to be quite so engaged :-)
> 
>> unless you could manage to implement a
>> mechanism that *told* you when the optimizations were playing havoc with
>> your program's execution.
> 
> How about "the test suite passes when I turn optimization off, but fails 
> when I turn optimization on"? How is that any different from "the test 
> suite passes when I use StringIO, but not if I use cStringIO"?
> 
Well, firstly, it appears to mandate test-driven development. While that 
is a meritorious practice, it seems a rather draconian requirement to 
build into a language implementation.
> 
>  
>> The problem is that debugging only works if you can assume a
>> deterministic environment. If you are going to put funky optimizations
>> in that break determinism in "little-used" corner cases, then debugging
>> the situations when the optimizations *do* affect program execution will
>> be a complete pain.
> 
> Right, so we're back to the "any pixel is as likely as any other pixel" 
> example from Joel On Software I mentioned earlier. To prevent some 
> hypothetical developer doing some wacky thing from having to work harder 
> to debug some weird corner case, *everybody else* has to miss out on 
> optimizations that could lead to significantly better performance.
> 
> You might think that's a good trade-off. I don't.
> 
Well I have always gone by the principle "First, make it work. Then (if 
it doesn't work fast enough) make it work faster". Correctness is the 
first requirement. But it's naive to assume, even in a test-driven 
environment, that testing can be complete enough to trigger every 
possible pathology of a compiler that is no longer deterministic.

If you want to ask for that kind of trouble then feel free to go ahead, 
but I fear you will have to proceed without me.
> 
>>>>> for item in alist[1:5]:
>>>>>     alist.append(item) # side-effects DON'T matter
>>>> The compiler doesn't know that, at the time of invocation,
>>>> 'alist.append' doesn't have side effects that matter.
>>> It might if it knows what type alist is, and how long it is (as Marc
>>> Rintsch points out). That's why I used "alist" in my example rather
>>> than "an_arbitrary_sequence".
>>>
>> But alist doesn't *have* a type, and static program analysis would have
>> to be extensive to determine that it could only ever be of a specific
>> type.
> 
> Of course alist has a type. Python is a strongly-typed language, not 
> untyped.
> 
No, alist is a name. It's the objects that names refer to that have 
types. This is not a fatuous point, and I am well aware that Python is a 
strongly-typed language. It is, however, also a late-binding language, 
and it's precisely this aspect of its dynamism which confounds the 
solution you suggest.
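
Just to make that corner case concrete, here is a small sketch of my own 
(not from your message): if the compiler silently replaced the slice copy 
with a lazy view along the lines of itertools.islice, the behaviour would 
change as soon as the list is shorter than the slice bound and gets 
appended to inside the loop.

import itertools

alist = [0, 1, 2]                  # shorter than the slice bound
for item in alist[1:5]:            # the slice copies [1, 2] up front
    alist.append(item * 10)
# alist is now [0, 1, 2, 10, 20] -- the loop ran exactly twice

blist = [0, 1, 2]
for item in itertools.islice(blist, 1, 5):   # a hypothetical "optimised" form
    blist.append(item * 10)
# blist is now [0, 1, 2, 10, 20, 100, 200] -- the loop consumed its own
# appends, because the lazy view sees mutations that the copy never would

So even this "simple" case only stays simple while the compiler can prove 
the list is at least as long as the slice.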

> And I don't believe I specifically mentioned that static program analysis 
> was the only optimization possible. But even if it was...
> 
> I can *hand optimize* a function that I write, yes? I might *choose* to 
> say to myself, "Screw duck typing, in *my* program function foo will only 
> ever be called with an honest-to-god built-in list argument, I'm going to 
> optimize it for that case". We're all consenting adults here, right?
> 
> Well, what's the difference between me hand optimizing the function, and 
> calling a built-in decorator that does it for me? Why is it okay for me 
> to shoot myself in the foot, but not for me to tell the compiler to shoot 
> me in the foot?
> 
I'm not sure what you are saying here. Are you implying that you want to 
include type declarations, or somehow impose them by the use of 
decorators? I'm quite happy to allow you to shoot yourself in the foot, 
because you are a consenting adult. However, it's also the 
responsibility of consenting adults to make sure their behavior doesn't 
harm the uninvolved, and I'm not convinced your approach wouldn't harm 
beginners unaware of its significance.
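
If decorators are what you have in mind, here is a rough sketch of my own 
(purely hypothetical, not an existing or proposed API) of what such a 
"consenting adults" marker might amount to: it records the programmer's 
type promise, which is about all pure Python can do -- a real optimiser 
would still have to trust or verify that promise at every call.

def assume_builtin_list(func):
    """Hypothetical marker: the author promises 'seq' is a plain list."""
    def wrapper(seq, *args, **kwargs):
        # A real optimising compiler might specialise on this promise;
        # here we can only check it at run time.
        if type(seq) is not list:
            raise TypeError("%s was promised a built-in list" % func.__name__)
        return func(seq, *args, **kwargs)
    return wrapper

@assume_builtin_list
def first_five(seq):
    return seq[:5]

Anyone who then passes a list subclass with a clever __getitem__ gets an 
exception rather than silently wrong answers, which is roughly the 
trade-off you are asking the compiler to make on your behalf.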
> 
>>> Because of its very nature, optimizing Python is hard. Even something
>>> as simple as tail-call recursion optimization is verboten, because
>>> Guido considers good stack traces more important than preventing stack
>>> overflows in the first place:
>>>
>>> http://www.tratt.net/laurie/tech_articles/articles/tail_call_optimization
>> And who's to say he's wrong, especially since Python is intended to be
>> an environment that's friendly to beginners.
> 
> I'm not saying he's *wrong*, I'm saying he's making trade-offs I would 
> like to see relaxed.
> 
Aah, right. Tough. :-)

> But in the specific case of recursion... have you seen the exception you 
> get when the recursion limit is hit? By default, you get 1000 lines of '  
> File "foo", line 2, in function_name' followed by "RuntimeError: maximum 
> recursion depth exceeded". Maybe I'm missing something there, but I don't 
> see how that's much of a help in debugging.
> 
That assumes a single recursive function. If there is a set of four or 
six mutually recursive functions, then collapsing the traceback by 
tail-call optimization makes the situation impossible to debug in any 
straightforward way.
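
For what it's worth, here is a toy sketch (mine) of the sort of thing I 
mean; the traceback you get alternates between the two functions, and 
that alternation is precisely the information tail-call elimination would 
throw away.

import sys

def is_even(n):
    return True if n == 0 else is_odd(n - 1)

def is_odd(n):
    return False if n == 0 else is_even(n - 1)

sys.setrecursionlimit(100)       # keep the demo's traceback manageable
try:
    is_even(1000)
except RuntimeError:             # "maximum recursion depth exceeded"
    pass
# The (suppressed) traceback alternates 'in is_even' and 'in is_odd'
# frames, so you can at least see which functions were involved and in
# what order they called each other.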
> 
>>> Anyway, I'm not really complaining. I like Python, I'm grateful for the
>>> work the Python-dev people have already done, and I'm in no position to
>>> volunteer to write an optimizing compiler. And it may be that,
>>> regardless of how smart you make the compiler, Python simply can't be
>>> optimized because of design decisions made back in 1990-whatever when
>>> Guido first started his grand plan.
>>>
>> There is some of that. The environment is so dynamic that even something
>> as innocuous as attribute lookup can actually involve properties (for
>> example) reading a relational database! I'm not saying that optimization
>> under these circumstances is possible, but that it's much more difficult
>> than is easily anticipated, and that the benefits to be gained might be
>> less than you would think.
> 
> I'm mostly objecting to people who say that Python's compiler "can't" be 
> optimized, which is not true. As I mentioned, there's already a peephole 
> optimizer. There are runtime optimizations possible. Psyco could become 
> a built-in, if the right people wanted it to become a built-in. Etc. With 
> a few million in funding, Python could become as heavily optimized as 
> Common Lisp.
> 
You seem to be unaware that there are organizations with a very large 
financial stake in Python's success. Given that they aren't providing 
this funding, who do you imagine will?

The PyPy project, to my mind one of the most meritorious projects to 
emerge from the infrastructure in the last few years, is the result of 
"a few million in funding", and it shows far more promise for 
speed-related optimization than CPython ever will, IMHO. But there is 
currently a hiatus in their funding and I am unsure where they are 
planning to go from here.
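
Coming back to the attribute-lookup point above, a tiny sketch of my own 
shows why even obj.attr resists naive optimisation: the lookup below runs 
arbitrary code on every access, so a compiler could not hoist or cache it 
without changing the program's meaning.

import random

class Config(object):
    @property
    def timeout(self):
        # Looks like a plain attribute from the caller's side, but each
        # access runs code -- it could just as easily hit a database.
        return random.choice([10, 30, 60])

cfg = Config()
first = cfg.timeout
second = cfg.timeout   # not guaranteed to equal 'first'; caching would lie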

> Of course, whether that would still be the Python we know *now* is 
> another question. But then, with decorators and generator expressions and 
> iterators, Python 2.6 is not precisely the same as the Python we knew 
> back in 1997, is it?
> 
No. Perhaps what you seek is RPython.

regards
  Steve
-- 
Steve Holden        +1 571 484 6266   +1 800 494 3119
Holden Web LLC/Ltd           http://www.holdenweb.com
Skype: holdenweb      http://del.icio.us/steve.holden

Sorry, the dog ate my .sigline