Needless copying in iterations?

Steve Holden steve at holdenweb.com
Sun Sep 16 11:49:15 EDT 2007


Steven D'Aprano wrote:
> On Sun, 16 Sep 2007 15:05:55 +1000, Ben Finney wrote:
> 
>> Steven D'Aprano <steve at REMOVE-THIS-cybersource.com.au> writes:
>>
>>> In *general* the compiler can't tell, but in specific cases it could. A
>>> (hypothetical) optimizing compiler would tell the difference between:
>>>
>>> for item in alist[1:5]:
>>>     print item # no possible side-effects
>> The 'print' statement converts the 'item' to a str. That conversion
>> could, in a pathological case, have a side-effect (say, if the class of
>> 'item' had an overridden '__str__' method with side effects), and the
>> compiler can't know this isn't a pathological case.
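
To make the pathological case concrete, here is a minimal (contrived)
sketch of a class whose __str__ does something observable, so even a
bare 'print item' can run arbitrary code:

    class Chatty(object):
        """Contrived: converting to str has a side effect."""
        count = 0
        def __str__(self):
            Chatty.count += 1      # arbitrary code runs inside str()
            return 'chatty'

    alist = [Chatty(), Chatty()]
    for item in alist:
        print item                 # each print bumps Chatty.count
    print Chatty.count             # -> 2

A compiler can't rule this out without knowing the concrete types.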
> 
> Fair enough... but I'm reminded of a rant by Joel Spolsky about GUI 
> design:
> 
> '...programmers have tended to think that if users are allowed to resize 
> and move windows, they should have complete flexibility over where these 
> windows go, right down to the last pixel. After all, positioning a window 
> 2 pixels from the top of the screen is "equally likely" as positioning a 
> window exactly at the top of the screen.'
> 
> http://www.joelonsoftware.com/uibook/fog0000000249.html
> 
> (three quarters of the way down, in Chapter 7.)
> 
> Maybe I'm being unfair, but it seems to me that the attitude is similar: 
> 'there's no point optimizing the common case of printing (say) ints 
> stored in a list, Just In Case the programmer wants the incredibly rare 
> case of setting sys.stdout to some wacky object that modifies the list 
> he's iterating over. It could happen.'
> 
> *shrug* But apart from the occasional smart-alec who does it just to 
> demonstrate that it is possible, it probably never has.
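
Purely as a demonstration of the smart-alec case, a sketch that does
exactly that: rebind sys.stdout to an object whose write() mutates the
very list being printed:

    import sys

    class Saboteur(object):
        """A stand-in stdout whose write() appends to a victim list."""
        def __init__(self, victim):
            self.victim = victim
        def write(self, text):
            self.victim.append(text)

    alist = [1, 2, 3, 4, 5]
    sys.stdout = Saboteur(alist)
    for item in alist[1:3]:        # the slice is a copy, so this terminates
        print item
    sys.stdout = sys.__stdout__
    print alist                    # [1, 2, 3, 4, 5, '2', '\n', '3', '\n']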
> 
> It seems to me that the "consenting adults" philosophy of Python doesn't 
> extend to the compiler, and perhaps it should. Maybe Python could 
> optimize common cases, and if developers wanted to do wacky things, let 
> them turn optimization off on a module-by-module basis.
> 
> Or even function-by-function. Wouldn't it be nice to have decorators that 
> could optimize the functions they decorated?
> 
No, it would be disastrous, unless you could manage to implement a 
mechanism that *told* you when the optimizations were playing havoc with 
your program's execution.

The problem is that debugging only works if you can assume a 
deterministic environment. If you are going to put funky optimizations 
in that break determinism in "little-used" corner cases, then debugging 
the situations when the optimizations *do* affect program execution will 
be a complete pain.
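
That said, an opt-in version of the decorator idea does already exist:
Psyco's proxy() takes a function and returns an optimized copy, so it
can be used as a decorator (a sketch, assuming Psyco is installed):

    import psyco

    @psyco.proxy                   # returns a specialised copy of the function
    def inner_loop(n):
        total = 0
        for i in xrange(n):
            total += i * i
        return total

At least there the programmer has explicitly asked for the optimization,
so there is someone to blame when it misbehaves.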

> 
>>> for item in alist[1:5]:
>>>     alist.append(item) # side-effects DON'T matter
>> The compiler doesn't know that, at the time of invocation,
>> 'alist.append' doesn't have side effects that matter.
> 
> It might if it knows what type alist is, and how long it is (as Marc 
> Rintsch points out). That's why I used "alist" in my example rather than 
> "an_arbitrary_sequence".
> 
But the *name* alist doesn't have a type (only the object it is currently 
bound to does), and static program analysis would have to be extensive to 
determine that it could only ever be bound to objects of one specific type.
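
For instance, nothing stops alist from being bound, at run time, to a
list subclass whose append() is no longer side-effect free:

    import sys

    class LoggingList(list):
        def append(self, item):
            sys.stderr.write('appending %r\n' % (item,))  # a side effect
            list.append(self, item)

    alist = LoggingList([1, 2, 3, 4, 5])
    for item in alist[1:5]:
        alist.append(item)         # not the built-in append any more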
> 
>> The compiler for a dynamic language like Python has very little absolute
>> "no significant side-effect in these specific cases" guarantee of the
>> kind you're assuming even in the cases you choose for contrast with the
>> ones that *do* have significant side-effects.
> 
> Yes, in general one might not be able to make those guarantees, but still 
> there are common cases where you can. That's how Psyco works.
> 
Yes, but Psyco doesn't use static program analysis; instead it examines 
the run-time environment to determine whether specific optimizations 
can be applied.
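
Psyco's public API reflects that run-time orientation (again assuming
Psyco is installed):

    import psyco

    def hot_function(n):
        return sum(i * i for i in xrange(n))

    psyco.bind(hot_function)   # specialise for argument types seen at run time
    psyco.profile()            # or: profile and compile whatever turns out hot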

> The point is rather moot, since CPython (and probably other Pythons) do 
> almost no optimizations. But just because Python is a dynamic language 
> doesn't mean there are no optimizations possible: Haskell is at least 
> as high-level a language, and there are optimizing compilers for it. Of 
> course, it is much simpler for Haskell, because of the static type 
> system it uses.
> 
> The CPython compiler already does a couple of source-code optimizations 
> (ignores some unused strings; some constant folding) and at least one 
> runtime optimization (string concatenation is no longer _always_ slow):
> 
> http://utcc.utoronto.ca/~cks/space/blog/python/ExaminingStringConcatOpt
> 
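The constant folding, at least, is easy to see with the dis module
(output as on CPython 2.5):

    import dis

    def seconds_in_two_hours():
        return 2 * 3600            # folded at compile time

    dis.dis(seconds_in_two_hours)
    # shows a single LOAD_CONST 7200; no BINARY_MULTIPLY at run time
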
> Because of its very nature, optimizing Python is hard. Even something as 
> simple as tail-call optimization is verboten, because Guido 
> considers good stack traces more important than preventing stack 
> overflows in the first place:
> 
> http://www.tratt.net/laurie/tech_articles/articles/tail_call_optimization
> 
And who's to say he's wrong, especially since Python is intended to be 
an environment that's friendly to beginners?
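
The trade-off is easy to demonstrate. The recursive call below is in
tail position, but CPython pushes a new frame for every level anyway,
so it dies with a nice long traceback rather than looping in constant
space:

    def countdown(n):
        if n == 0:
            return 'done'
        return countdown(n - 1)    # a tail call, but a new frame regardless

    countdown(5000)    # RuntimeError: maximum recursion depth exceeded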

> Anyway, I'm not really complaining. I like Python, I'm grateful for the 
> work the Python-dev people have already done, and I'm in no position to 
> volunteer to write an optimizing compiler. And it may be that, regardless 
> of how smart you make the compiler, Python simply can't be optimized 
> because of design decisions made back in 1990-whatever when Guido first 
> started his grand plan.
> 
There is some of that. The environment is so dynamic that even something 
as innocuous as attribute lookup can actually involve properties (for 
example) reading a relational database! I'm not saying that optimization 
under these circumstances is impossible, but that it's much more difficult 
than is easily anticipated, and that the benefits to be gained might be 
less than you would think.
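
A sketch of the kind of thing I mean, with a fake db object standing in
for a real database connection:

    class FakeDB(object):
        """Hypothetical stand-in for a real database connection."""
        def query(self, sql, key):
            print 'querying:', sql, key
            return 42

    db = FakeDB()

    class Customer(object):
        def __init__(self, key):
            self.key = key
        @property
        def balance(self):
            # an innocent-looking attribute that actually does I/O
            return db.query('SELECT balance FROM accounts WHERE key=?',
                            self.key)

    c = Customer(7)
    print c.balance                # attribute access triggers the "query"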

regards
  Steve
-- 
Steve Holden        +1 571 484 6266   +1 800 494 3119
Holden Web LLC/Ltd           http://www.holdenweb.com
Skype: holdenweb      http://del.icio.us/steve.holden

Sorry, the dog ate my .sigline



