[Python-ideas] Implicit string literal concatenation considered harmful?

Sat May 11 09:13:27 CEST 2013

On May 10, 2013, at 22:53, Steven D'Aprano <steve at pearwood.info> wrote:

> On 11/05/13 15:12, Andrew Barnert wrote:
> 
>> Why does it need to be compile time? Do people really run into cases often enough where the cost of concatenating or dedenting strings at import time is significant?
> 
> 
> String constants do not need to be concatenated only at import time.
> 
> Strings frequently need to be concatenated at run-time, or at function call time, or inside loops. For constants known at compile time, it is better to use a string literal rather than a string calculated at run-time for the same reason that it is better to write 2468 rather than 2000+400+60+8 -- because it better reflects the way we think about the program, not just because of the run-time expense of extra unnecessary additions/concatenations.

Well, you have the choice of either:

count = 2000 + 400 + 60 + 8
for e in hugeiter:
    foo(e, count)

Or:

for e in hugeiter:
    foo(e, 2468)  # 2000 + 400 + 60 + 8

And again, considering that the whole point of implicit string concatenation is to deal with cases that are otherwise hard to fit into 80 columns, the former option (binding the value to a name before the loop) is, if anything, even more appropriate.
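
To make the string case concrete, here is a minimal sketch of the first option applied to a multi-line string. The names USAGE and label_all are made up for illustration; the only library call is textwrap.dedent from the standard library. The dedent runs once at import time, and the loop just reuses the result:

```python
import textwrap

# Hypothetical constant: dedented once, at import time, not per iteration.
USAGE = textwrap.dedent("""\
    usage: tool [options]
      -h  show help""")


def label_all(items):
    # Hypothetical helper standing in for the loop over hugeiter above:
    # each element is paired with the already-computed string.
    return [(item, USAGE) for item in items]
```

However large the iterable is, the dedent cost is paid exactly once.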

>> If so, it seems like something more dramatic might be warranted, like allowing the compiler to assume that method calls on literals have the same effect at compile time as at runtime so it can turn them into constants.
> 
> In principle, the peephole optimizer could make that assumption. In practice, there is a limit to how much effort people put into the optimizer. Constant-folding method calls is probably past the point of diminishing returns.

Adding new optimizations just for the hell of it is obviously not a good idea. But we're talking about the cost of adding an optimization vs. the cost of adding a new type of auto-dedenting string literal. It seems like about the same scope either way, and the former doesn't require any changes to the grammar, docs, other implementations, etc.--or, more importantly, existing user code. And it might even improve other related cases.
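
For anyone who wants to check what their interpreter folds today, here is a rough sketch that inspects the constants of a compiled expression. The helper name constants is made up; compile() and co_consts are real, but which expressions get folded is a CPython implementation detail, so treat the results as observations about one implementation rather than language guarantees:

```python
def constants(source):
    # Hypothetical helper: compile a single expression and return the
    # tuple of constants stored in the resulting code object.
    return compile(source, "<example>", "eval").co_consts


# Adjacent literals are joined before any bytecode exists.
print(constants("'a' 'b'"))

# Arithmetic on numeric literals is folded by CPython's optimizer.
print(constants("2000 + 400 + 60 + 8"))

# A method call on a literal is left for run time: only the original
# string is stored, not the stripped result.
print(constants("' x '.strip()"))
```

On CPython, the first two tuples contain the combined value, while the third still holds ' x ', which is exactly the gap the optimization discussed above would close.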

If the problem is so important that we're seriously considering changing the syntax, it seems a little unwarranted to reject the optimization out of hand. Or, conversely, if the optimization is obviously not worth doing, changing the syntax to let people do the same optimization manually seems excessive.