better way for ' '.join(args) + '\n'?

Peter Otten __peter__ at web.de
Fri Oct 26 04:58:56 EDT 2012


Ulrich Eckhardt wrote:

> Hi!
> 
> General advice when assembling strings is not to concatenate them
> repeatedly but instead use string's join() function, because it avoids
> repeated reallocations and is at least as expressive as any alternative.
> 
> What I have now is a case where I'm assembling lines of text for driving
> a program with a commandline interface. In this scenario, I'm currently
> doing this:
> 
>    args = ['foo', 'bar', 'baz']
>    line = ' '.join(args) + '\n'
> 
> So, in other words, I'm avoiding all the unnecessary copying, just to
> make another copy to append the final newline.
> 
> The only way around this that I found involves creating an intermediate
> sequence like ['foo', ' ', 'bar', ' ', 'baz', '\n']. This can be done
> rather cleanly with a generator:
> 
>    def helper(s):
>        for i in s[:-1]:
>            yield i
>            yield ' '
>        yield s[-1]
>        yield '\n'
>    line = ''.join(helper(args))
> 
> Efficiency-wise, this is satisfactory. 

No, it is not. In a quick timeit test it takes 5 to 10 times as long as the 
original. Remember that function calls are costly, and that with s[:-1] you 
are trading the extra string for an extra list. Also, you are doubling the 
loop implicit in str.join() with the explicit one in your oh-so-efficient 
generator.
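
A minimal sketch of such a timeit comparison follows. It is not the exact 
test referred to above: the names join_plus_newline and join_with_helper are 
made up for illustration, and the ratio you see will depend on the Python 
version and on the length of args.

```python
import timeit

args = ['foo', 'bar', 'baz']

def join_plus_newline(args):
    # The original: one join, then one concatenation for the newline.
    return ' '.join(args) + '\n'

def helper(s):
    # The generator from the quoted message; s[:-1] copies the list.
    for i in s[:-1]:
        yield i
        yield ' '
    yield s[-1]
    yield '\n'

def join_with_helper(args):
    # join() has to consume the generator item by item.
    return ''.join(helper(args))

# Both variants must build the same line before timing means anything.
assert join_plus_newline(args) == join_with_helper(args)

if __name__ == '__main__':
    n = 100000  # arbitrary repeat count, chosen only for illustration
    t_plain = timeit.timeit(lambda: join_plus_newline(args), number=n)
    t_helper = timeit.timeit(lambda: join_with_helper(args), number=n)
    print('join + newline: %.3f s' % t_plain)
    print('helper generator: %.3f s' % t_helper)
```

Run it with a longer args list as well; the extra per-item work in the 
generator grows with the number of items, while the single concatenation 
in the original does not.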

> However, readability counts, and that is where this version fails,
> which is why I'm writing this message. So, dear fellow Pythonistas,
> any ideas to improve the original version's efficiency while
> preserving its expressiveness?
> 
> Oh, for all those that are tempted to tell me that this is not my
> bottleneck unless it's called in a very tight loop, you're right.
> Indeed, the overhead of the TCP communication channel between the two
> programs far outweighs the few microseconds I could save here. I'm
> still interested in learning new and better solutions though.

Even if it were the bottleneck, the helper generator approach would still be 
unhelpful.





