how to join array of integers?

Steve Holden steve at holdenweb.com
Mon Sep 17 14:38:18 EDT 2007


Paul Rudin wrote:
> Steven D'Aprano <steve at REMOVE-THIS-cybersource.com.au> writes:
> 
>> On Sun, 16 Sep 2007 19:25:22 +0100, Paul Rudin wrote:
>>
>>>> The generator expression takes about twice as long to run, and in my
>>>> opinion it is no more readable. So what's the advantage?
>>> If you do it with a decent size list they take more or less the same
>>> time. 
>> Did you try it, 
> 
> Yes.
> 
>> or are you guessing? 
> 
> Well - I guessed first, otherwise it wouldn't have been worth trying :)
> 
>> What do you call a decent size?
> 
> I used 10,000.
> 
>>
>>> You'd presumably expect the generator to use less memory; which
>>> might be an advantage if you have large lists.
>> Unfortunately, Python doesn't make it as easy to measure memory use as it 
>> does to time snippets of code, so that's just a hypothetical.
> 
> Well - as it turns out the list gets made anyway by the join method,
> so in this case there's probably little difference. However there
> (obviously) are situations where a generator is going to save you
> significant memory over the similar list comprehension.
> 
>>
>>> Isn't it odd that the generator isn't faster, since the comprehension
>>> presumably builds a list first and then iterates over it, whereas the
>>> generator doesn't need to make a list?
>> Who says it doesn't need to make a list? string.join() needs a sequence.
>>
> 
> The generator doesn't make a list. The implementation of str.join
> apparently needs a sequence and so makes a list from the generator
> passed to it, which is presumably why you get essentially the same
> performance once you factor out setup noise for small lists.
> 
> Although it's not clear to me why the join method needs a sequence
> rather than just an iterator.

I suspect it's to do with making memory allocation easier, though I 
didn't read the whole gory detail in stringobject.c. Without a 
pre-existing set of joinable items you have to build the string up using 
some sort of allocate-and-reallocate strategy, whereas if you have the 
joinables already in a sequence you can pre-compute how much memory the 
result will need.
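
To make that concrete, here is a rough sketch of the two-pass idea in
Python. It is only an illustration of the strategy, not the code in
stringobject.c; the name join_two_pass is made up for this example, and
it works on bytes so that a fixed-size buffer can be allocated up front.

```python
def join_two_pass(sep, items):
    """Join byte strings by sizing the result first, then copying once.

    Hypothetical helper for illustration; not CPython's implementation.
    """
    # Materialise the input so it can be walked twice. A generator can
    # only be consumed once, so this is where a list gets built.
    seq = list(items)

    # Pre-pass, part 1: reject any argument that is not a byte string.
    for index, item in enumerate(seq):
        if not isinstance(item, bytes):
            raise TypeError("sequence item %d: expected bytes" % index)

    if not seq:
        return b""

    # Pre-pass, part 2: work out exactly how much space is needed.
    total = sum(len(item) for item in seq) + len(sep) * (len(seq) - 1)

    # Single allocation, then copy each piece into its final position.
    buf = bytearray(total)
    pos = 0
    for index, item in enumerate(seq):
        if index:
            buf[pos:pos + len(sep)] = sep
            pos += len(sep)
        buf[pos:pos + len(item)] = item
        pos += len(item)
    return bytes(buf)


# Works the same whether it is handed a list or a generator.
joined = join_two_pass(b",", (str(n).encode("ascii") for n in [1, 2, 3]))
```

Without the list() call at the top there would be no way to know the
total size before starting to copy, which is the allocate-and-reallocate
case mentioned above.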

Ah, right, it's in the comment:

          * Do a pre-pass to figure out the total amount of space we'll
          * need (sz), see whether any argument is absurd, and defer to
          * the Unicode join if appropriate.

regards
  Steve
-- 
Steve Holden        +1 571 484 6266   +1 800 494 3119
Holden Web LLC/Ltd           http://www.holdenweb.com
Skype: holdenweb      http://del.icio.us/steve.holden

Sorry, the dog ate my .sigline



