Cryptographically random numbers

Tue Mar 7 19:45:09 EST 2006

Paul Rubin wrote:
> "Tuvas" <tuvas21 at gmail.com> writes:
> 
>>I've actually done the tests on this one, it's actually faster to use
>>the += than a list, odd as it may sound. 
> 
> 
> Frederik explained the reason; there's an optimization in Python 2.4
> that I'd forgotten about, for that specific case.  It's not in earlier
> versions.  It's a bit fragile in 2.4:
> 
>   a = ''
>   for x in something:
>     a += g(x)
> 
> is fast, but if a is aliased, Python can't do the optimization, so 
> 
>   a = ''
>   b = a
>   for x in something:
>     a += g(x)
> 
> is slow.  

Is this really true? After the first pass through the loop, 'a' is no 
longer aliased: strings are immutable, so a += x rebinds 'a' to a new 
string while 'b' still refers to the original empty string. From then 
on the two loops should be equivalent. In case I was missing something, 
I timed both versions under Python 2.4.1. The following two snippets 
ran in essentially the same time for N up to 2**20 (I didn't try any 
higher):

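N = 2**20  # largest size tried; smaller values behaved the same

def concat1():
    # Plain += loop: 'a' is the only name bound to the growing string.
    a = ''
    for x in ' '*N:
        a += x
    return a

def concat2():
    # Same loop, but 'b' aliases the initial empty string.
    a = ''
    b = a
    for x in ' '*N:
        a += x
    return a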
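
For anyone who wants to repeat the comparison, here is a minimal sketch of a timing harness built on the standard timeit module. The helper names best_time and alias_survives_first_append are made up for illustration, and N = 2**16 is an arbitrary choice to keep the run short. It makes no claim about what numbers any particular interpreter will produce; it only shows how the two loops could be timed and why the alias disappears after the first +=.

```python
import timeit

N = 2**16  # assumed size for a quick run; the post went up to 2**20


def concat1():
    a = ''
    for x in ' ' * N:
        a += x
    return a


def concat2():
    a = ''
    b = a  # alias of the initial empty string only
    for x in ' ' * N:
        a += x
    return a


def alias_survives_first_append():
    # After the first +=, 'a' is rebound to a new string object,
    # so 'a' and 'b' no longer refer to the same object.
    a = ''
    b = a
    a += 'x'
    return a is b


def best_time(func, repeat=3, number=5):
    # Hypothetical helper: best per-call time in seconds over several runs.
    return min(timeit.repeat(func, repeat=repeat, number=number)) / number


if __name__ == '__main__':
    print('alias survives first append:', alias_survives_first_append())
    for func in (concat1, concat2):
        print(func.__name__, best_time(func))
```

Taking the minimum over several repeats is the usual way to reduce noise from other processes; the absolute values will depend on the interpreter version and machine.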

Regards,

-tim

> Figuring out which case to use relies on CPython's reference
> counting storage allocator (the interpreter keeps track of how many
> pointers there are to any given object) and so the optimization may
> not be feasible at all in other implementations which use different
> storage allocation strategies (e.g. Lisp-style garbage collection).
> 
> All in all I think it's best to use a completely different approach
> (something like StringBuffer) but my effort to start a movement here
> on clpy a couple months ago didn't get anywhere.
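
As a rough illustration of the "completely different approach" mentioned in the quoted text, here is a sketch of two ways to build a string that do not depend on the reference-counting optimization: collecting the pieces and calling str.join, and writing to an io.StringIO buffer. The function g is a hypothetical stand-in for the g(x) in the quoted loop, and the build_* names are invented for this example. Note that the io module postdates Python 2.4, where the StringIO and cStringIO modules filled the same role.

```python
import io


def g(x):
    # Hypothetical stand-in for the g(x) used in the quoted loop.
    return str(x)


def build_with_join(items):
    # Collect all pieces first, then concatenate them in one call.
    return ''.join([g(x) for x in items])


def build_with_stringio(items):
    # Append pieces to an in-memory text buffer, then read it back.
    buf = io.StringIO()
    for x in items:
        buf.write(g(x))
    return buf.getvalue()


if __name__ == '__main__':
    print(build_with_join(range(5)))
    print(build_with_stringio(range(5)))
```

Neither version relies on how the interpreter manages object lifetimes, so both should behave the same way on implementations that use a tracing garbage collector.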