Possible memory leak?

Steven D'Aprano steve at REMOVETHIScyber.com.au
Wed Jan 25 22:18:02 EST 2006


On Wed, 25 Jan 2006 20:08:56 +0100, Giovanni Bajo wrote:

> Steven D'Aprano wrote:
> 
>> No, that's not correct. You are making a false
>> assumption.
> 
> Did you ever try to measure the code yourself?

I'm still running Python 2.3, so any timings I made would be misleading.


>> This is from the "What's New" for Python 2.4:
>>
>> [quote]
>> String concatenations in statements of the form s = s +
>> "abc" and s += "abc" are now performed more efficiently
>> in certain circumstances. This optimization won't be
>> present in other Python implementations such as Jython,
>> so you shouldn't rely on it; using the join() method of
>> strings is still recommended when you want to
>> efficiently glue a large number of strings together.
>> [end quote]
>>
>> Note the "more efficiently in CERTAIN CIRCUMSTANCES"
>> [emphasis added]? That certainly does not mean that
>> concatenating large numbers of strings is no longer
>> slow. It just means that *sometimes* (almost always?
>> often? occasionally?) it isn't as slow as it used to be.
>>
>> We really don't know what the optimization recognises,
>> how it works, or how fast a difference it makes.
>> Writing poor code, and trusting an implementation-
>> specific optimization to make it run "faster" (how much
>> faster?) is always a dangerous strategy.
> 
> The code is poor by your definition of poor. In my definition, it used to be
> poor and it's not anymore. Since I was interested in exactly WHICH
> circumstances the optimization was performed, I investigated and I know the
> answer. I also know that the optimization *will* apply to the OP code. 

I think you are being overly optimistic about the OP's actual code.

It looks to me as though the code posted can't possibly be his production
code. Both versions he has posted lack a return statement. He isn't posting
his actual code, only a modified version of it. He's even admitted it: "Note,
a few small changes have been made to simplify things, however, these
things don't apply to a full-scale picture, so they shouldn't slow anything
down in the slightest." Famous last words. If I had a dollar for every
time somebody said "I made some changes, but they shouldn't change
anything"...



> But maybe you want some numbers:
... 
> So, look, it's even faster than the solution you're proposing.

But your test code isn't equivalent to the OP's code. I don't know if it
will make a difference, but his code looks more like this:

def iters3(n,m):
    data = ''
    for i in xrange(n):
        row = ''
        for j in xrange(m):
            row += chr(j%64)    # build each row by repeated concatenation
        data += row             # then append the finished row to the result
    return data

while yours is:

def iters(n):
    s = ''
    for i in xrange(n):
        s += chr(i%64)
    return s

I'm happy to agree that your code is optimized to the point it is
marginally faster than the list/join algorithm. But does iters3 optimize
as well? I don't know.
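
Anybody running 2.4 can measure it easily enough. Something like this
is what I'd try (just a sketch, untested since I'm on 2.3; iters3_join
and the loop sizes are my own, not the OP's):

import timeit

def iters3_join(n, m):
    # same job as iters3, but with the list/join idiom:
    # collect the pieces in lists and glue them together once
    rows = []
    for i in xrange(n):
        row = []
        for j in xrange(m):
            row.append(chr(j%64))
        rows.append(''.join(row))
    return ''.join(rows)

if __name__ == '__main__':
    # paste iters3 from above into the same file before running
    for name in ('iters3', 'iters3_join'):
        t = timeit.Timer('%s(100, 1000)' % name,
                         'from __main__ import %s' % name)
        print name, min(t.repeat(3, 10))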

Even if it does, what generalisation do you learn from that? What do you
predict about this one?

def iters4(n):
    s = ''
    D = {}
    for i in xrange(n):
        s += chr(i%64)
        D[i] = s    # keeps a second reference to the growing string
    return s

At what level of complexity should the Python coder stop relying on
compiler optimizations to turn slow code into fast code?

You have saved a handful of milliseconds for one particular version of
Python, at the cost of performance plummeting like a stone if the code
ever gets run under an older version, or on Jython, PyPy or IronPython, or
if the algorithm is used in a slightly more complicated way. I would much
rather have consistently good performance, even if not quite the fastest
possible, than a small speed increase on one platform and terrible
performance on everything else.
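
For the record, the join() version of your simple test is no harder to
write (again, just a sketch; the name is mine):

def iters_join(n):
    # list comprehension plus join: roughly linear on any Python,
    # with or without the 2.4 concatenation optimization
    return ''.join([chr(i%64) for i in xrange(n)])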



-- 
Steven.



