[pypy-dev] a possible leak in the object namespace...

Alex A. Naanou alex.nanou at gmail.com
Mon Nov 29 21:02:19 CET 2010


On Mon, Nov 29, 2010 at 21:46, Carl Friedrich Bolz <cfbolz at gmx.de> wrote:
> Hi Alex,
>
> On 11/29/2010 03:04 PM, Alex A. Naanou wrote:
>> With the release of version 1.4, I decided to test these use cases
>> out and benchmark them on PyPy, and 15 minutes later I got results
>> that were surprising to say the least...
>>
>> Expectations:
>> 1) the normal/native namespace should have been a bit faster than the
>> hooked object on the first run. Both cases should have leveled to
>> about the same performance after the JIT finished its job +/- a
>> constant.
>> 2) all times should have been near constant.
>>
>> What I got per point:
>> 1) the object with the native dict was about three orders of
>> magnitude slower than the object with a hooked namespace.
>> 2) sequential write benchmark runs on the normal object did not level
>> out, as they did with the hook; rather, they exhibited exponential
>> times (!!)
>
> Don't do that then :-).

:)

>
>
>> For details and actual test code see the attached file.
>
> The code you are trying is essentially this:
>
> def test(obj, count=10000):
>        t0 = time.time()
>        for i in xrange(count):
>                setattr(obj, 'a' + str(i), i)
>        t1 = time.time()
>        # return: total, time per write
>        return t1 - t0, (t1 - t0)/count
>
> This is not working very well with the non-overridden dict, because we
> don't optimize for this case at all. You are
>
>  a) using lots of attributes, which we expect to be rare
>  b) accessing them with setattr, which is a lot slower than a fixed attribute
>  c) accessing a different attribute every loop iteration, which means the
> compiler has to produce one bit of code for every attribute

This is intentional (all three points); I did not want the JIT to
factor out the loop -- I wanted to time the initial attribute
creation...



>
> Read this, for some hints why this is the case:
>
> http://morepypy.blogspot.com/2010/11/efficiently-implementing-python-objects.html
>
> This is in theory fixable with enough work, but I am not sure that this
> is a common or useful use case. If you really need to do this, just use
> a normal dictionary. Or show me some real-world code that does this, and I
> might think about the case some more.
>
> Anyway, the timing behavior of the above loop is merely quadratic in the
> number of attributes, not exponential :-).

Accepted, my mistake :)

But quadratic behaviour plus a three-orders-of-magnitude increase in the
time it takes to create an attr is scary... then again, you are right --
how often does that use case happen? :)
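(For reference, one way to sanity-check the quadratic claim is to rerun the
test() Carl quoted above with increasing counts and compare the totals: the
total time roughly quadrupling each time count doubles means quadratic, not
exponential, growth. A minimal self-contained sketch -- Obj and the counts
here are just placeholders for illustration:)

    import time

    class Obj(object):
        # stand-in for the plain object with a native dict used in the test
        pass

    def test(obj, count=10000):
        t0 = time.time()
        for i in xrange(count):
            setattr(obj, 'a' + str(i), i)   # a new attribute every iteration
        t1 = time.time()
        # return: total, time per write
        return t1 - t0, (t1 - t0) / count

    # total time roughly quadrupling as count doubles => quadratic growth
    for count in (10000, 20000, 40000):
        total, per_write = test(Obj(), count)
        print count, total, per_write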


Retested with:
  setattr(obj, 'a', i)

The results are *a lot* better and it is indeed the common case :)
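(i.e. essentially the same loop, now writing one fixed attribute instead of
a new one per iteration -- a minimal sketch; test_fixed and Obj are just
illustrative names, not from the original script:)

    import time

    class Obj(object):
        pass

    def test_fixed(obj, count=10000):
        t0 = time.time()
        for i in xrange(count):
            setattr(obj, 'a', i)    # same attribute every iteration -- the common case
        t1 = time.time()
        # return: total, time per write
        return t1 - t0, (t1 - t0) / count

    print test_fixed(Obj())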



Thanks!

>
> Cheers,
>
> Carl Friedrich
> _______________________________________________
> pypy-dev at codespeak.net
> http://codespeak.net/mailman/listinfo/pypy-dev
>



-- 
Alex.


