[pypy-dev] array performance?

Fri Jul 2 14:08:35 CEST 2010

On Fri, Jul 2, 2010 at 10:55,  <Ben.Young at sungard.com> wrote:
>> On Fri, Jul 2, 2010 at 2:26 AM,  <Ben.Young at sungard.com> wrote:
>> >> On Fri, Jul 2, 2010 at 1:47 AM, Paolo Giarrusso <p.giarrusso at gmail.com> wrote:
>> >> > On Fri, Jul 2, 2010 at 08:04, Maciej Fijalkowski <fijall at gmail.com> wrote:
>> >> >> On Thu, Jul 1, 2010 at 1:18 PM, Hakan Ardo <Hakan at ardoe.net> wrote:
>> >> >>> OK, so making an interpreter-level implementation of array.array
>> >> >>> seems like a good idea. Would it be possible to get the JIT to
>> >> >>> remove the wrapping/unwrapping in that case, to get better
>> >> >>> performance than _rawffi.Array('d'), which is already an
>> >> >>> interpreter-level implementation?
>> >> >>
>> >> >> it should work mostly out of the box (you can also try this for
>> >> >> the _rawffi.array part of the module, if you want to). It's
>> >> >> probably enough to enable the module in
>> >> >> pypy/module/pypyjit/policy.py so the JIT can have a look there. In
>> >> >> the case of _rawffi, a couple of hints telling the JIT not to look
>> >> >> inside some functions (which do external calls, for example) will
>> >> >> probably also be needed, since the JIT as of now does not support
>> >> >> raw mallocs (using C malloc and not our GC).
>> >> >
>> >> >> Still, making an
>> >> >> array module interp-level is probably the sanest approach.
>> >> >
>> >> > That might be a bad sign.
>> >> > For CPython, people recommend to write extensions in C for
>> >> > performance, i.e. to make them less maintainable and understandable
>> >> > for performance.
>> >> > A good JIT should make this unnecessary in as many cases as
>> >> > possible.
>> >> > Of course, the array module might be an exception, if it's a single
>> >> > case.
>> >> > But performance 20x slower than C, with a JIT, is a big warning,
>> >> > since fast interpreters are documented to be (in general) just 10x
>> >> > slower than C.
>> >>
>> >> There are a lot of unsupported claims in your sentences, however,
>> >> that's not my point.
>> >>
>> >
>> > That's a little harsh. When the JIT was originally developed it was
>> > envisaged that it would be faster to re-write code to app level to
>> > give speed-ups. If that's changed that's fine, but it's not an
>> > "unsupported claim"
>> >
>> > Ben
>> >
>>
>> Unsupported claim is for example that fast interpreters are 10x slower
>> than C.
That's the only unsupported claim, but it comes from "The Structure
and Performance of Efficient Interpreters". I studied that paper as a
student working on VMs and you are writing one, so I (unconsciously)
assumed that everybody knows it. I know that's a completely broken way
of writing, but I didn't spot it.

>> On what exactly? Did he write this particular benchmark in C
>> and in a fast interpreter to compare? Another unsupported claim is that
>> JIT is 20x slower than C here.

I did not claim that: I am aware that it is not even JITted. My
complaint is about the lack of JITting.

>> Array module is not even JITted,
>> because it's based on _rawffi which itself operates on low-level
>> pointers which JIT does not want to deal with.

I would say that instead of adding manual annotations or rewriting
modules at the interp-level (which doesn't scale), it would be simpler
overall to teach the JIT itself how to deal with those calls (i.e.
inline everything around them and leave the external call as a call),
once and for all. What you suggest below might be a way to do it.

>> That's exactly the
>> reason why JIT doesn't look into _rawffi module and making it look
>> there doesn't sound like a good idea (instead, we're trying to replace
>> it with something JIT-friendly that knows how to do FFI calls into C,
>> there is a summer of code project).

Well, at the level of abstraction I'm speaking at, it sounds like in
the end the JIT will be able to do what is needed. I am not aware of
the details. But then, at the end of that project, it seems to me that
it should be possible to write the array module in pure Python using
this new FFI interface and have the JIT look at it, shouldn't it?
I am not concentrating on array specifically - rewriting a few modules
at interpreter level is fine. But as a Python developer I should have
no need for that.
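
To make the idea concrete, here is a minimal sketch of an app-level
array of doubles. It uses ctypes purely as a stand-in for the new FFI
interface, whose actual API I don't know; DoubleArray and total are
hypothetical names chosen for illustration, not part of any existing
module.

```python
import ctypes


class DoubleArray(object):
    """App-level, fixed-size array of C doubles (illustrative only)."""

    def __init__(self, values):
        values = list(values)
        # (ctypes.c_double * n) builds a C array type holding n doubles;
        # calling it with the values allocates and initialises the buffer.
        self._buf = (ctypes.c_double * len(values))(*values)

    def __len__(self):
        return len(self._buf)

    def __getitem__(self, index):
        # ctypes converts the stored C double back to a Python float here.
        return self._buf[index]

    def __setitem__(self, index, value):
        # ctypes converts the Python float to a C double here.
        self._buf[index] = value


def total(arr):
    """Sum the elements with a plain Python loop over indices."""
    s = 0.0
    for i in range(len(arr)):
        s += arr[i]
    return s
```

The conversions in __getitem__ and __setitem__ are the
wrapping/unwrapping that one would hope a JIT could remove from a loop
such as the one in total, if it were able to look inside the FFI layer.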

>> All I'm trying to say is that there are valid reasons that array
>> module should be on interpreter level and none of this has anything to
>> do with incapabilities of the JIT.

> Fair enough, and I do see your point, but I think Paolo's comment was not aimed at array, just the implication (in this case) that to get performance you need to re-write in rpython. I think his point in general is correct, even if he picked the wrong example to mention it :) (and his 20x claim comes from the original email, so I don't think it's entirely unsupported)

Thanks for understanding my point. I'm unsure whether an ideal JIT
could allow leaving array at the app-level (and I also noted in the
original mail that I was unsure about this).

> Of course in this case I'm sure there are good reasons, but it is certainly interesting to see the push towards more rpython code than app-level. I guess that's because the JIT can "see" and accelerate rpython code too, so it's win-win (because of the code size issues and things like that)

> Incidentally, is there a reason that geninterped code is so bloated compared to rpython code that looks like it could have been generated from the app-level equivalent? Would there be a way of annotating the app-level code so that when it's geninterped it's as tight as the equivalent rpython?

-- 
Paolo Giarrusso - Ph.D. Student
http://www.informatik.uni-marburg.de/~pgiarrusso/