[pypy-dev] An idea about automatic parallelization in PyPy/RPython

Fri Nov 21 07:36:54 CET 2014

You get traces by running PYPYLOG=jit-log-opt,jit-backend:<filename> pypy ....
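
As a minimal sketch of the invocation above (not an official PyPy helper), the same PYPYLOG setting can be built from Python and passed to a child process. The function names `pypylog_env` and `run_with_jit_log`, the log file name `trace.log`, and the assumption that a `pypy` executable is on the PATH are all illustrative; only the PYPYLOG categories come from the command above.

```python
import os
import subprocess


def pypylog_env(logfile, categories=("jit-log-opt", "jit-backend"), base_env=None):
    """Return a copy of the environment with PYPYLOG set to
    '<cat1>,<cat2>:<logfile>', the form used in the command above."""
    env = dict(os.environ if base_env is None else base_env)
    env["PYPYLOG"] = "%s:%s" % (",".join(categories), logfile)
    return env


def run_with_jit_log(script, logfile, pypy="pypy"):
    """Run `script` under PyPy with JIT logging enabled.

    Assumes an executable named `pypy` is on the PATH (hypothetical default);
    returns the exit code of the child process."""
    return subprocess.call([pypy, script], env=pypylog_env(logfile))


if __name__ == "__main__":
    # Only show the variable that would be set; PyPy is not started here.
    print(pypylog_env("trace.log", base_env={})["PYPYLOG"])
```

Calling `run_with_jit_log("myscript.py", "trace.log")` (both names hypothetical) would then leave the traces in `trace.log`, ready to be inspected.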

There is a tool called jitviewer for viewing those traces. OpenCL code
is likely just written in C, and the kernel itself does not contain any
Python.

On Fri, Nov 21, 2014 at 3:17 AM, 黄若尘 <hrc706 at gmail.com> wrote:
> Hi Fijalkowski,
>
> Thank you very much for your reply.
>
> Yes, you are right, it’s too hard for me to implement automatic parallelization for the whole of PyPy’s tracing JIT. I think I can first do some work with a very simple interpreter (for example, the example interpreter introduced in the PyPy documentation) and try to change some behaviors of the RPython JIT.
>
> By the way, could you tell me how I can get the traces and handle them before they are compiled to native code? I just want to try to convert some of the traces to OpenCL kernel code and run them on other devices such as a GPU.
>
> Best Regards,
> Huang Ruochen
>
>> On Nov 21, 2014, at 12:05 AM, Maciej Fijalkowski <fijall at gmail.com> wrote:
>>
>> Hi 黄若尘
>>
>> This is generally a hard problem, one on which projects like GCC or
>> LLVM didn't get very far. The problem is slightly more advanced with
>> PyPy's JIT, but not much more.
>>
>> However, the problem is that while you can do it for simple loops, the
>> applications are limited outside of pure numerics (e.g. numpy). Also,
>> doing SSE stuff in such cases first seems like both a good
>> starting point and a small enough project for a master's thesis.
>>
>> Cheers,
>> fijal
>>
>> On Tue, Nov 18, 2014 at 3:46 AM, 黄若尘 <hrc706 at gmail.com> wrote:
>>> Hi everyone,
>>>
>>>   I’m a master’s student in Japan and I want to do some research on PyPy/RPython.
>>>   I have read some papers about PyPy and I also have some ideas about it.  I have communicated with Mr. Bolz and been advised to send my question here.
>>>
>>>   Actually, I wonder if it is possible to automatically parallelize the traces generated by the JIT: that is, check whether a hot loop is a parallel loop and, if so, try to run the trace in parallel on a multi-core CPU or a GPU to make it faster.
>>>   I think it may be suitable because:
>>>   1. The tracing JIT targets loops, which map straightforwardly to parallel computation.
>>>   2. There is no control flow in a trace, which suits fragment programs on a GPU.
>>>   3. We may use the @elidable hint in the interpreter code, since elidable functions are insensitive to execution order and so can be executed in parallel.
>>>
>>>   What do you think about it?
>>>
>>> Best Regards,
>>> Huang Ruochen
>>> _______________________________________________
>>> pypy-dev mailing list
>>> pypy-dev at python.org
>>> https://mail.python.org/mailman/listinfo/pypy-dev
>