[Numpy-discussion] Looking for people interested in helping with Python compiler to LLVM

xavier.gnata at gmail.com
Sun Mar 11 12:22:17 EDT 2012


On 03/11/2012 03:52 PM, Pauli Virtanen wrote:
> 11.03.2012 15:12, xavier.gnata at gmail.com wrote:
> [clip]
>> If this description is correct, Numba is an additional pass run once the
>> CPython bytecode has been produced by CPython.
>> Is that correct?
>> Is Python bytecode a good intermediate representation for performing
>> numpy-related optimizations?
>>
>> One current big issue with numpy is that C=A+B+D produces temporaries.
>> numexpr addresses this issue and it would be great to get the same
>> result by default in numpy.
>> numexpr also optimizes polynomials using Horner's method. It is hard to
>> do that at the bytecode level, isn't it?
> My impression is that dealing with Python's bytecode is not necessarily
> significantly harder than dealing with the AST.
>
> Your example reads
>
>    1           0 LOAD_NAME                0 (A)
>                3 LOAD_NAME                1 (B)
>                6 BINARY_ADD
>                7 LOAD_NAME                2 (D)
>               10 BINARY_ADD
>               11 STORE_NAME               3 (C)
>
> For instance, interpreting the bytecode (e.g. loop body) once with dummy
> objects lets you know what the final compound expression is.

ok.
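
To make that concrete (for the record), here is a toy sketch of the
proxy-object idea, with made-up names: the proxies just record the expression
instead of evaluating it eagerly, then hand the whole thing to numexpr, so
C = A + B + D is computed in a single fused loop without the A + B temporary.

import numpy as np
import numexpr as ne

class Deferred(object):
    """Records arithmetic as an expression string instead of computing it."""
    def __init__(self, expr, namespace):
        self.expr = expr
        self.namespace = namespace

    def __add__(self, other):
        # Merge the operand namespaces and grow the expression string.
        ns = dict(self.namespace)
        ns.update(other.namespace)
        return Deferred("(%s + %s)" % (self.expr, other.expr), ns)

    def evaluate(self):
        # One pass over the operands; numexpr compiles the whole expression.
        return ne.evaluate(self.expr, local_dict=self.namespace)

A = np.random.rand(1000000)
B = np.random.rand(1000000)
D = np.random.rand(1000000)

a = Deferred("A", {"A": A})
b = Deferred("B", {"B": B})
d = Deferred("D", {"D": D})

C = (a + b + d).evaluate()
assert np.allclose(C, A + B + D)

A real compiler would of course work on the bytecode or the AST rather than
on proxies, but the information it needs to recover is the same.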
>
>> Unladen Swallow wanted to replace the full CPython implementation with a
>> JIT compiler built using LLVM... but Unladen Swallow is dead.
> To get speed gains, you need to optimize not only the bytecode
> interpreter side, but also the object space --- Python classes, strings
> and all that. Keeping in mind Python's dynamism, there are potential
> side effects everywhere. I guess this is what sunk the swallow.
Correct: the impact of Python's dynamism was underestimated, plus unexpected
bugs in LLVM --> dead swallow.

> Just speeding up effectively statically typed code dealing with arrays
> and basic types, on the other hand, sounds much easier.
It is :)
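
To be concrete about what "effectively statically typed" means here (just an
illustration, not tied to any particular tool): one dtype, plain loops, no
Python objects in the inner loop.

import numpy as np

def smooth(x):
    # x is assumed to be a 1-D float64 array; once the types are known,
    # every operation in the loop maps to straightforward machine code.
    out = np.empty_like(x)
    out[0] = x[0]
    out[-1] = x[-1]
    for i in range(1, len(x) - 1):
        out[i] = 0.25 * x[i - 1] + 0.5 * x[i] + 0.25 * x[i + 1]
    return out

y = smooth(np.linspace(0.0, 1.0, 1000))

Under plain CPython that loop is slow, but with the types pinned down a
compiler has everything it needs to emit a tight C-like loop.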

> The PyPy guys have a much more ambitious approach, and are getting nice
> results. Also with arrays --- as I understand, the fact that they want
> to be able to do this sort of optimization is the main reason why they
> want to reimplement the core parts of Numpy in RPython.
It sounds OK for numpy, but it makes little sense on its own, because numpy
is only the beginning of the story.
Numpy is not *that* useful without scipy, matplotlib and so on.

> The second issue is that unfortunately their emulation of CPython's
> C-API at the moment seems to have quite large overheads. Porting
> Numpy on top of that is possible --- I managed to get basic things
> (apart from string/unicode arrays) to work, but things took very large
> speed hits (of the order ~ 100x for things like `arange(10000).sum()`).
> This pushes the speed advantage of Numpy out to rather large array sizes.
> The reason is probably that Numpy makes heavy internal use of PyObjects,
> which accumulates the cost of passing objects through the emulation layer.
Yes. We have two types of users:
1) People using Python without numpy. They want the Python core
language to go faster (with the full complexity of a dynamic language).
2) Science users. They are fine with Python's speed because their
code spends 99% of its time in
numpy/scipy/any_other_science_module_using_the_C-API. Who cares if it takes
1.1 s (in Python) instead of 1.0 s (in C) to read the input data file, when
it takes hours (or even minutes) to process it (using the C-API, doing the
basics with temporaries and unoptimized loops)?
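
For what it's worth, a rough way to get numbers of that kind is simply to run
the same snippet under each interpreter and compare. The ~100x figure above
is Pauli's measurement, not something this snippet produces by itself:

import timeit

n = 10000
t = timeit.timeit("np.arange(10000).sum()",
                  setup="import numpy as np",
                  number=n)
print("arange(10000).sum(): %.1f us per call" % (t / n * 1e6))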


