[Python-ideas] New __reference__ hook

Amaury Forgeot d'Arc amauryfa at gmail.com
Wed Dec 5 21:59:23 CET 2012


On 5 December 2012 18:09, Sturla Molden <sturla at molden.no> wrote:
>
> On 5 Dec. 2012 at 19:51, Masklinn <masklinn at masklinn.net> wrote:
>
> > Why? z could just be a "lazy value" at this point, basically a manual
> > building of thunks, only reifying them when necessary (whenever that
> > is). It's not like numpy *has* to create three temporary arrays, just
> > that it *does*.
>
> It has to, because it does not know when to flush an expression. This,
> strangely enough, accounts for most of the speed difference between
> Python/NumPy and e.g. Fortran 95. A Fortran 95 compiler can compile an
> array expression as a single loop. NumPy cannot, because the binary
> operators do not tell it when an expression is "finalized". That is why
> the numexpr JIT compiler evaluates Python expressions as strings, and
> needs to include a parser and whatnot. Today, most numerical code is
> memory bound, not compute bound, as CPUs are immensely faster than RAM.
> So what keeps numerical/scientific code written in Python slower than C
> or Fortran today is mostly the creation of temporary array objects (i.e.
> memory access), not the computations per se. If we could get rid of
> temporary arrays, Python code could possibly achieve 80 % of Fortran 95
> speed. For scientists that would mean we don't need to write any more
> Fortran or C.
>
> But perhaps it is possible to do this with AST magic? I don't know. Nor
> do I know if __bind__ is the best way to do this. Perhaps not. But I do
> know that automatically detecting when to "flush a compound expression
> with (NumPy?) arrays" would be the holy grail for scientific computing
> with Python. A binary operator x+y would just return a symbolic
> representation of the expression, but when the full expression needs to
> be flushed we can e.g. ask OpenCL or LLVM to generate the code on the
> fly. It would turn numerical computing into something similar to dynamic
> HTML. And we know how good Python is at generating structured text on
> the fly.
>

FYI, the numpy module shipped with PyPy does exactly this: the operations
are recorded in an AST-like structure, which is evaluated only when the
first item of the array is read.
This is completely transparent to the user and to other parts of the
interpreter.
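
To make the idea concrete, here is a toy sketch in plain Python. It is
not PyPy's implementation: the Lazy class and all of its method names are
hypothetical, made up for this illustration, and it works on lists
instead of real arrays. Binary operators only record a node; the first
item read walks the recorded tree once per element, so no intermediate
array is built for x * y.

```python
import operator


class Lazy:
    """Hypothetical lazy array: records operations, computes on first read."""

    def __init__(self, op=None, args=(), data=None):
        self.op = op          # binary function for an expression node
        self.args = args      # (left, right) operands, both Lazy
        self._data = data     # concrete values, or None while unevaluated

    @classmethod
    def from_list(cls, values):
        return cls(data=list(values))

    # Operators only build a node; no arithmetic happens here.
    def __add__(self, other):
        return Lazy(operator.add, (self, other))

    def __mul__(self, other):
        return Lazy(operator.mul, (self, other))

    def __len__(self):
        if self._data is not None:
            return len(self._data)
        return len(self.args[0])

    def _item(self, i):
        # Compute one element by walking the expression tree,
        # without materializing any intermediate node.
        if self._data is not None:
            return self._data[i]
        left, right = self.args
        return self.op(left._item(i), right._item(i))

    def _force(self):
        # Evaluate the whole expression in a single pass over the indices.
        if self._data is None:
            self._data = [self._item(i) for i in range(len(self))]
            self.op, self.args = None, ()
        return self._data

    def __getitem__(self, i):
        return self._force()[i]


x = Lazy.from_list([1, 2, 3])
y = Lazy.from_list([10, 20, 30])
t = x * y          # recorded, not computed
z = t + x          # recorded, not computed
print(z[1])        # first read evaluates z in one loop: prints 42
```

After the read, z holds its values while t is still unevaluated, which is
the "no temporary" effect discussed above. A real implementation would
hand the recorded tree to a JIT instead of interpreting it element by
element.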

PyPy uses JIT techniques to generate machine code specialized for the
particular AST, and is typically 2x to 5x faster than NumPy, probably
because a lot of allocations and copies are avoided.

-- 
Amaury Forgeot d'Arc