[Python-ideas] Enabling access to the AST for Python code

David Wilson dw+python-ideas at hmmz.org
Fri May 22 05:02:10 CEST 2015


This sounds like a cool feature, though I'm not sure if exposing the AST
directly on the code object is the best approach.

Attaching the AST to the code object implies serializing (and
deserializing into nicely sparse heap allocations) it via .pyc
files, since code objects are marshalled there.

What about improving the parser so that exact start/end positions are
recorded for function bodies? This might be represented as 2 cheap
integers in RAM, allowing for a helper function in the compiler or
inspect modules (inspect.ast()?) to handle the grunt work.
Implementations like Micropython could just stub out those fields with
-1 or whatever else if desired.
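
To make the idea concrete, here is a rough sketch of what such a helper
might do. Everything in it is an assumption for illustration: there is no
inspect.ast() today, record_span() and ast_from_span() are made-up names,
and record_span() merely stands in for the parser by re-parsing the source
(a real implementation would store the two integers at compile time). It
uses line numbers rather than byte offsets to keep things short.

```python
import ast
import textwrap

SOURCE = '''\
def total(orders):
    return sum(o.price for o in orders)


def unrelated():
    pass
'''


def record_span(source, name):
    """Stand-in for the parser: return (first_line, last_line) of a def.

    Hypothetical: a real implementation would record these two integers
    on the code object at compile time instead of re-parsing.
    """
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.FunctionDef) and node.name == name:
            return node.lineno, node.end_lineno  # end_lineno needs 3.8+
    return -1, -1  # the "stubbed out" case


def ast_from_span(source, span):
    """Hypothetical helper: rebuild a function's AST from its span."""
    first, last = span
    if first < 0:
        raise LookupError("no source positions recorded")
    lines = source.splitlines(keepends=True)[first - 1:last]
    # dedent so that nested or method definitions parse on their own
    return ast.parse(textwrap.dedent("".join(lines))).body[0]


span = record_span(SOURCE, "total")
node = ast_from_span(SOURCE, span)
print(span, node.name)
```

Only the span has to live alongside the code object; the AST itself is
rebuilt on demand, and only for callers that actually ask for it.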

One upside to direct attachment would be that a function returned by
e.g. eval() with no underlying source file would still have its AST
attached, without the caller having to keep hold of the unparsed string,
but the downside of RAM/disk/potentially hefty deserialization
performance seems to outweigh that.
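
The eval() case can be illustrated with a small sketch. The "<generated>"
filename and the sample lambda are assumptions picked for the example; the
point is only that a position-based scheme has no source to look at here,
so the caller must keep the string around to get an AST.

```python
import ast
import inspect

SOURCE = "lambda c: c.total > 1000"

# Compile under a made-up filename so no file on disk backs the code object.
func = eval(compile(SOURCE, "<generated>", "eval"))

try:
    inspect.getsource(func)
    source_available = True
except OSError:
    # inspect cannot find any source text for "<generated>"
    source_available = False

# The caller can still get an AST, but only by holding on to SOURCE.
tree = ast.parse(SOURCE, mode="eval")
print(source_available, type(tree.body).__name__)
```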

I also wish there was a nicer way of introducing an expression that was
to be represented as an AST, but I think that would involve adding
another language keyword, and simply overloading the meaning of
generators slightly seems preferable to that. :)


David

On Thu, May 21, 2015 at 09:18:24PM -0400, Ben Hoyt wrote:
> Hi Python Ideas folks,
> 
> (I previously posted a similar message on Python-Dev, but it's a
> better fit for this list. See that thread here:
> https://mail.python.org/pipermail/python-dev/2015-May/140063.html)
> 
> Enabling access to the AST for compiled code would make some cool
> things possible (C# LINQ-style ORMs, for example), and not knowing too
> much about this part of Python internals, I'm wondering how possible
> and practical this would be.
> 
> Context: PonyORM (http://ponyorm.com/) allows you to write regular
> Python generator expressions like this:
> 
>     select(c for c in Customer if sum(c.orders.price) > 1000)
> 
> which compile into and run SQL like this:
> 
>     SELECT "c"."id"
>     FROM "Customer" "c"
>     LEFT JOIN "Order" "order-1" ON "c"."id" = "order-1"."customer"
>     GROUP BY "c"."id"
>     HAVING coalesce(SUM("order-1"."total_price"), 0) > 1000
> 
> I think the Pythonic syntax here is beautiful. But the tricks PonyORM
> has to go through to get it are ... not quite so beautiful. Because the
> AST is not available, PonyORM decompiles Python bytecode into an AST
> first, and then converts that to SQL. (More details on all that from the
> author's EuroPython talk at http://pyvideo.org/video/2968)
> 
> PonyORM needs the AST just for generator expressions and
> lambda functions, but obviously if this kind of AST access feature
> were in Python it'd probably be more general.
> 
> I believe C#'s LINQ provides something similar, where if you're
> developing a LINQ converter library (say LINQ to SQL), you essentially
> get the AST of the code ("expression tree") and the library can do
> what it wants with that.
> 
> (I know that there's the "ast" module and ast.parse(), which can give
> you an AST given a *source string*, but that's not very convenient
> here.)
> 
> What would it take to enable this kind of AST access in Python? Is it
> possible? Is it a good idea?
> 
> -Ben

