How does python build its AST

MonkeeSage MonkeeSage at gmail.com
Fri Dec 7 23:57:42 EST 2007


On Dec 7, 4:29 pm, "Terry Reedy" <tjre... at udel.edu> wrote:
> "MonkeeSage" <MonkeeS... at gmail.com> wrote in message
>
> news:79c1f3ea-aeeb-4607-b30d-48ad51b52996 at x69g2000hsx.googlegroups.com...
> |A quick question about how python parses a file into compiled
> | bytecode. Does it parse the whole file into AST first and then compile
> | the AST, or does it build and compile the AST on the fly as it reads
> | expressions? (If the former case, why can't functions be called before
> | their definitions?)
>
> The direct answer is that names cannot be entered into namespaces and bound
> to objects to be looked up until the corresponding object is created by
> executing the corresponding code.  Compiling Python code creates the
> internal code needed to create Python objects, but only exceptionally
> creates Python objects in the process.  In particular, compiling a function
> may create code objects (since the code is a constant) referenced by the
> function creation code, but not function objects themselves.
>
> A less direct answer is that Python is designed to be usable interactively.
> In CPython interactive mode, you enter and the interpreter compiles and
> executes one top(module)-level statement at a time.  Calling a function you
> have not yet entered would be magical.
>
> Terry Jan Reedy

Thanks for your replies Kay, Michael and Terry. To summarize my
understanding of your answers:

- Python (meaning CPython here) first does a parsing pass over the
entire file, with 2.5+ building a syntax tree and prior versions
building a parse tree.

- It then compiles the tree into bytecode, by walking down the nodes
recursively from the top down.

- When in interactive mode (e.g., python prompt), it builds the tree
and compiles it on the fly, one top-level statement at a time as each
is entered.
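
To make the first two points concrete, here is a rough sketch of the
two phases as they can be driven by hand. It assumes a recent Python 3,
where the standard `ast` module and the built-in `compile()` accept a
tree (not the 2.5 internals discussed above); the source string and the
variable names are made up for illustration.

```python
import ast
import types

source = "def greet():\n    return 'hi'\n"

# Phase 1: parse the whole source into a syntax tree; nothing runs yet.
tree = ast.parse(source)
first_node = type(tree.body[0]).__name__

# Phase 2: compile the tree into a module-level code object.
module_code = compile(tree, "<example>", "exec")

# The function body is already compiled and stored as a constant
# of the module code ...
nested = [c for c in module_code.co_consts if isinstance(c, types.CodeType)]

# ... but the function object, and the name bound to it, exist only
# after the module code is actually executed.
namespace = {}
defined_before = "greet" in namespace
exec(module_code, namespace)
defined_after = "greet" in namespace
```

This matches Terry's point as I read it: compiling produces a code
object for the function body, while the function object itself only
appears once the `def` statement runs.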

Is this correct? If so, may I pick your brain on two more points?

1.) What is the benefit of doing a two-phase compilation (parsing/
compiling), rather than a single, joint parse + compile phase (as in
interactive mode)?

2.) Wouldn't it be possible on the parsing phase to "tag" names as
valid, even if they occur prior to the assignment of the name, if on a
later branch that assignment is found (and have the compiler be aware
of such tags)?

The reason I'm wondering about these things is that on a different
group, it came up that perl allows referencing before assignment,
which seems to require a two-phase compilation, which made me wonder
how python does things. And since (if I haven't misunderstood), python
does use two-phase compilation, I just wondered if it would be
possible to do what perl does. I'm not advocating it as a feature for
python (it seems bass-ackwards to me to reference a name before it's
assigned to in the script), this is just curiosity.
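
For reference, here is a small sketch of the behaviour I mean, assuming
a recent Python 3. The helper `run` and the two source strings are
hypothetical, just for illustration. Both snippets compile without
complaint; the difference only shows up when the code is executed,
because the name is looked up at run time.

```python
def run(source):
    """Compile and execute source in a fresh namespace.

    Returns the exception class name if a NameError is raised, else None.
    """
    code = compile(source, "<example>", "exec")  # compiles in both cases
    try:
        exec(code, {})
    except NameError as exc:
        return type(exc).__name__
    return None

# Called at module level before the def statement has been executed.
early_call = "f()\ndef f():\n    return 1\n"

# Referenced inside another function body; the name f is looked up
# only when g() actually runs, by which time f has been bound.
late_lookup = (
    "def g():\n"
    "    return f()\n"
    "def f():\n"
    "    return 1\n"
    "result = g()\n"
)

early_result = run(early_call)
late_result = run(late_lookup)
```

So a name may appear textually before its definition, as long as the
lookup happens after the binding statement has run.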

Thanks,
Jordan


