How does python build its AST

Jason tenax.raccoon at gmail.com
Fri Dec 7 11:49:27 EST 2007


On Dec 7, 9:03 am, MonkeeSage <MonkeeS... at gmail.com> wrote:
> On Dec 7, 9:50 am, Kay Schluehr <kay.schlu... at gmx.net> wrote:
>
>
>
> > On Dec 7, 3:23 pm, MonkeeSage <MonkeeS... at gmail.com> wrote:
>
> > > A quick question about how python parses a file into compiled
> > > bytecode. Does it parse the whole file into AST first and then compile
> > > the AST, or does it build and compile the AST on the fly as it reads
> > > expressions? (If the former case, why can't functions be called before
> > > their definitions?)
>
> > > Thanks,
> > > Jordan
>
> > Python uses a highly optimized table based LL(1) parser to create a
> > syntax tree. In Python 2.5 it transforms the concrete syntax tree
> > ( CST ) into an AST before compilation. Before that it compiled the
> > CST directly. I'm not sure what you are asking for ( in parentheses )?
> > Parser actions or preprocessing the tree? The latter is definitely
> > possible and you can build your own compilation machinery using the
> > parser module and the compile function.
>
> > Kay
>
> Thanks for your reply. You answered my main question. The secondary
> question is why is it a NameError to try to use a variable/function
> prior to the declaration in a source file, since python has already
> seen the declaration on the first pass building the CST/AST? At
> compile time, shouldn't it already know about it? (Forgive my
> ignorance.)
>
> Regards,
> Jordan

Remember that Python is a highly dynamic language.  You can't
guarantee that a name will be bound until execution actually reaches
the point where you access it.  For example:

>>> Hello()  # What should this call?
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
NameError: name 'Hello' is not defined
>>> # The standard hello function
>>> def Hello(): print 'Hello!'
...
>>> Hello()
Hello!
>>>
>>> # Modification through the globals dictionary
>>> def Vikings(): print 'Spam, spam, spam! Spam!'
...
>>> globals()[ 'olleH'[::-1] ] = Vikings
>>> Hello()
Spam, spam, spam! Spam!
>>>
>>> # Another dictionary-based modification
>>> globals()['Hello'] = lambda: 'Go away!'
>>> Hello()
'Go away!'
>>>
>>> # Remove the syntactic sugar and make a function object directly
>>> import new
>>> import compiler
>>> new_code = compiler.compile( 'print "Die in a fire!"', 'Hateful', 'single')
>>> Hello = new.function( new_code, {}, 'Angry')
>>> Hello()
Die in a fire!
>>>
>>> # A callable object (not a function!)
>>> class AnnoyingNeighbor(object):
...     def __call__(self):
...         print 'Hi-diddly-ho, neighbor!'
...
>>> Hello = AnnoyingNeighbor()
>>> Hello()
Hi-diddly-ho, neighbor!
>>>

If this were in a file, which version of Hello should the first call
go to?  Should the 'Hello' name be bound to the function, the callable
class instance, or the result of new.function?  What if another
statement rebinds the name in between?

The point is that, in Python, functions are first-class objects, just
like classes and any other value.  The 'def' statement is something
that gets executed.  When it runs, it builds a function object from
the already-compiled code block, then binds that object to the name of
the function.
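
To make that concrete, here is a small sketch (the source string and
the name hello are made up for illustration).  The whole "file"
compiles without complaint, because compiling does not bind any names.
The NameError only appears when execution reaches the call, before the
def statement has had a chance to run:

```python
# A two-statement "file": the call comes before the def.
source = "hello()\ndef hello():\n    return 'Hello!'\n"

# Parsing and compiling the whole source succeeds; no names are bound yet.
code = compile(source, "<example>", "exec")

namespace = {}
error = None
try:
    exec(code, namespace)      # runs the statements in order
except NameError as exc:
    error = str(exc)           # raised by the call on the first line

print(error)
print("hello" in namespace)    # the def statement was never reached
```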

When Python hits a function call, it must look up the name in the
current namespace dictionary.  In all but the simplest cases, the name
may no longer refer to its previous value: any intervening calls or
statements could have rebound that name to another object, either
directly, through side-effects, or even through the C interface.  You
can't call a function before its def statement has executed any more
than you can print the value of a variable before you assign anything
to it.
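
A short sketch of that lookup rule (main and helper are hypothetical
names).  The body of main may mention helper before helper's def
appears in the file, because the name is only looked up when the call
actually runs.  Rebinding the name later changes what the same call
site reaches:

```python
def main():
    # 'helper' is looked up only when this line executes.
    return helper()

def helper():
    return "original helper"

first = main()                 # fine: both defs have run by now

helper = lambda: "rebound helper"   # rebind the name to a different object
second = main()                # same call site, different object

print(first)
print(second)
```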

Unlike in C, functions are not special in Python.  Unlike in C++,
classes aren't terribly special either.  They have syntactic sugar to
help the coding go down, but they are only Python objects, and are
bound by the same rules as all other Python objects.
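
As a rough illustration (greet, alias and Point are made-up names),
a function can be bound to a second name and unbound from the first
like any other value, and a simple class can be built by calling
type() directly instead of using the class statement:

```python
def greet():
    return "hi"

alias = greet                  # a function is a value; bind it to a second name
del greet                      # unbind the original name, like any variable

gone = False
try:
    greet()
except NameError:
    gone = True                # the name is gone, the object lives on via alias

# For a simple class, the class statement is roughly sugar for this call.
Point = type("Point", (object,), {"dims": 2})

print(alias())
print(gone)
print(Point.dims)
```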

  --Jason


