[Python-Dev] Python Grammar Ambiguity

Michael Foord michael.foord at resolversystems.com
Mon Apr 24 19:03:06 CEST 2006


(oops - should have gone to list)

Guido van Rossum wrote:
> Well, yes, the syntax is supposed to be something like "for varlist in
> testlist". Could you report this as a doc bug (if you found this
> information in the docs)?
>   

I think the documentation (which does put expression_list there)
reflects the current state of the parser.

First of all the grammar in SVN also has expression_list there *and* the
following does successfully parse to an ast (but fails when you compile
the ast) :

    [x for x + 1 in y]

All the best,


Michael Foord


> On 4/24/06, Michael Foord <fuzzyman at voidspace.org.uk> wrote:
>   
>> Hello all,
>>
>> I'm working on a parser for part of the Python language (expressions but
>> not statements basically). I'm using PLY to generate the parser and it's
>> mostly done.
>>
>> I've hit on what looks like a fundamental ambiguity in the Python
>> grammar which is difficult to get round with PLY; and I'm wondering
>> *why* the grammar is defined in this way. It's possible there is a
>> reason that I've missed, which means I need to rethink my workaround.
>>
>> List displays (list comprehensions) are defined as (from
>> http://docs.python.org/ref/lists.html ) :
>>
>>
>> test      ::=      and_test ( "or" and_test )* | lambda_form
>> testlist     ::=     test ( "," test )* [ "," ]
>> list_display     ::=     "[" [listmaker] "]"
>> listmaker     ::=     expression ( list_for | ( "," expression )* [","] )
>> list_iter     ::=     list_for | list_if
>> list_for     ::=     "for" expression_list "in" testlist [list_iter]
>> list_if     ::=     "if" test [list_iter]
>>
>> The problem is that list_for is defined as :
>>
>>     "for" expression_list "in" testlist
>>
>> This allows arbitrary expressions in the 'assignment' part of a list
>> comprehension.
>>
>> As a result, the following is valid syntax according to the grammar :
>>
>>     [x for x + 1 in y]
>>
>> Obviously it isn't valid ! This parses to an ast, but the syntax error
>> is thrown when you compile the resulting ast.
>>
>> The problem is that for the basic case of a list comprehension ( ``[x
>> for x in y]``), ``x in y`` is a valid expression. That makes it
>> extremely hard to disambiguate the grammar so that the ``in`` is treated
>> correctly, and not part of an expression.
>>
>> My question is, why are arbitrary expressions allowed here in the
>> grammar ? As far as I can tell, only identifiers (nested in parentheses
>> or brackets) are valid here. I've got round the problem by creating a
>> new node 'identifier_list' and just having that return the expected
>> syntax tree (actually an expression list). This gets round the ambiguity
>> [#]_.
>>
>> It worries me that there might be a valid expression allowed here that I
>> haven't thought of. My current rules allow anything that looks like
>> ``(a, [b, c, (d, e)], f)`` - any  nested identifier list. Would anything
>> else be allowed ?
>>
>> If not, why not modify the grammar so that the compiler has less
>> possible invalid syntax trees to work with ?
>>
>> (Also the grammar definition of string conversion is wrong as it states
>> that a trailing comma is valid, which isn't the case. As far as I can
>> tell that is necessary to allow nesting string conversions.)
>>
>> Fuzzyman
>> http://www.voidspace.org.uk/python/index.shtml
>>
>>
>> .. [#] If I could make precedence work in PLY I could also solve it I
>> guess. However I can't. :-)
>>
>> _______________________________________________
>> Python-Dev mailing list
>> Python-Dev at python.org
>> http://mail.python.org/mailman/listinfo/python-dev
>> Unsubscribe: http://mail.python.org/mailman/options/python-dev/guido%40python.org
>>
>>     
>
>
> --
> --Guido van Rossum (home page: http://www.python.org/~guido/)
>
>   




More information about the Python-Dev mailing list