Python's parser.

William Tanksley wtanksle at dolphin.openprojects.net
Wed May 10 14:29:57 EDT 2000


On 10 May 2000 18:53:22 +0100, Michael Hudson wrote:
>I thought I might waste some time by having a crack at implementing +=
>and friends in Python.  I think I know what I want to do on the
>codegen side, but at the moment that hurdle looks some way off.

Sounds like fun.

>How much effort is involved to get Python to accept a '*=' token?

On the code side, you seem to be developing an accurate idea.  On the
social side, you can either rework all of Python's semantics so that +=
and so on make consistent sense, or you can kill Guido and use his time
machine to take his place (careful!  Killing people who have a time
machine is VERY unlikely to work!).

I love this newsgroup -- violence is a solution to every problem.

>I've 

>1) Added a line 
>aexpr_stmt: testlist '*=' testlist
>at the end of Grammar/Grammar, and changed

I don't know Python's grammar, but in the ones I've used you would have to
use a symbolic token name, such as TIMESEQUAL, instead of the literal '*='.

>small_stmt: expr_stmt | print_stmt  | del_stmt | pass_stmt 
>          | flow_stmt | import_stmt | global_stmt | exec_stmt 
>          | assert_stmt

>to

>small_stmt: expr_stmt | aexpr_stmt | print_stmt | del_stmt | pass_stmt
>          | flow_stmt | import_stmt | global_stmt | exec_stmt 
>          | assert_stmt

This looks okay in and of itself, although I don't have the source code to
check.  One warning light is staying on, though: I'm surprised that you
didn't place your assignment statement alongside the other assignment
statement.

>2) changed

>#define TIMESEQUAL      37
>/* Don't forget to update the table _PyParser_TokenNames in tokenizer.c! */
>#define OP              38

>3) Added "TIMESEQUAL" to _PyParser_TokenNames in tokenizer.c

Good.  As I mentioned in #1, I'm not sure whether you're using the token
name correctly (but again, I'm only familiar with yacc).
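
To make the point of the "Don't forget to update the table" comment concrete: the token numbers and the name table are two hand-maintained lists that have to stay in step. Here is a minimal sketch of that arrangement. The token numbers, the TOK_ prefix, and the helper tok_name are all made up for illustration and are not copied from the Python source.

```c
#include <string.h>

/* Hypothetical token numbers, for illustration only. */
enum {
    TOK_GREATEREQUAL = 0,
    TOK_RIGHTSHIFT   = 1,
    TOK_TIMESEQUAL   = 2,   /* the newly added token */
    TOK_OP           = 3,
    TOK_COUNT        = 4    /* one past the last token number */
};

/* Name table: entry i must name the token whose number is i. */
static const char *const tok_names[] = {
    "GREATEREQUAL",
    "RIGHTSHIFT",
    "TIMESEQUAL",
    "OP",
};

/* Compile-time check: the array size is negative (an error) if the
   table and the token count ever disagree in length. */
typedef char tok_names_in_sync[
    (sizeof tok_names / sizeof tok_names[0] == TOK_COUNT) ? 1 : -1];

/* Look up a token's name, guarding against out-of-range numbers. */
const char *tok_name(int tok)
{
    if (tok < 0 || tok >= TOK_COUNT)
        return "<unknown>";
    return tok_names[tok];
}
```

The length check only catches a missing or extra entry; a name inserted at the wrong position would still compile, so the ordering has to be checked by eye.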

>4) Changed 

>                switch (c2) {
>                case '=':       return GREATEREQUAL;
>                case '>':       return RIGHTSHIFT;
>                }
>                break;

>in Parser/tokenizer.c:PyToken_TwoChars to:

>                switch (c2) {
>                case '=':       return GREATEREQUAL;
>                case '>':       return RIGHTSHIFT;
>                }
>                break;

You're leaving something out here -- these two snippets of code appear
identical.  This is also not the right place to make this modification,
although it's right nearby.
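
For what it's worth, here is a rough sketch of the shape I would expect, assuming the function switches on the first character and then on the second, as the quoted snippet suggests. The snippet you quoted sits under the '>' case; a '*=' token would need its own case under '*'. The token numbers and the name two_chars are invented for illustration and are not taken from tokenizer.c.

```c
/* Hypothetical token numbers; TOK_OP means "no two-character token matched". */
enum {
    TOK_OP = 0,
    TOK_GREATEREQUAL,
    TOK_RIGHTSHIFT,
    TOK_TIMESEQUAL
};

/* Map a pair of characters to a token number, or TOK_OP if the
   pair is not a recognized two-character operator. */
int two_chars(int c1, int c2)
{
    switch (c1) {
    case '>':
        switch (c2) {
        case '=':       return TOK_GREATEREQUAL;
        case '>':       return TOK_RIGHTSHIFT;
        }
        break;
    case '*':
        switch (c2) {
        case '=':       return TOK_TIMESEQUAL;  /* the new '*=' token */
        }
        break;
    }
    return TOK_OP;
}
```

With this, two_chars('*', '=') yields TOK_TIMESEQUAL, while the '>' cases behave exactly as before.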

>Help?  Anyone?  This can't be *that* hard, can it?

It is.  Easily.  This is why the canonical book on parsing has a picture
of a dragon on the cover.  Good luck.

>Michael 

-- 
-William "Billy" Tanksley


