Where regexs listed for Python language's tokenizer/lexer?

Miles Kaufmann milesck at umich.edu
Sat Sep 12 02:41:42 EDT 2009


On Sep 11, 2009, at 11:10 PM, Chris Seberino wrote:

> Where regexs listed for Python language's tokenizer/lexer?
>
> If I'm not mistaken, the grammar is not sufficient to specify the
> language....
> you also need to specify the regexs that define the tokens
> right?..where is that?

The Python tokenization process is described here:

http://docs.python.org/reference/lexical_analysis.html

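The tokenizer as a whole can't be expressed purely in terms of regular
expressions, because the language it recognizes is non-regular: it has
to match nested parentheses, brackets, and braces, and keep track of the
indentation level. Individual tokens such as identifiers and numbers do
get regular-style definitions on that page, though.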
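
To see that state-tracking in action, here is a minimal sketch using the
standard library's tokenize module. It assumes Python 3; the helper name
token_names and the two sample snippets are made up for illustration. The
same "\n    " sequence produces an INDENT token after a colon, but only
an NL token (and no INDENT) inside an open parenthesis, which a single
regular expression per token could not decide on its own.

```python
import io
import tokenize


def token_names(source):
    """Return the token type names the stdlib tokenizer emits for source."""
    # generate_tokens() expects a readline callable that returns str lines.
    readline = io.StringIO(source).readline
    return [tokenize.tok_name[tok.type]
            for tok in tokenize.generate_tokens(readline)]


# A newline followed by deeper indentation opens a block here...
indented = "if x:\n    y = 1\n"
# ...but inside an open parenthesis the same whitespace is ignored.
bracketed = "z = (1,\n     2)\n"

indent_tokens = token_names(indented)
bracket_tokens = token_names(bracketed)

print(indent_tokens)
print(bracket_tokens)
```

The first list contains INDENT and DEDENT tokens; the second contains
neither, because the tokenizer remembers that a parenthesis is still open.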

-Miles