Letter class in re

Chris Angelico rosuav at gmail.com
Mon Mar 9 10:39:30 EDT 2015


On Tue, Mar 10, 2015 at 1:34 AM, Antoon Pardon
<antoon.pardon at rece.vub.ac.be> wrote:
>> There is str.isidentifier, which returns True if something is a valid
>> identifier name:
>>
>> >>> '℮'.isidentifier()
>> True
>
> Which is not very usefull in a context of lexical analysis. I don't need to know
> if a particular string is useful as an identifier, I want to know which parts of
> a text are identifiers.

If you're doing lexical analysis, you probably want a lexer. For
Python, I would recommend parsing to AST and doing your analysis on
that; I've had pretty good success doing that, and then using the
line/column info to go back to the original text if I need it. A regex
is probably not going to be sufficient for that kind of work.

What exactly are you trying to accomplish here? More info would guide
the recommendations, obviously.

ChrisA



More information about the Python-list mailing list