Letter class in re

Antoon Pardon antoon.pardon at rece.vub.ac.be
Tue Mar 10 04:16:43 EDT 2015

Op 09-03-15 om 16:17 schreef Tim Chase:
> On 2015-03-09 15:29, Antoon Pardon wrote:
>> Op 09-03-15 om 13:50 schreef Tim Chase:
>>>>   (?:(?!_|\d)\w)\w+
>>> If you don't have to treat it as an atom, you can simplify that to
>>> just
>>>   (?!_|\d)\w+
>>> which just means that the first character can't be an underscore
>>> or digit.
>>> Though for a Py3 identifier, the underscore is acceptable as a
>>> first character ("__init__"), so you can simplify it even further
>>> to just
>>>   (?!\d)\w+
>> No, that doesn't work. To begin with, my attempt above should have
>> been:
>>     (?:(?!_|\d)\w)\w*
> Did you actually test my suggestion?  The "(?!\d)\w+" means "one or
> more Word characters, but the first one can't be a digit" because
> the "(?!...)" is zero-width. This should match single-character
> strings including a single underscore.

I had done some tests, but due to a misunderstanding I broke off testing
prematurely. I didn't grasp the lookahead nature of the (?! combination
and saw it just as a negation of the regular expression involved.

But IIUC the (?!\d) will check that the next character is not a digit
without advancing the position in the string. So the later check for
\w+ happens as if (?!\d) hadn't been present. So in effect the same part
of the string is checked against two sub regular expressions.
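
To check that reading, here is a small sketch using the two patterns from
this thread. The names ident, letter_first, matches and samples are only
made up for the illustration, and I use fullmatch so that each whole string
has to be accepted.

```python
import re

# Tim's suggestion: one or more word characters, the first not a digit.
ident = re.compile(r'(?!\d)\w+')
# My corrected attempt: first character is neither underscore nor digit.
letter_first = re.compile(r'(?:(?!_|\d)\w)\w*')


def matches(pattern, text):
    """Return True if the whole of text is matched by pattern."""
    return pattern.fullmatch(text) is not None


samples = ['x', '_', '__init__', 'a1', '1a', '9']
for s in samples:
    print(f'{s!r:12} {matches(ident, s)!s:6} {matches(letter_first, s)}')

# The lookahead consumes nothing, so the match still starts at position 0
# and \w+ sees the very same first character that (?!\d) just inspected.
m = ident.match('abc')
print(m.span())
```

If I have it right, both patterns reject '1a' and '9', and only the second
one also rejects '_' and '__init__'.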

Antoon Pardon 
