why are re group names so restrictive?

Tim Peters tim.one at comcast.net
Fri May 9 16:17:55 EDT 2003


[Skip Montanaro]
> Why does re restrict the characters in group names to be Python
> identifiers?  They seem to only be used where strings are allowed,
> thus the character set should only exclude ">" and ")" (and possibly
> "<" and "(" for symmetry).

There you go -- the current rule saves argument about exactly which
characters shouldn't be allowed.  Since lots of non-alphanumeric characters
in regexps have meta-meanings, there are lots of possibilities for
confusion.
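
A minimal sketch of the current rule in action, assuming a Python 3
interpreter where a rejected pattern raises re.error. The helper name
compiles() is made up for illustration and is not part of re.

```python
import re

def compiles(pattern):
    """Return True if re accepts the pattern, False if it rejects it."""
    try:
        re.compile(pattern)
        return True
    except re.error:
        return False

# An identifier-style group name is accepted.
print(compiles(r"(?P<first_name>\w+)"))

# A hyphen is not legal in an identifier, so the name is rejected
# rather than leaving anyone to guess what "-" might mean there.
print(compiles(r"(?P<first-name>\w+)"))

# Same for a name that starts with a digit.
print(compiles(r"(?P<1st>\w+)"))
```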

> ...
> Are there some contexts where group names are used like Python
> identifiers which force this restriction?  I could understand the
> restriction if groups could be accessed as attributes of a match
> object, e.g.:
> ...
> but that isn't possible.

And never would be, if the restriction were relaxed.  As is, you could
certainly write a class exposing group names as attributes.
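
A rough sketch of such a class, under the assumption that wrapping the
match object is acceptable. AttrMatch is a hypothetical name, not
anything re provides; it only works because group names are guaranteed
to be valid identifiers.

```python
import re

class AttrMatch:
    """Hypothetical wrapper exposing a match's named groups as attributes."""

    def __init__(self, match):
        self._match = match

    def __getattr__(self, name):
        # Only called when normal attribute lookup fails.
        # Refuse private names to avoid recursing on self._match.
        if name.startswith("_"):
            raise AttributeError(name)
        try:
            return self._match.group(name)
        except IndexError:
            # re raises IndexError for an unknown group name.
            raise AttributeError(name)

m = AttrMatch(re.match(r"(?P<year>\d{4})-(?P<month>\d{2})", "2003-05"))
print(m.year, m.month)
```

With arbitrary characters allowed in group names, something like
m.first-name could never be spelled as an attribute access.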

They're supposed to be symbolic names, and in Python all names have
identifier syntax.  There didn't seem to be any point in making up a unique
syntax for group identifiers; it would just be another piece of cryptic
regexp syntax to document and trip over.





