Two RE proposals

David LeBlanc whisper at oz.net
Fri Jul 26 17:50:27 EDT 2002


>     David> 1. Add a substitution operator - in the example below it's "!<..>"
>
>     David> word = r"\w*"
>     David> punct = r"[,.;?]"
>     David> wordpunct = re.compile(r"!<word>!<punct>")
>
> How about
>
>     word = r"\w*"
>     punct = r"[,.;?]"
>     wordpunct = re.compile(r"%(word)s%(punct)s" % locals())
>
> which you can do today?  (I'd also argue that a word would be "\w+".)

I considered something like this, but it's too verbose, not to mention
confusing. What's inherently wrong with my idea? I don't think it's
counter-pythonic. I also considered the (?P!<name>) construct, but it's on
the verbose side too. (I actually need to go read up on this, if I can
find documentation for it. I am not familiar with the "locals()" idiom.)
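
For reference, here is a minimal sketch of the idiom being suggested, written
for current Python. locals() returns a dict of the names defined in the
current scope, and "%(name)s" looks up "name" in the dict given on the right
of the % operator. The function name build_wordpunct is made up for
illustration, and word uses "\w+" as suggested in the quoted reply.

```python
import re

def build_wordpunct():
    # Hypothetical helper: the name is not from the original post.
    word = r"\w+"
    punct = r"[,.;?]"
    # At this point locals() is {"word": r"\w+", "punct": r"[,.;?]"},
    # so each %(name)s is replaced by the fragment stored under that name.
    pattern = r"%(word)s%(punct)s" % locals()
    return pattern, re.compile(pattern)

pattern, wordpunct = build_wordpunct()
print(pattern)                                   # \w+[,.;?]
print(wordpunct.match("hello, world").group())   # hello,
```

A literal "%" in the template itself would have to be written as "%%", which
is one cost of borrowing string formatting for this job.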

>     David> 2. Make r"(a|b)*" mean any number of a's or b's. This doesn't
>     David>    work, at least in some situations with the current re
>     David>    compiler - the "any" op "*" doesn't seem to span over a
>     David>    parened group.
>
> The * doesn't (and shouldn't) operate over grouping parens.  You're asking
> it to supply you with a variable number of groups, which it can't do.

You're right - it doesn't operate over grouping parens, but why _shouldn't_
it? IIRC, _some_ regex packages could do this...

> Besides, what's wrong with r"([ab]*)"?

Nothing - unless a or b are more than single characters or literal strings.
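
To make the multi-character case concrete, here is a small sketch using
Python's re module, with "foo" and "bar" standing in for a and b (both are
placeholders chosen for illustration). It shows what the repeated group
captures, and one possible workaround: a non-capturing alternation wrapped in
a single capturing group.

```python
import re

text = "foobar!"

# A capturing group under "*" matches the whole run, but the group
# itself only remembers its last repetition.
m = re.match(r"(foo|bar)*", text)
print(m.group(0))   # foobar
print(m.group(1))   # bar

# (?:...) groups without capturing, so the outer parens capture the run.
m2 = re.match(r"((?:foo|bar)*)", text)
print(m2.group(1))  # foobar

# The individual pieces can then be pulled out with a second pass.
pieces = re.findall(r"foo|bar", m2.group(1))
print(pieces)       # ['foo', 'bar']
```

This still does not give a variable number of groups. It only recovers the
full matched span, which then has to be split separately.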

Dave LeBlanc




