Two RE proposals
Skip Montanaro
skip at pobox.com
Fri Jul 26 21:02:22 EDT 2002
>> How about
>>
>> word = r"\w*"
>> punct = r"[,.;?]"
>> wordpunct = re.compile(r"%(word)s%(punct)s" % locals())
>>
>> which you can do today? (I'd also argue that a word would be "\w+".)
David> I considered something like this, but it's too verbose, not to
David> mention confusing - what's inherently wrong with my idea?
Nothing I suppose, except someone has to write the code to implement it,
while the proposal I put forth exists today. As for verbosity, "!<word>"
saves precisely one character over "%(word)s". I'll grant you the "%
locals()" adds a few more characters, but it's a constant factor.
I don't understand how a basic facility of the language that has been around
for God knows how long could be more confusing than writing regular
expressions. <wink>
David> (I am not familiar with the idiom of "locals()".)
From the online help:
    Help on built-in function locals:
    locals(...)
        locals() -> dictionary
        Return the dictionary containing the current scope's local variables.
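A minimal sketch of how locals() feeds the "%(name)s" format codes, assuming the "\w+" definition of a word suggested above. The function name build_wordpunct is hypothetical and exists only to give the variables a local scope:

```python
import re

def build_wordpunct():
    # Hypothetical helper: the names defined here become keys in locals().
    word = r"\w+"
    punct = r"[,.;?]"
    # "%(word)s" and "%(punct)s" are looked up in the locals() dictionary.
    return re.compile(r"%(word)s%(punct)s" % locals())

wordpunct = build_wordpunct()
match = wordpunct.search("Hello, world")
print(wordpunct.pattern)
print(match.group(0))
```

Any mapping works on the right-hand side of "%", so an explicit dictionary such as {"word": word, "punct": punct} could be used in place of locals().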
>> The * doesn't (and shouldn't) operate over grouping parens. You're
>> asking it to supply you with a variable number of groups, which it
>> can't do.
David> You're right - it doesn't operate over grouping parens, but why
David> _shouldn't_ it? IIRC, _some_ regex packages could do this...
How about using non-grouping parens:
>>> pat = re.compile(r"((?:a|b)*)")
>>> pat.match("ababaaaabccdabab")
<_sre.SRE_Match object at 0x40348ea0>
>>> _.group(1)
'ababaaaab'
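A short sketch of the difference, using the same test string. With a capturing group directly under "*", only the last repetition is kept. Wrapping a non-capturing group in a single capturing group keeps the whole run, which can then be split up with re.findall if the individual pieces are wanted. The variable names are illustrative only:

```python
import re

text = "ababaaaabccdabab"

# Capturing group repeated by *: group(1) holds only the last repetition.
repeated = re.match(r"(a|b)*", text)
last_piece = repeated.group(1)

# Non-capturing alternation inside one capturing group: group(1) holds the run.
whole_run = re.match(r"((?:a|b)*)", text).group(1)

# A second pass recovers the individual pieces as a list.
pieces = re.findall(r"a|b", whole_run)

print(last_piece)
print(whole_run)
print(pieces)
```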
Skip