regex problem
John Machin
sjmachin at lexicon.net
Tue Jul 26 10:06:09 EDT 2005
Duncan Booth wrote:
> John Machin wrote:
>
>
>>So here's the mean lean no-flab version -- you don't even need the
>>parentheses (sorry, Thomas).
>>
>>
>> >>> rx1=re.compile(r"""\b\d\d\d\d,|\b\d\d\d\d-\d\d\d\d,""")
>> >>> rx1.findall("1234,2222-8888,4567,")
>>
>>['1234,', '2222-8888,', '4567,']
>
>
> No flab? What about all that repetition of \d? A less flabby version:
>
>
> >>> rx1=re.compile(r"""\b\d{4}(?:-\d{4})?,""")
> >>> rx1.findall("1234,2222-8888,4567,")
>
> ['1234,', '2222-8888,', '4567,']
>
OK, good idea to factor out the common prefix and follow it with an
optional -1234 part. However, optimising re engines do common-prefix
factoring themselves, *and* they rewrite stuff like x{4} as xxxx, so the
two patterns should end up much the same once compiled.
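
For anyone who wants to check rather than take that on trust, here is a
minimal sketch that runs both patterns from this thread over the same
input and times them. The names rx_long, rx_short and sample are made up
for illustration. It only shows that the two patterns match the same
things; it does not show what any particular engine does internally, and
the timings will vary by machine and Python version.

```python
import re
import timeit

# The spelled-out pattern from the earlier post: two full alternatives.
rx_long = re.compile(r"\b\d\d\d\d,|\b\d\d\d\d-\d\d\d\d,")
# Duncan's version: shared prefix factored out, optional non-capturing suffix.
rx_short = re.compile(r"\b\d{4}(?:-\d{4})?,")

sample = "1234,2222-8888,4567,"

long_hits = rx_long.findall(sample)
short_hits = rx_short.findall(sample)
print(long_hits)
print(short_hits)

# Rough timing only; treat the numbers as indicative, not as proof of
# what the engine optimises.
t_long = timeit.timeit(lambda: rx_long.findall(sample), number=10000)
t_short = timeit.timeit(lambda: rx_short.findall(sample), number=10000)
print("long: %.4fs  short: %.4fs" % (t_long, t_short))
```

Both findall calls return the same list for this input, so choosing
between the two spellings is about readability unless the timings say
otherwise on your setup.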
Cheers,
John