Nothing to repeat

Ian hobson42 at gmail.com
Sun Jan 9 12:49:10 EST 2011


On 09/01/2011 16:49, Tom Anderson wrote:
> Hello everyone, long time no see,
>
> This is probably not a Python problem, but rather a regular 
> expressions problem.
>
> I want, for the sake of arguments, to match strings comprising any 
> number of occurrences of 'spa', each interspersed by any number of 
> occurrences of the 'm'. 'any number' includes zero, so the whole 
> pattern should match the empty string.
>
> Here's the conversation Python and i had about it:
>
> Python 2.6.4 (r264:75706, Jun  4 2010, 18:20:16)
> [GCC 4.4.4 20100503 (Red Hat 4.4.4-2)] on linux2
> Type "help", "copyright", "credits" or "license" for more information.
>>>> import re
>>>> re.compile("(spa|m*)*")
> Traceback (most recent call last):
>   File "<stdin>", line 1, in <module>
>   File "/usr/lib/python2.6/re.py", line 190, in compile
>     return _compile(pattern, flags)
>   File "/usr/lib/python2.6/re.py", line 245, in _compile
>     raise error, v # invalid expression
> sre_constants.error: nothing to repeat
>
> What's going on here? Why is there nothing to repeat? Is the problem 
> having one *'d term inside another?
>
> Now, i could actually rewrite this particular pattern as '(spa|m)*'. 
> But what i neglected to mention above is that i'm actually generating 
> patterns from structures of objects (representations of XML DTDs, as 
> it happens), and as it stands, patterns like this are a possibility.
>
> Any thoughts on what i should do? Do i have to bite the bullet and 
> apply some cleverness in my pattern generation to avoid situations 
> like this?
>
> Thanks,
> tom
>
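I think you want to anchor your pattern, or it will match anything. Perhaps

re.compile(r'^(spa(m)+)*$')

is what you need. (Python patterns take no surrounding '/' delimiters;
inside the string they would be matched as literal slashes.)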
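
As a rough sketch of how anchoring changes things, the snippet below
compares the anchored pattern suggested above (written without the '/'
delimiters, which Python's re module would treat as literal characters)
with an anchored form of the '(spa|m)*' rewrite mentioned in the
question. The sample strings and the variable names are only
illustrative assumptions. Note that '(spa(m)+)*' requires at least one
'm' after every 'spa', so it is stricter than "any number, including
zero".

```python
import re

# Anchored form of the suggestion above: every 'spa' must be followed
# by one or more 'm'.
anchored = re.compile(r'^(spa(m)+)*$')

# Anchored form of the '(spa|m)*' rewrite from the question: any mix of
# 'spa' and 'm', including none at all.
simplified = re.compile(r'^(spa|m)*$')

# Illustrative inputs only; they are not taken from the original thread.
samples = ['', 'spam', 'spammmspam', 'spa', 'mmspa', 'eggs']

# Map each sample to (matches anchored, matches simplified).
results = {}
for s in samples:
    results[s] = (bool(anchored.match(s)), bool(simplified.match(s)))
    print('%-12r anchored=%-5s simplified=%s' % ((s,) + results[s]))
```

Without the ^ and $ anchors, re.search would succeed on every input,
because both patterns can match the empty string at position 0.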

Regards

Ian


