re module non-greedy matches broken

John Ridley ojokimu at yahoo.co.uk
Tue Apr 5 15:07:43 EDT 2005


--- lothar <lothar at ultimathule.nul> wrote:
> a non-greedy match is implicitly defined in the documentation to be
> one such
> that there is no proper substring in the return which could also
> match the regex.
> 

If I understand this correctly, you are asking re to look for, or
rather anticipate, overlapping matches (e.g. tags nested inside tags).
But the module documentation at "4.2.3 Module Contents" specifically
states that functions like re.findall return only non-overlapping
matches. So I think the regex engine is up to the task; rather, you
may need to take into account the behaviour of the module functions
when composing your regexes. I'm sure it's possible to do what you want
using a regex alone. However, it may also be worth rolling your own
search functions in order to get finer control over the search
strategy.
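
As a rough sketch of what I mean by rolling your own: the helper below
tries the pattern at every start position, so overlapping candidates are
not skipped. The name find_overlapping, the pattern and the sample
string are all made up for illustration; none of them come from the re
module or from your original post.

```python
import re

# Hypothetical pattern and input, chosen only to show the effect.
PATTERN = r"a.*?b"
TEXT = "aab"


def find_overlapping(pattern, text):
    """Return the match found at each start position, overlaps included.

    Illustrative helper, not part of the re module.
    """
    compiled = re.compile(pattern)
    results = []
    for start in range(len(text)):
        # match() anchors the attempt at `start` instead of scanning ahead.
        m = compiled.match(text, start)
        if m is not None:
            results.append(m.group(0))
    return results


# Non-greedy still takes the leftmost match, not the shortest one.
print(re.search(PATTERN, TEXT).group(0))   # aab

# findall only reports non-overlapping matches, so "ab" never shows up.
print(re.findall(PATTERN, TEXT))           # ['aab']

# Trying every start position exposes the shorter, overlapping match.
candidates = find_overlapping(PATTERN, TEXT)
print(candidates)                          # ['aab', 'ab']
print(min(candidates, key=len))            # ab
```

Picking the shortest candidate afterwards gives the "no proper substring
also matches" behaviour you describe, at least for this simple case, at
the cost of one match attempt per character.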


John Ridley



