`re' difficulty?
François Pinard
pinard at IRO.UMontreal.CA
Mon Oct 18 21:57:42 EDT 1999
Hi, people. I ran into an oddity on this machine, which runs Python 1.5.1.
(This is the machine where the TP repository is kept; I'm not really the
guy installing software on it.) Here is what I got:
>>> entry = 'parted-0.0.8-pre1/po/parted.pot'
>>> re.match('parted-([.0-9]+[a-z]?|[.0-9]+-b[0-9]+|[.0-9]+-pre[0-9]+)', entry).group(1)
'0.0.8'
>>> re.match('parted-([.0-9]+-b[0-9]+|[.0-9]+-pre[0-9]+|[.0-9]+[a-z]?)', entry).group(1)
'0.0.8-pre1'
As you can see, between parentheses, the second line of the session has
A|B|C, while the third has B|C|A. Since the results are not equivalent, I
presume the longest-match rule does not apply here, although it is what I
have come to expect wherever regular expressions are concerned.
May I guess this is all implemented with backtracking, with the first
matching alternative shadowing the remaining alternatives? Isn't that
committing Python to a behaviour prohibiting later optimisations? Or is
the exact behaviour just undefined? What's the story? :-)
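For anyone who wants to reproduce this outside the interactive prompt, here is a minimal sketch of the same two matches as a script. The variable names and the final "factored" pattern are my own illustrative assumptions, not part of the session above. The factored form is one possible way to make the result independent of the order of the alternatives.

```python
import re

entry = 'parted-0.0.8-pre1/po/parted.pot'

# Alternatives are tried left to right: the first one that lets the
# overall match succeed wins, even if a later one would consume more text.
short_first = re.match(
    r'parted-([.0-9]+[a-z]?|[.0-9]+-b[0-9]+|[.0-9]+-pre[0-9]+)', entry)
long_first = re.match(
    r'parted-([.0-9]+-b[0-9]+|[.0-9]+-pre[0-9]+|[.0-9]+[a-z]?)', entry)

print(short_first.group(1))  # 0.0.8
print(long_first.group(1))   # 0.0.8-pre1

# Hypothetical rewrite: factor out the shared version prefix and make the
# suffix optional, so no alternative can shadow a longer one.
factored = re.match(
    r'parted-([.0-9]+(?:-b[0-9]+|-pre[0-9]+|[a-z])?)', entry)

print(factored.group(1))     # 0.0.8-pre1
```

With the factored pattern, the greedy optional suffix is attempted before the empty alternative, so the longer version string is captured without depending on how the three original alternatives were ordered.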
--
François Pinard http://www.iro.umontreal.ca/~pinard