`re' difficulty?

Darrell darrell at dorb.com
Mon Oct 18 22:24:57 EDT 1999


At the risk of showing my ignorance, this makes sense to me.
The left-hand alternative, [.0-9]+[a-z]?, matched, so there was no need to
try the alternative on the right-hand side?

>>> re.match('parted-([.0-9]+-pre[0-9]+)', entry).group(1)
'0.0.8-pre1'
>>> re.match('parted-([.0-9]+[a-z]?|[.0-9]+-pre[0-9]+)', entry).group(1)
'0.0.8'
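
A minimal sketch of the same point, assuming a current Python 3 `re` module rather than the 1.5.1 install discussed here: alternatives separated by `|` are tried left to right, and the first one that lets the match succeed is kept, even when a later one would match more text. The helper `longest_alternative` is hypothetical, not part of `re`; it is just one way to get longest-match behaviour by trying each alternative separately.

```python
import re

entry = 'parted-0.0.8-pre1/po/parted.pot'

# Alternatives are tried left to right; the first one that allows the
# overall match to succeed wins, even if a later one would match more.
short_first = re.match(r'parted-([.0-9]+[a-z]?|[.0-9]+-pre[0-9]+)', entry)
long_first = re.match(r'parted-([.0-9]+-pre[0-9]+|[.0-9]+[a-z]?)', entry)

print(short_first.group(1))  # 0.0.8
print(long_first.group(1))   # 0.0.8-pre1


def longest_alternative(prefix, alternatives, text):
    """Hypothetical helper: match each alternative on its own after the
    prefix and return the longest captured string (None if none match)."""
    best = None
    for alt in alternatives:
        m = re.match(prefix + '(' + alt + ')', text)
        if m and (best is None or len(m.group(1)) > len(best)):
            best = m.group(1)
    return best


alts = ['[.0-9]+[a-z]?', '[.0-9]+-b[0-9]+', '[.0-9]+-pre[0-9]+']
print(longest_alternative('parted-', alts, entry))  # 0.0.8-pre1
```

In practice, simply listing the more specific alternatives first, as in the second pattern, is usually enough.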


--Darrell
----- Original Message -----
From: François Pinard <pinard at IRO.UMontreal.CA>
To: <python-list at python.org>
Sent: Monday, October 18, 1999 9:57 PM
Subject: `re' difficulty?


> Hi, people.  I got a strangeness here, on this machine running Python 1.5.1.
> (This is the machine where the TP repository is kept, I'm not really the
> guy installing software on it.)  Here is what I got:
>
> >>> entry = 'parted-0.0.8-pre1/po/parted.pot'
> >>> re.match('parted-([.0-9]+[a-z]?|[.0-9]+-b[0-9]+|[.0-9]+-pre[0-9]+)', entry).group(1)
> '0.0.8'
> >>> re.match('parted-([.0-9]+-b[0-9]+|[.0-9]+-pre[0-9]+|[.0-9]+[a-z]?)', entry).group(1)
> '0.0.8-pre1'
>
> As you may see, between parentheses, the second line has A|B|C, while
> the third has B|C|A.  Since the results differ, I presume the
> longest-match rule does not apply here, although it is what I have come to
> expect wherever regular expressions are concerned.
>
> May I guess this is all implemented with backtracking, with the first
> matching alternative shadowing the remaining alternatives?  Isn't that
> committing Python to a behaviour prohibiting later optimisations?  Or is
> the exact behaviour just undefined?  What's the story? :-)
>
> --
> François Pinard   http://www.iro.umontreal.ca/~pinard
>
> --
> http://www.python.org/mailman/listinfo/python-list




