[ python-Bugs-1116571 ] Wrong match with regex, non-greedy problem
SourceForge.net
noreply at sourceforge.net
Sat Feb 12 18:12:33 CET 2005
Bugs item #1116571, was opened at 2005-02-05 01:12
Message generated for change (Settings changed) made by effbot
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=105470&aid=1116571&group_id=5470
Category: Regular Expressions
Group: Python 2.4
>Status: Closed
>Resolution: Invalid
Priority: 5
Submitted By: rengel (engel_re)
Assigned to: Gustavo Niemeyer (niemeyer)
Summary: Wrong match with regex, non-greedy problem
Initial Comment:
# This is executable.
import re

# My test string is rather long:
tst = ("In this <c:noun:ns>Buch</c:noun>, used to "
       "designate <c:noun:np>Dinge der Wirklichkeit</c:noun> "
       "rather than <c:noun:fs>SW</c:noun> "
       "<c:noun:ns>Ent</c:noun>.")
# I want to match the last part of the string:
# <c:noun:fs>SW</c:noun> <c:noun:ns>Ent</c:noun>
# So I define the following pattern and compile it:
pat = (r"<c:noun:(.*?)>(.*?)</c:noun> "
       r"<c:noun:(.*?)>(.*?)</c:noun>")
rex = re.compile(pat)
# Then I search the string to get a match object:
mat = rex.search(tst)
# If found, print the matched text:
if mat:
    print(mat.group())
# Instead of
# <c:noun:fs>SW</c:noun> <c:noun:ns>Ent</c:noun>
# I get the whole string starting with
# <c:noun:ns>Buch</c:noun>...
# up to the very last </c:noun>
# Apparently the non-greedy operator doesn't work correctly.
# What's wrong?
----------------------------------------------------------------------
Comment By: Fredrik Lundh (effbot)
Date: 2005-02-08 09:27
Message:
Search returns the first (left-most) location where the
pattern matches, if any. The non-greedy operator only
guarantees that you get the shortest possible match at that
location.
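
A minimal sketch of this behaviour, using the reporter's test string. The `strict` pattern is a hypothetical alternative that was not proposed in this thread. It assumes that neither the tag attribute nor the tag body ever contains "<" or ">", so the character class `[^<>]*` can stand in for `.*?`.

```python
import re

tst = ("In this <c:noun:ns>Buch</c:noun>, used to "
       "designate <c:noun:np>Dinge der Wirklichkeit</c:noun> "
       "rather than <c:noun:fs>SW</c:noun> "
       "<c:noun:ns>Ent</c:noun>.")

# Reporter's pattern: ".*?" is non-greedy, but "." may still run across
# tag boundaries if that is the only way to complete a match that
# starts at the left-most candidate position.
lazy = re.compile(
    r"<c:noun:(.*?)>(.*?)</c:noun> <c:noun:(.*?)>(.*?)</c:noun>")

# Hypothetical alternative (assumption: no "<" or ">" inside a tag
# attribute or body): "[^<>]*" cannot cross a tag boundary, so the
# first two candidate positions fail and the search moves on.
strict = re.compile(
    r"<c:noun:([^<>]*)>([^<>]*)</c:noun> "
    r"<c:noun:([^<>]*)>([^<>]*)</c:noun>")

lazy_match = lazy.search(tst)
strict_match = strict.search(tst)

print(lazy_match.start(), lazy_match.end())  # span begins at the first tag
print(strict_match.group())                  # only the two adjacent tags
```

With the reporter's pattern the match is anchored at the first `<c:noun:` in the string, and the second `(.*?)` grows only as far as it must for the rest of the pattern to succeed. With the stricter character class, no match is possible at the first two tags, so the search continues to the third.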
----------------------------------------------------------------------