re

David C. Ullrich dullrich at sprynet.com
Thu Jun 5 08:58:14 EDT 2008


On Wed, 04 Jun 2008 20:07:41 +0200, "Diez B. Roggisch"
<deets at nospam.web.de> wrote:

>> Whitespace is actually \s. But [\s]disc[whatever]
>> doesn't do the job - then it won't match "(disc)",
>> which counts as "disc" appearing as a full word.
>
>Ok, then this works:

Yes it does.

My real question was why a construction like

  (A|B)C

didn't seem to work as expected. The code below shows
that it does work.
That puzzled me because I couldn't see any real
difference between your solution here and things
I'd tried that didn't work. But those things also
work in the code below - when I saw this just
now I was even more confused...

Oh. Turns out the actual reason for the confusion wasn't
regex syntax, it was the fact that findall doesn't
return what I thought it did. Looking at the result
of findall(), it seemed as though the re was matching
empty strings and whitespace... Looking more
carefully at what findall is supposed to do, everything
makes sense.

Sorry to be dense. Remind me to read more than the
first sentence next time:

"findall (pattern, string)
    Return a list of all non-overlapping matches of pattern in string.
If one or more groups are present in the pattern, return a list of
groups;..."
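
To make that concrete, here is a minimal sketch of the behaviour the
doc string describes. The sample string and the variable names are
made up for illustration; they are not from the thread. With capturing
groups in the pattern, findall returns the groups rather than the
matched text, which is where the stray empty strings and whitespace
come from.

```python
import re

# Hypothetical sample text, chosen so that every case from the thread appears.
text = "disc (disc) foo disc bar discuss"

# Three capturing groups: findall returns one tuple of groups per match,
# so the delimiters ('' for ^, ' ', '(' and ')') show up in the result.
grouped = re.findall(r"(^|[^\w])(disc)($|[^\w])", text)
print(grouped)

# Non-capturing groups (?:...): no groups in the pattern, so findall
# returns the full text of each match instead.
whole = re.findall(r"(?:^|[^\w])disc(?:$|[^\w])", text)
print(whole)

# Exactly one capturing group: findall returns just that group.
word_only = re.findall(r"(?:^|[^\w])(disc)(?:$|[^\w])", text)
print(word_only)

# finditer yields match objects, so group(0) gives the whole match
# even when the pattern contains capturing groups.
via_finditer = [m.group(0) for m in re.finditer(r"(^|[^\w])(disc)($|[^\w])", text)]
print(via_finditer)
```

In every variant "discuss" is skipped, so the (A|B)C construction itself
is doing its job; only the shape of what findall hands back changes.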

>import re
>
>test = """
>disc
>(disc)
>foo disc bar
>discuss
>""".split("\n")
>
>for t in test:
>     if re.search(r"(^|[^\w])(disc)($|[^\w])", t):
>         print "success:", t
>
>
>> Also I think you have ^ and $ backwards, and there's
>> a ^ I don't understand. I _think_ that a correct version
>
>Yep, sorry for the confusion.
>
>Diez

David C. Ullrich


