regex question

proctor 12cc104 at gmail.com
Fri Apr 27 11:12:20 EDT 2007


On Apr 27, 8:26 am, Michael Hoffman <cam.ac... at mh391.invalid> wrote:
> proctor wrote:
> > On Apr 27, 1:33 am, Paul McGuire <p... at austin.rr.com> wrote:
> >> On Apr 27, 1:33 am, proctor <12cc... at gmail.com> wrote:
> >>> rx_test = re.compile('/x([^x])*x/')
> >>> s = '/xabcx/'
> >>> if rx_test.findall(s):
> >>>         print rx_test.findall(s)
> >>> ============
> >>> i expect the output to be ['abc'] however it gives me only the last
> >>> single character in the group: ['c']
>
> >> As Josiah already pointed out, the * needs to be inside the grouping
> >> parens.
> > so my question remains, why doesn't the star quantifier seem to grab
> > all the data.
>
> Because you didn't use it *inside* the group, as has been said twice.
> Let's take a simpler example:
>
>  >>> import re
>  >>> text = "xabc"
>  >>> re_test1 = re.compile("x([^x])*")
>  >>> re_test2 = re.compile("x([^x]*)")
>  >>> re_test1.match(text).groups()
> ('c',)
>  >>> re_test2.match(text).groups()
> ('abc',)
>
> There are three places that match ([^x]) in text. But each time you find
> one you overwrite the previous match.
>
> > isn't findall() intended to return all matches?
>
> It returns all matches of the WHOLE pattern, /x([^x])*x/. Since you used
> a grouping parenthesis in there, it only returns one group from each
> match.
>
> Back to my example:
>
>  >>> re_test1.findall("xabcxaaaxabc")
> ['c', 'a', 'c']
>
> Here it finds multiple matches, but only because the x occurs multiple
> times as well. In your example there is only one match.
>
> > i would expect either 'abc' or 'a', 'b', 'c' or at least just
> > 'a' (because that would be the first match).
>
> You are essentially doing this:
>
> group1 = "a"
> group1 = "b"
> group1 = "c"
>
> After those three statements, you wouldn't expect group1 to be "abc" or
> "a". You'd expect it to be "c".
> --
> Michael Hoffman

ok, thanks michael.

so i am now assuming that either the book's example assumes perl, and
perl differs from python in this regard, or that the book's example is
faulty.  i understand all the examples given since my question, and i
know what i need to do to make it work.  i am raising the question
because the book says one thing, but its example does not work for me.
i am searching for the source of the discrepancy.
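
for the archive, here is a minimal sketch of the behaviour as i understand
it from the replies above.  it is written for python 3 (print is a
function there, unlike in my original snippet), and the variable names
are only illustrative, not taken from the book:

```python
import re

s = '/xabcx/'

# Quantifier outside the group: the group is re-captured on every
# repetition, so only the last captured character is kept.
last_only = re.findall(r'/x([^x])*x/', s)

# Quantifier inside the group: the group spans the whole run of
# non-x characters.
whole_run = re.findall(r'/x([^x]*)x/', s)

# To get the characters one by one, capture the run and split it afterwards.
chars = list(whole_run[0])

print(last_only)   # ['c']
print(whole_run)   # ['abc']
print(chars)       # ['a', 'b', 'c']
```

so moving the star inside the parentheses gives the 'abc' i expected, and
the individual characters can be recovered from that string if needed.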

i will try to research the differences between perl's and python's
regex engines.

thanks again,

sincerely,
proctor



