RE Question

Jorge Godoy godoy at ieee.org
Thu Aug 18 10:44:58 EDT 2005


Yoav wrote:

> I don't understand why the two REs produce a different result. I read
> the RE guide but still I can't seem to figure it out.
> 
>  >>> t
> 'echo user=name password=pass path="/ret files"\r\n'
>  >>> re.findall(r'(?<=\s)[^=]+=((?:".*")|(?:\S*))(?=\s)', t)
> ['name', 'pass', '"/ret files"']
>  >>> re.findall(r'(?<=\s)[^=]+=((".*")|(\S*))(?=\s)', t)
> [('name', '', 'name'), ('pass', '', 'pass'), ('"/ret files"', '"/ret files"', '')]

Hi Yoav.

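You can see in the "sre" documentation (use that instead of the docs for
"re") that "(?:...)" asks for a non-grouping (non-capturing) version of the
parentheses.  Ordinary parentheses save the matched text for later use;
"?:" prevents that.  re.findall returns one string per match when the
pattern has a single group, and one tuple per match when it has several.
Your first pattern has one capturing group and your second has three, and
that is what causes the difference for you.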
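
A small sketch of the above, reusing your own string and patterns (the
variable names one_group and three_groups are mine, just for illustration):

```python
import re

t = 'echo user=name password=pass path="/ret files"\r\n'

# One capturing group: the inner alternatives use (?:...), so only the
# outer parentheses are saved. findall returns a list of strings.
one_group = re.findall(r'(?<=\s)[^=]+=((?:".*")|(?:\S*))(?=\s)', t)
print(one_group)

# Three capturing groups: the outer one plus one per alternative.
# findall returns a list of 3-tuples; a group that did not take part
# in the match shows up as an empty string.
three_groups = re.findall(r'(?<=\s)[^=]+=((".*")|(\S*))(?=\s)', t)
print(three_groups)
```

In each tuple the first item is the outer group, the second is the quoted
alternative and the third is the unquoted one, so exactly one of the last
two is empty.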

> Also, does the '|' char (meaning or) produce a pair for each section? I
> don't understand how it works. Can someone please direct me to a place
> which will explain it?

I didn't quite get your second question.  When you use "|", the
alternatives are tried from left to right and the first one that matches is
used.  It is a short-circuited version of "or": if the first regexp
matches, the second one is not tried (unless the rest of the pattern then
fails and the engine has to backtrack).  Python's boolean "and"
short-circuits in the same way, except that evaluation stops at the first
false result.
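
If it helps, here is a minimal sketch of how "|" behaves. The toy patterns
and the names first, backtracked and groups are made up for this example.
It also suggests that "|" itself does not produce a pair: the extra items
come from the parentheses around each alternative.

```python
import re

# Alternatives are tried left to right; the first one that matches is
# used, even though the second would have matched more text.
first = re.match(r'a|ab', 'ab').group()
print(first)

# If the first alternative matches but the rest of the pattern fails,
# the engine backtracks and tries the next alternative.
backtracked = re.match(r'(?:a|ab)c', 'abc').group()
print(backtracked)

# Each parenthesised alternative is its own group; the one that was not
# used is None here (findall reports it as an empty string instead).
groups = re.match(r'(".*")|(\S*)', 'name').groups()
print(groups)
```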


Be seeing you,
-- 
Jorge Godoy      <godoy at ieee.org>



