re question

Sat Oct 16 21:52:55 EDT 1999

"Max M. Stalnaker" <stalnaker at acm.org> writes:

> I have the following code:
> 
>  def subset(self):
>   group=re.search(r"%%%([^%]+)%%%",self.data)
>   self.data=group.groups(0)[0]
> 
> Essentially, I get a html page, change some tags to %%% and extract the
> stuff between.  But the way I do it above fails if the stuff between has a
> single %.  The main goal is to extract the stuff.  Changing the tags is
> just the approach I tried, with occasional success.
> 
> Maybe there is a better way to do this.  Or someone could perhaps suggest re
> code that would do it.  Thank you.
> 
> My current idea is to construct a single character sentinel out of something
> greater than chr(128) and use that.  This will probably work in this
> application, but I feel like I am missing something.
> 
> --
> Max M. Stalnaker  mailto:stalnaker at acm.org  http://www.astarcc.com

To match everything between two occurrences of %%%, including embedded
% characters, this expression will work:

get = re.compile(r'%%%(.*?)%%%')

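The important part is the *?, the non-greedy form of *, which matches
the smallest possible string rather than the largest.  Note that .
does not match a newline by default, so if the text between the
markers spans several lines, compile the pattern with re.DOTALL.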
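
As a rough sketch of how the pattern might be used: the function name
subset mirrors the quoted code, but here it is a plain function rather
than a method, and the sample string, the None return for a failed
match, and the re.DOTALL flag are all assumptions added for
illustration.

```python
import re

# re.DOTALL is an assumption: it lets "." match newlines, in case the
# text between the markers spans more than one line.
get = re.compile(r'%%%(.*?)%%%', re.DOTALL)

def subset(data):
    """Return the text between the first pair of %%% markers, or None."""
    match = get.search(data)
    if match is None:
        # No pair of markers found; avoid calling .group() on None.
        return None
    return match.group(1)

# Hypothetical sample input with an embedded % and a newline.
page = "header %%%50% off\ntoday only%%% footer %%%second%%%"
print(subset(page))
```

With a greedy .* in place of .*?, the same search would run on to the
last %%% in the string instead of stopping at the first closing marker.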

--
Tim Evans