text processing problem

Maurice LING mauriceling at acm.org
Thu Apr 7 21:10:24 EDT 2005


Matt wrote:
> I'd HIGHLY suggest purchasing the excellent Mastering Regular
> Expressions by Jeff Friedl
> (http://www.oreilly.com/catalog/regex2/index.html).  Although it's mostly
> geared towards Perl, it will answer all your questions about regular
> expressions.  If you're going to work with regexes, this is a must-have.
> 
> That being said, here's what the new regular expression should be with
> a bit of instruction (in the spirit of teaching someone to fish after
> giving them a fish ;-)   )
> 
> my_expr = re.compile(r'(\w+)\s*(\(\1\))')
> 
> Note the "\s*" in place of the single space " ".  The "\s" means "any
> whitespace character" (equivalent to [ \t\n\r\f\v]).  The "*" following
> it means "0 or more occurrences".  So this will now match:
> 
> "there  (there)"
> "there (there)"
> "there(there)"
> "there                                          (there)"
> "there\t(there)" (tab)
> "there\t\t\t\t\t\t\t\t\t\t\t\t(there)"
> etc.
> 
> Hope that's helpful.  Pick up the book!
> 
> M@
> 
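
For anyone following along, here is a minimal sketch that checks the pattern quoted above against the kinds of strings Matt lists. The names samples and matched, and the non-matching "this (that)" case, are mine for illustration and not from the thread.

```python
import re

# Pattern from the quoted message: a word, optional whitespace,
# then the same word again inside parentheses.
my_expr = re.compile(r'(\w+)\s*(\(\1\))')

samples = [
    "there  (there)",        # two spaces
    "there (there)",         # one space
    "there(there)",          # no whitespace at all
    "there\t(there)",        # a tab
    "there\t\t\t(there)",    # several tabs
    "this (that)",           # different word in parentheses
]

# True where the pattern is found somewhere in the string.
matched = [bool(my_expr.search(s)) for s in samples]

for s, ok in zip(samples, matched):
    print(repr(s), ok)
```

The last sample is there only to show that the backreference \1 requires the parenthesised word to repeat the first one.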

Thanks again. I've read a number of tutorials on regular expressions, but
it's something I've hardly used in the past, so I've gotten far too rusty.

Before posting, I tried
my_expr = re.compile(r'(\w+) \s* (\(\1\))') instead, but it didn't work,
so I'm a bit stumped...
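
A likely explanation, sketched under the assumption that the pattern is compiled without any flags: each literal space in the pattern must match one actual space, so the spaced version needs at least two spaces between the word and the parenthesis. The variable names below are hypothetical. re.VERBOSE is a standard flag of Python's re module that makes unescaped whitespace in the pattern insignificant, which may be the behaviour the spaced layout was aiming for.

```python
import re

# The attempted pattern: a literal space on each side of \s*.
spaced = re.compile(r'(\w+) \s* (\(\1\))')

# Both literal spaces have to match, so one space is not enough.
one_space = spaced.search("there (there)")
two_spaces = spaced.search("there  (there)")

# With re.VERBOSE, the unescaped spaces in the pattern are ignored,
# so this behaves like r'(\w+)\s*(\(\1\))'.
verbose = re.compile(r'(\w+) \s* (\(\1\))', re.VERBOSE)
verbose_one_space = verbose.search("there (there)")
verbose_no_space = verbose.search("there(there)")

print(one_space is None)          # spaced pattern fails on one space
print(two_spaces is not None)     # but matches with two spaces
print(verbose_one_space is not None)
```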

Thanks again,
Maurice


