Regex Question

Frank Koshti frank.koshti at gmail.com
Sat Aug 18 16:18:52 EDT 2012


On Aug 18, 12:22 pm, Jussi Piitulainen <jpiit... at ling.helsinki.fi>
wrote:
> Frank Koshti writes:
> > not always placed in HTML, and even in HTML, they may appear in
> > strange places, such as <h1 $foo(x=3)>Hello</h1>. My specific issue
> > is I need to match, process and replace $foo(x=3), knowing that
> > (x=3) is optional, and the token might appear simply as $foo.
>
> > To do this, I decided to use:
>
> > re.compile('\$\w*\(?.*?\)').findall(mystring)
>
> > the issue with this is it doesn't match $foo by itself, and requires
> > there to be () at the end.
>
> Adding a ? after the meant-to-be-optional expression would let the
> regex engine know what you want. You can also separate the mandatory
> and the optional part in the regex to receive pairs as matches. The
> test program below prints this:
>
> >$foo()$foo(bar=3)$$$foo($)$foo($bar(v=0))etc</htm
>
> ('$foo', '')
> ('$foo', '(bar=3)')
> ('$foo', '($)')
> ('$foo', '')
> ('$bar', '(v=0)')
>
> Here is the program:
>
> import re
>
> def grab(text):
>     # Group 1: '$' plus a name. Group 2: optional '(...)' without nested parens.
>     p = re.compile(r'([$]\w+)([(][^()]+[)])?')
>     return p.findall(text)
>
> def test(html):
>     print(html)
>     for hit in grab(html):
>         print(hit)
>
> if __name__ == '__main__':
>     test('>$foo()$foo(bar=3)$$$foo($)$foo($bar(v=0))etc</htm')

You read my mind. I didn't even know that was possible. Thank you.
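
For the "process and replace" part of my original question, here is a rough
sketch of how the same pattern could be handed to re.sub with a callback. The
names TOKEN, expand and replace_tokens are made up for illustration, and the
square-bracket output is only a stand-in for whatever real processing is
needed.

```python
import re

# Same pattern as in the quoted program: group 1 is the '$name',
# group 2 is the optional '(...)' part (no nested parentheses).
TOKEN = re.compile(r'([$]\w+)([(][^()]+[)])?')

def expand(match):
    """Hypothetical processing step: wrap the token in square brackets."""
    name = match.group(1)
    # On a match object, group(2) is None when the '(...)' part is absent.
    args = match.group(2) or ''
    return '[{0}{1}]'.format(name[1:], args)

def replace_tokens(text):
    # re.sub calls expand() once per match and splices in its return value.
    return TOKEN.sub(expand, text)

if __name__ == '__main__':
    print(replace_tokens('<h1 $foo(x=3)>Hello $foo</h1>'))
    # prints: <h1 [foo(x=3)]>Hello [foo]</h1>
```

Because the second group is optional, one callback handles both $foo and
$foo(x=3). An empty pair of parentheses, as in $foo(), is left in place after
the replaced name, since the pattern requires at least one character inside.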


