f*cking re module
George Sakkis
gsakkis at rutgers.edu
Mon Jul 4 09:01:31 EDT 2005
"jwaixs" <jwaixs at gmail.com> wrote:
> Thank you for your replies, it's much obvious now. I know more what I
> can and can't do with the re module. But is it possible to search for
> more than one string in the same line?
>
> E.g. I want to replace the <python> with " "
> </python> with "\n" and every thing that's not between the two python
> tags must begin with "\nprint \"\"\"" and end with "\"\"\"\n"? Or do I
> need more than one call?
You can do it in one call, but it's ugly; as others have told you
already, use HTMLParser or some other parsing package. Now if you
insist...
import re

regex = re.compile(r'''(?:
    (?:<python>)
    (.*?)            # group 1: inside tags
    (?:</python>)
    ) |              # OR
    ([^<]+)          # group 2: non-empty text outside tags
    ''', re.DOTALL | re.VERBOSE)

def replace(match):
    g1, g2 = match.groups()
    if g1 is not None:
        # matched <python>...</python>: keep the enclosed code as-is
        return g1
    else:
        # matched text outside the tags: wrap it in a print statement
        return '\nprint """%s"""\n' % g2

text = '''this is <python>a stupid
sentence</python> but still I
<python>have to</python> write it.'''

print(regex.sub(replace, text))
===== Output ==================
print """this is """
a stupid
sentence
print """ but still I
"""
have to
print """ write it."""
=======================
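
For comparison, here is a rough sketch of the parser-based route, assuming
Python 3's html.parser module. The names PythonTagSplitter and convert are
made up for illustration and are not part of any library. It is only meant
for simple input like the sample above: a "<" or "&" inside the python tags
would confuse the parser, so treat it as a starting point rather than a
finished solution.

```python
from html.parser import HTMLParser


class PythonTagSplitter(HTMLParser):
    """Hypothetical helper: keeps text inside <python> tags as-is and
    wraps all other text in print statements."""

    def __init__(self):
        super().__init__()
        self.parts = []         # output fragments, joined at the end
        self.in_python = False  # True while inside <python>...</python>

    def handle_starttag(self, tag, attrs):
        if tag == 'python':
            self.in_python = True

    def handle_endtag(self, tag):
        if tag == 'python':
            self.in_python = False

    def handle_data(self, data):
        if self.in_python:
            # code between the tags is copied through unchanged
            self.parts.append(data)
        else:
            # everything else becomes a print statement
            self.parts.append('\nprint """%s"""\n' % data)


def convert(text):
    """Feed text through the splitter and return the generated source."""
    parser = PythonTagSplitter()
    parser.feed(text)
    parser.close()  # flush any text the parser is still buffering
    return ''.join(parser.parts)
```

Calling convert(text) on the sample text should give the same result as the
regex version, with the tag handling spelled out in separate methods instead
of packed into one pattern.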
George