deleting texts between patterns

Baoqiu Cui cbaoqiu at yahoo.com
Sun Jun 4 12:51:03 EDT 2006


John Machin <sjmachin at lexicon.net> writes:

> Uh-oh.
>
> Try this:
>
> >>> pat = re.compile('(?<=abc\n).*?(?=xyz\n)', re.DOTALL)
> >>> re.sub(pat, '', linestr)
> 'blahfubarabc\nxyz\nxyzzy'

This regexp still has a problem.  Neither lookaround is anchored to the
start of a line, so it also removes the lines between two lines like
'aaabc' and 'xxxyz' (and removes the first two 'x's in 'xxxyz' as well).
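
To see the problem concretely, here is a small check.  It is only a
sketch: the sample string `text` is made up for illustration and is not
the `linestr` from the earlier message.

```python
import re

# Hypothetical sample input; not the linestr from the earlier message.
text = 'line3\naaabc\nline4\nxxxyz\nline5\n'

# The unanchored pattern quoted above.
pat = re.compile('(?<=abc\n).*?(?=xyz\n)', re.DOTALL)

# The lookbehind is satisfied by the tail of 'aaabc\n' and the lookahead
# by the tail of 'xxxyz\n', so 'line4\nxx' is deleted.
result = pat.sub('', text)
print(repr(result))
```

The result keeps 'aaabc' intact but turns 'xxxyz' into 'xyz', which is
almost certainly not what was wanted.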

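The following regexp works better, because '^' combined with re.MULTILINE
requires 'abc' and 'xyz' to start at the beginning of a line: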

  pattern = re.compile('(?<=^abc\n).*?(?=^xyz\n)', re.DOTALL | re.MULTILINE)

>>> lines = '''line1
... abc
... line2
... xyz
... line3
... aaabc
... line4
... xxxyz
... line5'''
>>> import re
>>> pattern = re.compile('(?<=^abc\n).*?(?=^xyz\n)', re.DOTALL | re.MULTILINE)
>>> print(pattern.sub('', lines))
line1
abc
xyz
line3
aaabc
line4
xxxyz
line5
>>> 
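
If the marker lines vary, the same idea can be wrapped in a small helper.
This is only a sketch: the name `delete_between` and its parameters are
my own invention, and it assumes the markers are literal strings (no
regexp syntax) that each occupy a whole line followed by a newline.

```python
import re

def delete_between(text, start, end):
    """Delete everything between a line equal to `start` and the next
    line equal to `end`, keeping both marker lines.

    Hypothetical helper: `start` and `end` are treated as literal text.
    """
    # re.escape keeps the markers literal, so the lookbehind stays
    # fixed-width as Python's re module requires.
    pattern = re.compile(
        '(?<=^' + re.escape(start) + '\n).*?(?=^' + re.escape(end) + '\n)',
        re.DOTALL | re.MULTILINE,
    )
    return pattern.sub('', text)

lines = 'line1\nabc\nline2\nxyz\nline3\naaabc\nline4\nxxxyz\nline5'
cleaned = delete_between(lines, 'abc', 'xyz')
print(cleaned)
```

Like the pattern above, this will not match an end marker that is the
very last line of the text with no trailing newline.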

- Baoqiu

-- 
Baoqiu Cui <cbaoqiu at yahoo.com>


