How to replace multiple-line text

Alex Martelli aleax at aleax.it
Thu Jul 11 02:53:10 EDT 2002


David Lees wrote:

> I would like to process a code file and substitute one multiline block
> of code for another.  I know how to do this in Python by brute force
> scanning through on a line by line basis until the first line of the
> pattern is found then looping over the rest of the lines in the target
> pattern for a match, then substituting in an output string multiline
> substitution.  But I am sure there is a neater solution, perhaps using
> regular expressions.  Could someone point me towards sample code or
> something similar that I could modify.

The simplest approach, a step-by-step example for clarity:

old_lines = """Tyger, tyger, burning bright
in the forests of the night,
what immortal hand or eye
dare frame thy fearful symmetry?"""

new_lines = """Do not go gentle into that good night,
Old age should burn and rave at close of day;
Rage, rage against the dying of the light."""

old_file = open("oldfile.txt")
old_text = old_file.read()
old_file.close()

new_text = old_text.replace(old_lines, new_lines)
# other changes to new_text, if needed, go here

# open for writing; use "oldfile.txt" here instead to overwrite in place
new_file = open("newfile.txt", "w")
new_file.write(new_text)
new_file.close()
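
The same steps can be wrapped in a small function.  This is only a
sketch, assuming a Python recent enough to have the with statement;
the name replace_in_file and its parameters are made up for
illustration.  It also returns how many occurrences were replaced,
since str.replace silently does nothing when the block is not found.

```python
def replace_in_file(old_path, new_path, old_lines, new_lines):
    """Copy old_path to new_path, replacing old_lines with new_lines.

    Returns the number of (non-overlapping) occurrences replaced.
    Hypothetical helper, not part of any library.
    """
    # read the whole file into memory, as in the step-by-step version
    with open(old_path) as old_file:
        old_text = old_file.read()

    # count first, so the caller can tell whether anything matched
    count = old_text.count(old_lines)
    new_text = old_text.replace(old_lines, new_lines)

    with open(new_path, "w") as new_file:
        new_file.write(new_text)
    return count
```

A return value of 0 usually means the block in the file differs from
old_lines in some small way, such as indentation or line endings.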


The only possible problem with this approach is that if the files
involved are truly huge you might not have space in memory for the
text involved (about twice the file's size plus a little, since
old_text and new_text are both held at once).

For a "code file" this is quite unlikely -- how many megabytes of
code will there be even in a large such file, after all?  So, in
practice, for the need you've expressed, this approach is almost
surely the best one.


The theoretical problem in terms of substituting substreams of
input streams on the fly is also interesting, but I doubt it's of
much actual applicability to your case.
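
For the curious, here is one way the on-the-fly version might look.
It is a rough sketch under my own assumptions, not a tested recipe:
the names replace_stream and read_chunks are invented here, and the
input is assumed to arrive as a sequence of string chunks.  The idea
is to hold back the last len(old) - 1 characters of each chunk,
since they might be the start of a match that finishes in the next
chunk.

```python
def read_chunks(f, size=64 * 1024):
    """Yield successive chunks of at most size characters from file f."""
    while True:
        chunk = f.read(size)
        if not chunk:
            break
        yield chunk


def replace_stream(chunks, old, new):
    """Yield the input text piece by piece, with old replaced by new."""
    if not old:
        raise ValueError("old must be a non-empty string")
    keep = len(old) - 1      # longest partial match that can end a buffer
    buf = ""
    for chunk in chunks:
        buf += chunk
        out = []
        pos = 0
        # replace every complete match currently in the buffer
        while True:
            hit = buf.find(old, pos)
            if hit < 0:
                break
            out.append(buf[pos:hit])
            out.append(new)
            pos = hit + len(old)
        # emit what cannot belong to a later match, hold back the rest
        safe_end = max(pos, len(buf) - keep)
        out.append(buf[pos:safe_end])
        buf = buf[safe_end:]
        yield "".join(out)
    yield buf                # whatever was held back at end of input
```

Feeding it read_chunks(old_file) and writing each yielded piece to
the output file keeps memory use near the chunk size rather than the
file size.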

Not sure what regular expressions would have to do with the case.
I see it as a case of bunching.  Matching multiline text, as long
as you manage to get it in memory, is just as easy with string
operations as with re's if it's an exact match you're looking for --
re's would be useful if you needed more sophisticated matching,
but they still need the file's contents to be in memory.
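
As an example of "more sophisticated matching", here is a sketch
that ignores differences in leading and trailing spaces or tabs on
each line of the block.  The helper name flexible_pattern and the
sample text are assumptions made up for illustration.

```python
import re


def flexible_pattern(old_lines):
    """Compile a pattern matching old_lines line by line, whatever
    spaces or tabs surround each line."""
    parts = [r"[ \t]*" + re.escape(line.strip()) + r"[ \t]*"
             for line in old_lines.splitlines()]
    return re.compile(r"\n".join(parts))


text = "if x:\n        a = 1   \n        b = 2\nprint(a)\n"
pattern = flexible_pattern("a = 1\nb = 2")

# a function as the replacement avoids any special treatment of
# backslashes in the replacement string
new_text = pattern.sub(lambda match: "    a, b = 1, 2", text)
```

Note that the pattern is not anchored to line starts, so it could
also match in the middle of a longer line; tighten it if that
matters for your files.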


Alex



