doing hundreds of re.subs efficiently on large strings

nihilo exnihilo at NOmyrealCAPSbox.com
Tue Mar 25 16:46:04 EST 2003


I have a simple application that converts a file in one format to a file 
in another format using string.replace and re.sub. There are about 150 
replacements occurring on files that are on the order of a few KB. Right 
now, I represent the file contents as a single string and naively apply 
string.replace and re.sub, one after another, to replace all the 
substrings that I need to replace. I knew that I would eventually have 
to come up with a faster, more memory-friendly solution, but I wanted to 
implement the basic idea as simply as possible at first.
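
To make the current approach concrete, here is a minimal sketch of the 
kind of loop I mean. The rule lists and the name convert are made up for 
illustration (the real rules are different and there are about 150 of 
them), and I use the string method form of replace here:

```python
import re

# Hypothetical rules, for illustration only; the real list has about 150.
LITERAL_RULES = [
    ("<b>", "*"),
    ("</b>", "*"),
]
REGEX_RULES = [
    (re.compile(r"[ \t]+$", re.MULTILINE), ""),  # strip trailing whitespace
    (re.compile(r"\n{3,}"), "\n\n"),             # collapse runs of blank lines
]

def convert(text):
    # Every pass scans the whole string and builds a new copy of it.
    for old, new in LITERAL_RULES:
        text = text.replace(old, new)
    for pattern, repl in REGEX_RULES:
        text = pattern.sub(repl, text)
    return text
```

Each rule walks the entire contents and allocates a fresh string, so the 
cost grows with the number of rules times the size of the file.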

Now I am at a loss as to how to proceed. I need a StringBuffer 
equivalent that supports at least re.sub. The two alternatives to 
StringBuffer that I have seen are building a list and joining it into a 
string at the end, or using an array. Neither of these seems to be of 
much use to me, since I am doing more than just appending to the end of 
the string. Is there another pre-existing alternative that I'm 
overlooking? Or has anybody come up with a good solution for this issue?
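
For reference, the list-and-join idiom I am referring to looks roughly 
like the sketch below (the function name build_by_appending is just 
something I made up for this post). It avoids repeated copying when 
appending, but it gives me nothing to run re.sub against until the 
pieces have been joined:

```python
def build_by_appending(chunks):
    buf = []
    for chunk in chunks:
        buf.append(chunk)   # cheap: existing pieces are not copied
    return "".join(buf)     # a single copy is made at the very end
```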

tia,

nihilo




