string.join() vs % and + operators

Darrell news at dorb.com
Fri Apr 2 19:40:26 EST 1999


We have an application where many thousands of small strings are to be
inserted into and deleted from a very large buffer.
When you have 100k different small strings to replace in 4 MB of text,
string.join seems to work well.
I don't know where the trade-off lies, but you could also use re.sub.
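
For comparison, here is a rough sketch of what the re.sub route might look like. The helper name replace_all and the mapping argument are made up for illustration and are not from our application; it assumes the replacements can be expressed as a dict of literal strings.

```python
import re

def replace_all(text, mapping):
    """Replace every key of mapping found in text with its value, in one pass.

    Hypothetical helper: the name and signature are illustrative only.
    """
    if not mapping:
        return text
    # Try longer keys first so a key that is a prefix of another cannot shadow it.
    keys = sorted(mapping, key=len, reverse=True)
    pattern = re.compile("|".join(re.escape(k) for k in keys))
    # The callback looks up the replacement for whichever key matched.
    return pattern.sub(lambda m: mapping[m.group(0)], text)

result = replace_all("the cat sat on the mat", {"cat": "dog", "mat": "rug"})
```

With 100k distinct keys the alternation pattern gets very large, which is presumably where the trade-off against the join approach shows up.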

One problem I still don't know how best to avoid is extreme memory
consumption. When you have these large objects around, they can be referenced
from higher-level objects and not get destroyed until the program exits. We
should have thought about this from the beginning of our project.
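
A small sketch of the effect, using made-up Buffer and Document classes (neither is from our code). A weak reference is used only as a probe to see whether the buffer object is still alive.

```python
import gc
import weakref

class Buffer:
    """Hypothetical wrapper standing in for a very large text buffer."""
    def __init__(self, text):
        self.text = text

class Document:
    """Hypothetical higher-level object that keeps a reference to the buffer."""
    def __init__(self, buf):
        self.buf = buf

buf = Buffer("x" * 1000)
doc = Document(buf)
probe = weakref.ref(buf)   # does not keep the buffer alive by itself

del buf                    # drop our own name for the buffer
gc.collect()
still_alive = probe() is not None   # doc.buf still refers to it

doc.buf = None             # drop the last remaining reference
gc.collect()
freed = probe() is None
```

Deleting the local name is not enough here; the buffer only goes away once the higher-level object lets go of it too.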

def insertDeleteList(inbuf, l):
    """Insert and delete segments in a buffer.

    l is a list of (start, string, end) tuples and must be sorted.
    If start and end are equal, string is inserted at that point.
    If end > start, the range start..end is deleted and string takes its place.
    If end < start, the range end..start is duplicated after string.
    """
    splitBuf = []
    last = 0
    for start, s, end in l:
        splitBuf.append(inbuf[last:start])  # unchanged text before this edit
        splitBuf.append(s)                  # the string to insert
        last = end                          # advance past some buffer here
    splitBuf.append(inbuf[last:])           # whatever follows the last edit
    # A single join at the end; equivalent to string.join(splitBuf, '')
    return ''.join(splitBuf)
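
To make the three cases in the docstring concrete, here is a short worked example. So that it runs on its own it restates the function compactly under the made-up name insert_delete_list; the ten-character buffer is just sample input.

```python
def insert_delete_list(inbuf, edits):
    """Compact restatement of the function above (hypothetical name)."""
    parts = []
    last = 0
    for start, s, end in edits:
        parts.append(inbuf[last:start])
        parts.append(s)
        last = end
    parts.append(inbuf[last:])
    return "".join(parts)

buf = "0123456789"

inserted = insert_delete_list(buf, [(3, "abc", 3)])    # start == end: pure insert
deleted = insert_delete_list(buf, [(3, "", 6)])        # end > start: "345" is dropped
replaced = insert_delete_list(buf, [(3, "X", 6)])      # end > start, with new text
duplicated = insert_delete_list(buf, [(6, "-", 3)])    # end < start: "345" appears twice
combined = insert_delete_list(buf, [(2, "A", 2), (5, "", 7)])

print(inserted)    # 012abc3456789
print(deleted)     # 0126789
print(replaced)    # 012X6789
print(duplicated)  # 012345-3456789
print(combined)    # 01A234789
```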








