String concatenation - which is the fastest way ?

Steven D'Aprano steve+comp.lang.python at pearwood.info
Wed Aug 10 10:36:59 EDT 2011


przemolicc at poczta.fm wrote:

> Hello,
> 
> I'd like to write a python (2.6/2.7) script which connects to database,
> fetches hundreds of thousands of rows, concat them (basically: create XML)
> and then put the result into another table. Do I have any choice
> regarding string concatenation in Python from the performance point of
> view ? Since the number of rows is big I'd like to use the fastest
> possible library (if there is any choice). Can you recommend me something
> ?

For fast string concatenation, you should use the str.join method:

substrings = ['a', 'bb', 'ccc', 'dddd']
body = ''.join(substrings)
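
Applied to the task in the question, the same idea might look like the
sketch below: collect the fragments in a list and join once at the end.
The row layout (id, name), the element names and the rows_to_xml helper
are all hypothetical, made up for illustration; the real rows would come
from your database cursor.

```python
from xml.sax.saxutils import escape

def rows_to_xml(rows):
    # rows: any iterable of (id, name) tuples. This shape is an assumption
    # made for the example, not something taken from the original question.
    parts = ['<rows>']
    for row_id, name in rows:
        # escape() replaces &, < and > so the text is safe inside an element.
        parts.append('<row id="%d">%s</row>' % (row_id, escape(name)))
    parts.append('</rows>')
    # A single join at the end instead of repeated += in the loop.
    return ''.join(parts)

sample = [(1, 'spam'), (2, 'eggs & ham')]
xml_text = rows_to_xml(sample)
```

Appending to a list is cheap, and the one join call copies each fragment
only once.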

Using string addition in a loop, like this:

# Don't do this!
body = ''
for sub in substrings:
    body += sub

risks being *extremely* slow for large numbers of substrings. (To be
technical, repeated string addition can be O(N**2), while ''.join is
O(N).) How slow it actually is depends on many factors, including the
operating system's memory management and the Python version and
implementation, so repeated addition may be fast on one machine and slow
on another. Better to always use join, which is consistently fast.
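
If you want to see how the two approaches behave on your own machine, a
rough timing sketch along these lines may help. The function names, the
substring size and the repeat count are arbitrary choices for the
example, and the numbers will vary between systems; CPython in
particular can often extend the string in place, which may hide the
slowdown there.

```python
import timeit

def concat_with_join(substrings):
    return ''.join(substrings)

def concat_with_add(substrings):
    # The pattern warned against above, kept only for comparison.
    body = ''
    for sub in substrings:
        body += sub
    return body

# 100,000 short substrings; the sizes are an arbitrary choice.
substrings = ['x' * 10] * 100000

# Both approaches must build the same string.
assert concat_with_join(substrings) == concat_with_add(substrings)

join_time = timeit.timeit(lambda: concat_with_join(substrings), number=10)
add_time = timeit.timeit(lambda: concat_with_add(substrings), number=10)
print('join: %.4f s, +=: %.4f s' % (join_time, add_time))
```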

You should limit string addition to small numbers of substrings:

result = head + body + tail  # This is okay.


-- 
Steven



