String Concatenation O(n^2) (was: Re: Explaining Implementing a Binary Search Tree.)
Ian Kelly
ian.g.kelly at gmail.com
Mon Jun 16 12:41:05 EDT 2008
On Mon, Jun 16, 2008 at 4:34 AM, Bart Kastermans <bkasterm at gmail.com> wrote:
> This is interesting. I had never attempted to verify a big O
> statement
> before, and decided that it would be worth trying. So I wrote some
> code to
> collect data, and I can't find that it goes quadratic. I have the
> graph
> at
>
> http://kasterma.wordpress.com/2008/06/16/complexity-of-string-concatenation/
>
> It looks piecewise linear to me.
I don't think there's any question that it's quadratic. Just look at
the C code, and you'll see that every time you concatenate the entire
string has to be copied. Remember that the copy code uses memcpy from
the C library, though, so it's quite fast. If you're not seeing it
for relatively small strings, it's probably that the major time
component is the Python looping construct, not the copy operation.
I tried it at lengths of multiples of 10, and it remained linear up
until 1E6. At 1E7, it appears to turn quadratic. I tried 1E8, but it
hasn't finished yet. I expect it to take the better part of an hour.
>>> from timeit import Timer
>>> a = Timer("a += 'a'", "a = ''")
>>> a.timeit(10)
6.9841280492255464e-006
>>> a.timeit(100)
2.7936511287407484e-005
>>> a.timeit(1000)
0.00043525084856810281
>>> a.timeit(10000)
0.0049266038004134316
>>> a.timeit(100000)
0.046758456731367914
>>> a.timeit(1000000)
0.38496261396358022
>>> a.timeit(10000000)
20.320074199374631
Even though it doesn't become quadratic until 1E7, it's still up to an
order of magnitude faster to use str.join for smaller strings, though:
>>> b = Timer("''.join(['a'] * 1000000)")
>>> b.timeit(1)
0.044094989726346512
-Ian
More information about the Python-list
mailing list