O(n^2) is bad - can it be fixed?

Helen Dawson helend at accessone.com
Fri May 25 01:25:50 EDT 2001


After discoverting that lists actually are O(n^2), and after
hearing a rumour that they behaved badly on Win9x, I decided
to investigate.

It turns out that my previous tests had stopped just before the
quadratic nature of lists starts showing up. In my new test script
(pasted in at the bottom of this message) I append two million
items to a list. On NT you can see the performance steadily
degrade - each set of 10,000 items takes noticeably longer to
add, however it does finish. On Win9x the performance is
constant, until a wall is hit and the performance plummets.

A bit later, on Win9x, with about 1.5 million items in the list,
the program halts with an out of memory error. This happens
despite the fact that Python only has a few tens of Mbytes
allocated on my 128 Mbyte machine. It turns out that Python
actually runs out of address space! 2 GBytes of address space
swallowed up in a few minutes.

Here's what happens. The Python list keeps adding 100 items at
a time. Since Python uses realloc these arrays can grow without
being copied, which is nice and efficient. However, the default heap
size on Win9x is 4Mbytes. When the list doesn't fit then it starts
getting copied. Each time it gets copied the OS (or the C run-time)
allocates a heap that is big enough - barely. So, when the list grows
a few more times (64K probably?) it fills another heap. So, suddenly
the list is being frequently copied.

Worse yet, the old heaps don't get freed. There memory mostly
does, but the address space doesn't. I don't know if that's a Python
glitch or a C run-time glitch, but pretty soon the address space is
filled with about 500 4 Mbyte heaps, and the script halts. The
line of increasing large heaps that are all mostly empty is quite
amusing in my little memory viewer.


NT handles it better, but the performance still degrades.


The current behaviour is to add 10 items initially, and then
switch to adding 100 items. This strikes me as being a weak
approximation to proper geometric expansion, and I'm not sure
what the benefit's are. Increasing by some fraction of the
current size would be much more reliable. Instead of this:

static int
roundupsize(int n)
{
        if (n < 500)
                return ROUNDUP(n, 10);
        else
                return ROUNDUP(n, 100);
}

why not this:

static int
roundupsize(int n)
{
        if (n < 500)
                return ROUNDUP(n, 10);
        else
                return n + n / 10;
}

This method will avoid the quadratic behaviour, it will avoid
running out of address space on systems with imperfect
allocators, and it will waste, at most, 10% of memory. If
10% is too much to waste, change the /10 to /20.

The current system can run a hundred times slower than it
needs to, and can fail when it needn't. Is there any good
reason not to make the fix? Any program that adds more
than 1000 items to a list in small batches will benefit from
this fix, and there is virtually no downside.

Bruce/Helen Dawson

Alex Martelli wrote:

> "Helen Dawson" <helend at accessone.com> wrote in message
> news:3B0CAF2C.D744807A at accessone.com...
> > Thanks for all the replies. I learned several tricks.
> >
> > However I find myself more worried than ever. I started out
> > thinking that list was guaranteed to be O(n) for my problem,
> > but now it appears that it isn't - it's dependent on good realloc
> > behaviour. I think that's one thing the STL guys recognized
> > well - incrementing by any (more or less) constant size
> > can produce arbitrarily bad performance. A worst case
> > memory penalty of 2x for doubling the size of list each time
> > seems a fair price to pay for avoiding that
>
> Just one more reflection... often you can get a sensible
> estimate of how big your list will need to be, beforehand,
> particularly if you're willing to tolerate a 'slop' of a
> factor of 2 or so.  In this case, you can get substantially
> better performance by "preallocating" the list rather than
> using .append() on it.  If you wrap this into a UserList
> subclass, or your own sequence object, it's not too hard
> to keep client code blissfully unaware of the issue - it
> calls .append() anyway, and the append method implementation
> is the place where "the smarts" hide.
>
> It's unlikely that you'll get an actual performance boost
> from this approach, but you might get peace of mind:-).
>
> Alex

print """This program demonstrates that Python is very efficient when
it comes to appending to arrays - it takes linear time as you add more
items. However if you keep appending to a list the performance can
get arbitrarily bad - it is O(n^2) which means that appending ten
times as many characters takes one hundred times longer. Lists behave
much better than strings, but still fail unacceptably much earlier
than they need to.
"""

print "Written by Bruce Dawson"
import time

import array
def arrayappendtest(myarray, count=100):
    starttime = time.clock()
    for o in range(count):
        for i in range(10000):
            myarray.append(" ")
        endtime = time.clock()
        print "Array iteration %d/%d - time %f" % (o, count,
                                        endtime - starttime)
        starttime = endtime
    return myarray

def appendtest(stringorlist, count=100):
    starttime = time.clock()
    for o in range(count):
        for i in range(10000):
            stringorlist += " "
        endtime = time.clock()
        print "List iteration %d/%d - time %f" % (o, count,
                                        endtime - starttime)
        starttime = endtime
    return stringorlist

print "Appending 150,000 items to a list works fine, but even over"
print "a small number of iterations"
print "you can see that the time to add an item is not constant -"
print "it's proportional to the number of elements in the list."
print "Therefore, the time to add n items is O(n^2) - if n is"
print "reasonably large."
x = appendtest([], 15)
raw_input("Notice that lists slow down - press enter to continue")
print "Arrays take constant time to add items."
x = arrayappendtest(array.array('c'), 15)
raw_input("Notice arrays do not slow down - press enter to continue")
print "Since arrays scale well we can do lots of iterations quickly"
x = arrayappendtest(array.array('c'), 200)
raw_input("Notice arrays are still fast - press enter to continue")
print "This runs out of address space (not memory, address space)"
print "on Win9x. It ends up reserving a huge number of heaps, that"
print "are mostly empty."
print "It typically fails when the outer loop reaches around 290, at"
print "which point all 2 GBytes of user address space is reserved."
print "The performance gets very bad first."
print "A modest amount of memory is in use when the program fails."
x = appendtest([], 200)
raw_input("Success! - you're not running Win9x")






More information about the Python-list mailing list