sorting 1172026 entries

Chris Angelico rosuav at gmail.com
Sun May 6 21:02:57 EDT 2012


On Mon, May 7, 2012 at 10:31 AM, Cameron Simpson <cs at zip.com.au> wrote:
> I didn't mean per .append() call (which I'd expect to be O(n) for large
> n), I meant overall for the completed list.
>
> Don't the realloc()s make it O(n^2) overall for large n? The list
> must get copied when the underlying space fills. I confess to being
> surprised at how quick it went for me though. I suppose reallocing in
> chunks (eg to double the available size) might conceal this for a while.
> It should still be well over O(n) overall (not amortised per .append()
> call).

I haven't checked the CPython code, but the usual way to do
reallocations is to double the size (or add 50% or something) each
time, meaning that as n increases, the frequency of reallocations
decreases - hence the O(1) amortized time. In any case, it's the
recommended way to build large strings:

s = ""
while condition:
    s += "some_value"

versus:

s = []
while condition:
    s.append("some_value")
s = "".join(s)

so I would be very much surprised if this common wisdom is merely
replacing one O(n*n) operation with another O(n*n) operation.

ChrisA



More information about the Python-list mailing list