efficient idiomatic queue?

Mon Jan 14 23:19:05 EST 2002

In article <mailman.1011063021.30100.python-list at python.org>,
 "Tim Peters" <tim.one at home.com> wrote:

> It seems that your only objection to your original idea was that it's "very
> inefficient for long lists".  Why?  That is, if you're only trying to get
> across the idea of the algorithm, speed doesn't seem relevant ("assume these
> primitives take constant time; then here's the interesting result: ...").
> If you're actually worried about efficiency implications of details, I'd
> expect you'd be keen to teach about those too.

It's an algorithms design and analysis class.  If I'm going to tell the 
students that some algorithm takes O(n log n) time, it would be helpful if 
the code I show them actually takes O(n log n) instead of something slower 
that one could make efficient only with more detailed knowledge of Python's 
implementation details.

I realize implementation speed is not the primary concern of Python 
language designers, but for Python programmers I think Python's relatively 
slow speeds make it especially important to be careful about asymptotics: 
there's less slack for using a bad algorithm and letting today's gigahertz 
computers hide your inefficiencies.

> BTW, is it necessary to destory the input lists?  Rather than remove the
> element at the front of a list, most people <wink> would just increment an
> index vrbl.  Then all mutation occurs only via .append() on the output list,
> and your efficiency worry goes away.
> 
> saving-an-index-vrbl-isn't-worth-much-thought-ly y'rs  - tim

List destruction is not necessary.  The index variable approach is 
probably best in this example, since merge sort doesn't need the full 
power of a queue (each list has all its appends before it is scanned). I 
was just hoping that Python's powerful list processing primitives would 
let me simplify the code by avoiding those extra index variables.  And a 
queue is such a basic data structure that I thought surely Python already 
allowed it to be implemented efficiently with its built-in list 
operations.

Actually, the merge step in merge sort is a great example for simple 
generators, or would be if not for the difficulty of keeping track of the 
first item of each input iterator:

def merge(A,B):
        try:
                Afirst=A.next()
                Aempty=0
        except StopIteration:
                Aempty=1
        try:
                Bfirst=B.next()
                Bempty=0
        except StopIteration:
                Bempty=1
        while not Aempty and not Bempty:
                if Afirst < Bfirst:
                        yield Afirst
                        try:
                                Afirst=A.next()
                        except StopIteration:
                                Aempty=1
                else:
                        yield Bfirst
                        try:
                                Bfirst=B.next()
                        except StopIteration:
                                Bempty=1
        if not Aempty:
                yield Afirst
                while 1:
                        yield A.next()
        if not Bempty:
                yield Bfirst
                while 1:
                        yield B.next()

def mergesort(L):
        if len(L) < 2:
                return iter(L)
        else:
                return merge(mergesort(L[:len(L)//2]),
                             mergesort(L[len(L)//2:]))

This (like the closely related heap sort) has the advantage that if you 
start sorting n items, but only end up looking at the first k items of 
sorted output, it takes time O(n + k log n), so the time is nonlinear only 
when k is large. But this is way too much code for a blackboard, and the 
students are confused enough by slices...
-- 
David Eppstein       UC Irvine Dept. of Information & Computer Science
eppstein at ics.uci.edu http://www.ics.uci.edu/~eppstein/