Code question

Mark Wooding mdw at distorted.org.uk
Mon Apr 21 19:04:25 EDT 2008


<jyoung79 at kc.rr.com> <jyoung79 at kc.rr.com> wrote:

> If anyone has time, I was wondering if you could share your thoughts
> on whether this is an efficient way to do something like this, if it's
> horrible and slow, etc.

If your lists are fairly short then your algorithm is probably the best
way to do it, or close to.

If I really wanted to optimize it, I'd probably do this:

  * Firstly, preallocate the target rather than repeatedly appending.
    Python's list allocator is fairly clever, but in this case it's easy
    to help it by working out the final length in advance.

  * Secondly, can avoid checking which lists are still `live' in the
    inner loop by building an auxiliary table which contains lengths and
    list indices, and sorting it by increasing lengths.  Then you blat
    out items from all the live lists until the next one is exhausted.

The first tweak will always help, but never really very much.  The
second will win if the lists are long.  The tricky bit is keeping track
of which lists are live.  You can use an auxiliary Python list, but
maintaining it is tricky: you don't want to leave holes that you have to
test for in the main loop, and splicing items out by index is messy
because the indices will keep changing.  I use a doubly-linked
list. which feels strange in Python.

Something (loosely) like the following, maybe.  No, it's not pretty.

def combine(lists):

  ## Initial conditioning; quit if job is trivial.
  N = len(lists)
  if N == 0:
    return []

  ## Create the working list.  They have the form [PREV, NEXT, LIST].
  ## The sentinel acts as list head and tail.
  sentinel = [None, None, -1]
  work = [[sentinel, sentinel, lists[i]] for i in xrange(N)]
  for i in xrange(N):
    if i > 0: work[i][0] = work[i - 1]
    if i < N - 1: work[i][1] = work[i + 1]
  sentinel[0], sentinel[1] = work[-1], work[0]

  ## Remember the sequence of endings.
  finish = [(len(lists[i]), work[i]) for i in xrange(N)]
  finish.sort(key = lambda (n, i): n)

  ## Preallocate the destination.
  dest = [None] * sum([n for (n, i) in finish])

  ## Now do the main work.
  last = 0
  o = 0
  for (n, ww) in finish:
    for j in xrange(last, n):
      w = sentinel[1]
      while w is not sentinel:
        dest[o] = w[2][j]
        o += 1
        w = w[1]
    ww[1][0], ww[0][1] = ww[0], ww[1]
    last = n

  ## Done.
  return dest

-- [mdw]



More information about the Python-list mailing list