[Python-Dev] collections module

Tim Peters tim.one at comcast.net
Sun Jan 11 21:42:15 EST 2004


[Tim]
>> Neither case holds for a queue; the question that started this was
>> whether to provide a fancy queue implementation in C, and (of
>> course) Guido would rather see Python lists efficiently usable for
>> that purpose too.  I suspected that might not be a realistic hope,
>> and now I'm starting to *think* it too <wink>.

[Raymond Hettinger]
> It needn't be especially fancy.  Here is what I had in mind:
>
> n = 50
> class fifo(object):
>

This is a different issue.  There are many ways to *use* lists to build a
distinct fifo class.  What I've been writing about is the possibility of
making Python lists *directly* efficient for use as queues (the realistic
possibility of making list.append and list.pop(0) both at worst O(1)
amortized time).  That's what I remain skeptical about.
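To make the cost concrete, here is a sketch (mine, not CPython's actual C) of the work list.pop(0) implies today: every surviving element must shift left one slot, so each front-pop is O(n), while append stays amortized O(1).

```python
def pop0(lst):
    """What lst.pop(0) has to do conceptually: save item 0, then
    shift every remaining element left one slot -- O(len(lst)) work
    per call, versus amortized O(1) for lst.append()."""
    x = lst[0]
    for i in range(1, len(lst)):
        lst[i - 1] = lst[i]      # the shift is where the time goes
    del lst[-1]
    return x

q = list(range(5))
assert pop0(q) == 0 and q == [1, 2, 3, 4]
```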

>     def __init__(self):
>         # block[0:n] is for data and block[n] is a link field
>         self.head = self.tail = [None] * (n+1)

That *looks* so much like a common newbie coding error that it deserves a
comment.
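For anyone who doesn't see why: chained assignment binds both names to the *same* list object. That is exactly what an empty fifo wants here (head and tail start out on one shared block), but it is also the classic shared-mutable-state mistake when two independent lists were intended. A quick illustration:

```python
# The idiom Tim flags: chained assignment binds BOTH names to the
# SAME list object -- intentional in the fifo above (head and tail
# start on one block), but it looks like the classic aliasing error.
head = tail = [None] * 5
head[0] = "x"
assert tail[0] == "x"          # tail sees the mutation: same object
assert head is tail

# The superficially similar spelling creates two distinct lists:
head2, tail2 = [None] * 5, [None] * 5
head2[0] = "x"
assert tail2[0] is None        # independent objects
assert head2 is not tail2
```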

>         self.headptr = self.tailptr = 0
>
>     def push(self, x):
>         if self.headptr == n:
>             newblock = [None] * (n+1)
>             self.head[n] = newblock
>             self.head = newblock
>             self.headptr = 0
>         self.head[self.headptr] = x
>         self.headptr += 1

Please don't push at "the head" -- this reverses the common (and sensible)
meanings of "head" and "tail" for queues.  You join a queue at its tail, and
leave the queue when you reach its head.

>     def pop(self):
>         if self.tail is self.head and self.tailptr >= self.headptr:
>             raise IndexError
>         if self.tailptr == n:
>             self.tail = self.tail[n]
>             self.tailptr = 0
>         x = self.tail[self.tailptr]
>         self.tailptr += 1
>         return x
>
> The memory manager is called once for every 50 queue insertions and
> memory is freed after every 50 pops.  Outside of that, every push and
> pop is just a fast array access and pointer increment.  Long queues incur
> about a 15-20% space overhead as compared to a straight-list
> implementation.  Speedwise, this beats the socks off of the current
> approach.

Meaning pairing append() and pop(0)?  Sure -- it would be hard not to beat
that <wink>.  You can save more by ensuring the blocks are small enough for
pymalloc to satisfy the memory requests, more again by keeping a dedicated
freelist of nodes, and more again by switching to a circular implementation
when the queue appears to have reached steady-state.  You might also look at
the implementation of the Icon language, where the native list type is
implemented similarly as a linked list of blocks, and supports efficient
general deque operation.
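As a sketch of what "switching to a circular implementation" buys (my illustration; not Marc-Andre's code, and not tuned for pymalloc block sizes): once such a queue reaches steady state, push and pop are pure index arithmetic, with no memory-management calls at all.

```python
class RingQueue:
    """Sketch of a circular-array queue: in steady state, push and
    pop are index arithmetic only -- no allocator traffic at all."""

    def __init__(self, capacity=64):
        self.buf = [None] * capacity
        self.head = 0          # index of the next item to pop
        self.size = 0

    def push(self, x):
        if self.size == len(self.buf):
            self._grow()
        self.buf[(self.head + self.size) % len(self.buf)] = x
        self.size += 1

    def pop(self):
        if self.size == 0:
            raise IndexError("pop from empty queue")
        x = self.buf[self.head]
        self.buf[self.head] = None      # drop the reference promptly
        self.head = (self.head + 1) % len(self.buf)
        self.size -= 1
        return x

    def _grow(self):
        # Reallocation happens only while the queue is still growing;
        # a steady-state queue never reaches this branch again.
        newbuf = [None] * (2 * len(self.buf))
        for i in range(self.size):
            newbuf[i] = self.buf[(self.head + i) % len(self.buf)]
        self.buf, self.head = newbuf, 0
```

Like mxQueue as described above, this never shrinks; whether that matters depends on the app.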

Note that Marc-Andre's mxQueue implements a queue as a circular dynamic
array; when the queue grows at an enormous rate, he has to endure some
reallocs and moving, but in steady-state it can run forever without any
memory-management operations.  I'll note it looks like an mxQueue never
shrinks, though, which may or may not be "a problem", depending on the app.

As to "fancy", Marc-Andre ended up with over 1,000 lines of C, and his
implementation takes a simpler approach than yours.  OTOH, it added lots of
marginal features (like "q << x" adds x to q).

I've never needed a queue in Python where "the standard" two-list trick (as
shown in Josiah's Cookbook comment) wasn't more than good enough, so my
interest in this is really limited to whether Python's native list type can
(realistically) be made efficient for general deque operation.
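For reference, the two-list trick amounts to this (a minimal sketch; Josiah's Cookbook version may differ in detail): appends go onto one plain list, and pops come off the *end* of a second, reversed list, so each element is moved exactly once and both operations are O(1) amortized.

```python
class TwoListQueue:
    """FIFO built from two plain Python lists.  push() appends to
    `back`; pop() takes from the end of `front`, refilling it by
    reversing `back` when it runs dry.  Each element is reversed at
    most once, so both operations are O(1) amortized."""

    def __init__(self):
        self.front = []    # elements in reversed (pop-ready) order
        self.back = []     # newly pushed elements in arrival order

    def push(self, x):
        self.back.append(x)

    def pop(self):
        if not self.front:
            if not self.back:
                raise IndexError("pop from empty queue")
            self.back.reverse()                 # one O(k) pass per k pops
            self.front, self.back = self.back, []
        return self.front.pop()                 # pop from the end: O(1)
```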
