queue versus list

Cameron Simpson cs at cskk.id.au
Thu Mar 19 21:10:33 EDT 2020


On 20Mar2020 00:37, duncan smith <duncan at invalid.invalid> wrote:
>On 19/03/2020 20:40, MRAB wrote:
>> On 2020-03-19 20:08, duncan smith wrote:
>>>        I have generator code along the following lines,
>>>
>>> Q = queue.Queue()
>>> Q.put(x)
>>> while not Q.empty():
>>>      x = Q.get()
>>>      if <some condition that depends on x>:
>>>          yield x
>>>      else:
>>>        Q.put(<some value that depends on x>)
>>>
>>> If I change it to,
>>>
>>> Q = []
>>> Q.append(x)
>>> for x in Q:
>>>      if <some condition that depends on x>:
>>>          yield x
>>>      else:
>>>        Q.append(<some value that depends on x>)
>>>
>>> then it runs approximately 3 times more quickly.

A list is a very simple data structure, and therefore fast (for things 
that are inherently fast, like append; insert is expensive).

A queue takes a lock (cheap but an overhead) and might be expensive for 
get or put (this dependings on the data structure storing the queue 
items). A circular list (essentially a list using a base offset) can be 
fast until it needs resizing, a doubly linked list of nodes is always 
O(1) but is more expensive to add/remove (allocate node, adjust 
pointers); makye 3 times as slow :-) I haven't looked to see how Queue 
stores its items.

>>>I seem to remember
>>> (from somewhere) that the language specification doesn't guarantee that
>>> the latter will continue to work, but that it's unlikely that future
>>> implementations will break it.
[...snip...]
>> [MRAB] 1. I'm not sure it's a good idea to mutate a list while 
>> iterating over it.
>
>AFAICR that's the bit that depends on the implementation of lists. As
>long as iteration involves incrementing an index and indexing into the
>list it will work.

If you're the only person using the list, as you are only _appending_ to 
it then the for loop is probably safe.

If you were inserting elements into the list there would be trouble (an 
iterator basicly needs keep an index as you suggest below, and an insert 
at or before the current index invalidates that, causing the iterator to 
issue the same element again (for example).

>>> Apart from that, is there any good reason
>>> why I shouldn't use the list? Is there another approach that might avoid
>>> the overhead of using a queue? Results from profiling the two
>>> implementations below in case it makes anything clearer. TIA.

Using a list seems fine for me if your code is as simple in reality as 
listed above.

>> 2. The example with the list retains all of the results, whereas the 
>> one with the queue consumes them.
>>
>> 3. Queues are thread-safe because they're used for communication between
>> threads, but lists aren't thread-safe; that might be something to do
>> with the difference.
>>
>> 4. If it doesn't need to be thread-safe, why not try deques instead?
>
>Bingo. Performance is indistinguishable from that of the list.

A deque is implement using a list.

>Thread
>safety is unimportant (for my purposes), but the docs at
>https://docs.python.org/2/library/collections.html#collections.deque
>state "Deques support thread-safe, memory efficient appends and pops
>from either side of the deque with approximately the same O(1)
>performance in either direction". Looking at the output from the
>profiler for my implementation with the queue it looks like a deque is
>being used but there's various other stuff happening relating to thread
>safety. Is the lesson from this that if all you're doing is appending
>and popping, then use a deque (rather than a queue) even if you want
>thread safety? (It makes quite a difference in performance for my use
>case.) Cheers.

A deque will change the ordering of the elements you process. Your list 
loop behaves in FIFO order, as does the queue. A dequeue (you you're 
doing a heappush and a heappop) will issue you the lowest current 
element on each iteration. If you merely need to process all the 
elements you're good. If the order matters you need to consider that.

A deque, being thread safe, will be using a lock. There will be a (quite 
small) overhead for that.

Cheers,
Cameron Simpson <cs at cskk.id.au>


More information about the Python-list mailing list