[Python-Dev] collections module

Tim Peters tim.one at comcast.net
Fri Jan 9 16:07:04 EST 2004


[Josiah Carlson]
> There's been a python-based FIFO that is O(1) on {push, pop} and uses
> lists in the ASPN Python cookbook (adding len is easy):
> http://aspn.activestate.com/ASPN/Cookbook/Python/Recipe/210459
> You'd probably be more interested in the variant I posted as a
> comment; it is significantly faster when the output queue is
> exhausted.

Yup!  Yours is the nicest way to build an efficient (but thread-unsafe)
queue in Python.

> Speaking of which...the Queue module REALLY should be re-done with the
> Fifo above

Maybe; it probably doesn't matter; serious threaded apps pass maxsize to
Queue, generally equal to the # of worker threads (or a small constant
multiple thereof), and the bounded queue does double-duty then as a
flow-control throttle too; list.pop(0) isn't really slow when len(list) is
that small.

> and a single conditional,

Sorry, I don't know what that might mean.

> the multiple mutexes in the current Queue module are ugly as
> hell, and very difficult to understand.

self.mutex is dead easy to understand.  esema and fsema would be clearer if
renamed to not_empty and not_full (respectively), and clearer still (but
slower) if they were changed to threading.Conditions (which didn't exist at
the time Queue was written).  Regardless, there are 3 of them for good
reasons, although that may be beyond the understanding you've reached so
far:  mutex has to do with threading correctness, and esema and fsema have
to do with threading efficiency via reducing lock contention (so long as the
queue isn't empty, a would-be get'er can acquire esema instantly, and
doesn't care about fsema regardless; so long as the queue isn't full, a
would-be put'er can acquire fsema instantly, and doesn't care about esema
regardless; and in both of those "instantly" means "maybe" <wink>).

Note that you cannot, for example, start each method by acquiring
self.mutex, and have that be the only lock.  If you try that, then, e.g.,
someone doing a get on an emtpy queue-- and so needing to block --will hold
on to self.mutex forever, preventing any other thread from adding seomthing
to the queue (and the whole shebang deadlocks).

> ...
> It all really depends on the overhead of the underlying mutexes or
> conditionals, which can be tested.

I still don't know what "conditionals" means to you (you're counting "if"
statements, perhaps?), but mucking with mutexes isn't cheap.  You don't want
to bother with locks unless thread-safety is an important issue.




More information about the Python-Dev mailing list