Accuracy of multiprocessing.Queue.qsize before any Queue.get invocations?

Tim Chase python.list at tim.thechases.com
Thu May 12 19:07:02 EDT 2022


The documentation says[1]

> Return the approximate size of the queue. Because of
> multithreading/multiprocessing semantics, this number is not
> reliable.

Are there any circumstances under which it *is* reliable?  Most
germane, if I've added a bunch of items to the Queue, but not yet
launched any processes removing those items from the Queue, does
Queue.qsize accurately (and reliably) reflect the number of items in
the queue?

  import os
  from multiprocessing import Process, Queue

  q = Queue()
  for fname in os.listdir():
    q.put(fname)
  file_count = q.qsize() # is this reliable?
  # since this hasn't yet started fiddling with it
  for _ in range(os.cpu_count()):
    Process(target=myfunc, args=(q, arg2, arg3)).start()

I'm currently tracking the count as I add them to my Queue,

  file_count = 0
  for fname in os.listdir():
    q.put(fname)
    file_count += 1

but if .qsize is reliably accurate before anything has a chance to
.get data from it, I'd prefer to tidy the code by removing the
redundant counting code if I can.

I'm just not sure under what circumstances the "this number is not
reliable" caveat holds.  I get that things might be asynchronously
added/removed once processes are running, but is there anything that
would cause unreliability *before* other processes/consumers run?
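
For comparison, here is a minimal sketch of a third option that needs
neither the running counter nor qsize: build the list first and take
its length.  It assumes the full listing fits comfortably in memory,
and enqueue_all is a hypothetical helper name, not part of
multiprocessing.  It also sidesteps the documented case where qsize
may raise NotImplementedError on platforms such as macOS.

```python
import os
from multiprocessing import Queue


def enqueue_all(q, items):
    """Put every item on q and return how many were enqueued.

    Hypothetical helper for illustration; not part of multiprocessing.
    """
    items = list(items)  # materialize once so len() is exact
    for item in items:
        q.put(item)
    return len(items)


if __name__ == "__main__":
    q = Queue()
    # The count comes from the list itself, not from q.qsize()
    file_count = enqueue_all(q, os.listdir())
    print(file_count)
    # Drain the queue so the demo exits without leftover buffered items
    for _ in range(file_count):
        q.get()
```

The count here is taken from the list, so it stays the same no matter
when the consumers start.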

Thanks,

-tkc

[1]
https://docs.python.org/3/library/multiprocessing.html#multiprocessing.Queue.qsize






