[Tutor] Multiprocessing

Cameron Simpson cs at cskk.id.au
Tue Sep 22 00:03:25 EDT 2020


On 21Sep2020 21:19, Stephen M Smith <stephen.m.smith at comcast.net> wrote:
>I am trying to figure out how multiprocessing works for an application I
>have.  My methodology is to vigorously read everything I can find on the
>subject including examples. I have found an example and modified it a bit.
>Here is the code and it works:
[...]
>print("starting up")
>
>def do_something(seconds):
>    print(f'Sleeping {seconds} second(s)')
>    time.sleep(1)
>    return f"done sleeping...{seconds} seconds"
>
>
>if __name__ == '__main__':
>    print("about to go")
>    freeze_support()
>    # Process(target=do_something(1)).start
>    with concurrent.futures.ProcessPoolExecutor() as executor:
>        # f1 = executor.submit(do_something,1)
>        secs = [10,9,8,7,6,5,4,3,2,1]
>        results = [executor.submit(do_something, sec) for sec in secs]
>        for f in concurrent.futures.as_completed(results):
>            print(f.result())
>
>What I don't get is the output, also pasted below. As you will see the
>"starting up" message is displayed 17 times. I cannot begin to understand
>how that statement is executed more than once and 17 times makes no sense
>either. Thanks in advance for any insight you can provide.

Are you sure the above text is _exactly_ what you ran to get that
output? Let me explain why I ask:

I suspect that the sys.stdout buffer has not been flushed. If 
ProcessPoolExecutor works using the fork() OS call, each child is a 
clone of the parent process, including its in-memory state.

Since the sys.stdout buffer contains "starting up\n", each child process
inherits its own copy of that buffer, which is written to the output
when the child's sys.stdout is flushed.

My concern about the code is that I would expect to see the "about to
go" output replicated as well. And I do not.

I would also not expect this behaviour if the output is going to a tty
instead of a file, because sys.stdout is line buffered when attached to
a tty, so I would not expect this to happen to you _interactively_.

However, if stdout is going to a file, it will normally be block
buffered, which means the buffer is not actually written out until the
buffer fills, or until stdout is shut down when the programme (or child
programme) finishes.
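
A minimal sketch of checking which case applies, assuming Python 3's
standard text streams. The helper name describe_buffering is
hypothetical, and the fallback for streams lacking a line_buffering
attribute is an assumption made for this sketch:

```python
import sys


def describe_buffering(stream):
    """Return a rough description of how a text stream is buffered."""
    # isatty() is True when the stream is attached to a terminal.
    kind = "a tty" if stream.isatty() else "not a tty (file or pipe)"
    # io.TextIOWrapper exposes line_buffering; other stream types may
    # not, so this sketch assumes False when the attribute is missing.
    if getattr(stream, "line_buffering", False):
        mode = "line buffered"
    else:
        mode = "not line buffered (normally block buffered)"
    return f"{kind}, {mode}"


if __name__ == "__main__":
    # Report on stderr so the message is not held in stdout's own buffer.
    print(describe_buffering(sys.stdout), file=sys.stderr)
```

Running it once from a terminal and once with stdout redirected to a
file should show the two modes described above.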

So when block buffered, each child will have a copy of the unflushed
buffer waiting to go out, so you get one copy from the main programme
and one from each child.

Try adding:

    sys.stdout.flush()

after the print statements and before you start the ProcessPoolExecutor,
and see if the output changes.
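
The effect can be reproduced in isolation with a small sketch. It
assumes a POSIX system where os.fork() is available, and it forks
directly rather than going through ProcessPoolExecutor. The names
CHILD_SCRIPT and count_starting_up are made up for this illustration.
The inner script is run with its stdout connected to a pipe, so that
stdout is block buffered:

```python
import os
import subprocess
import sys

# Inner script: prints once, optionally flushes, then forks 3 children.
# Each child exits immediately, which flushes its copy of the buffer.
CHILD_SCRIPT = """
import os, sys
print("starting up")
if len(sys.argv) > 1:          # any extra argument means "flush first"
    sys.stdout.flush()
for _ in range(3):
    pid = os.fork()
    if pid == 0:
        sys.exit(0)            # child: normal exit flushes sys.stdout
    os.waitpid(pid, 0)         # parent: wait for the child to finish
"""


def count_starting_up(flush_first):
    """Run the inner script with piped stdout and count its messages."""
    args = [sys.executable, "-c", CHILD_SCRIPT]
    if flush_first:
        args.append("flush")
    # Drop PYTHONUNBUFFERED so the inner stdout really is block buffered.
    env = {k: v for k, v in os.environ.items() if k != "PYTHONUNBUFFERED"}
    done = subprocess.run(args, capture_output=True, text=True,
                          check=True, env=env)
    return done.stdout.count("starting up")


if __name__ == "__main__":
    print("without flush:", count_starting_up(False))
    print("with flush:   ", count_starting_up(True))
```

Without the flush, the message should appear once for the parent and
once for each of the 3 forked children, 4 times in all; with the flush
it should appear once.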

Cheers,
Cameron Simpson <cs at cskk.id.au>

