managing stdout and stderr neatly while doing parallel processing

Andrei D. dreico at wanadoo.fr
Tue Aug 5 16:30:04 EDT 2003


Hello Python newsgroup,

While developing a big ssh wrapper for sending commands to multiple hosts
over the last few months, I was very pleased to discover (almost
accidentally, considering I'm really just an "amateur hacker" :-) how to run
processes in parallel using Python. It is powerful technology to say the
least, applicable not only in my project but in lots of other areas as well.

Anyway, what I wanted to ask about was managing the output of stderr as well
as stdout, using select.select and your common-or-garden os.popen in this
case.

The script below demonstrates my problem (the real context is quite
different, but this keeps the explanation/background short and sweet for
now):

[0] user1/scripts/python> cat parallel7.py
#!/usr/bin/python

import os
import select

def keyboard_interrupt():
    print "<<<<< Keyboard Interrupt ! >>>>>\n"
    os._exit(1)

def getCommand(count):
    return "echo %i: ; ls kjfdjfkd ; ls -l parallel7.py" % (count)

def main():
    readPipes=[]
    for count in range(1,6):
        readPipes.append(os.popen(getCommand(count)))
    while 1:
        try:
            # Could put a timeout here if we had something else to do
            readable,writable,errors=select.select(readPipes,[],[])
            for p in readable:
                print p.read()
                readPipes.remove(p)
                p.close() # Waits for the child, so no zombies
            if len(readPipes)==0:
                break
        except KeyboardInterrupt:
            keyboard_interrupt()

if __name__=="__main__":
    main()

So ... the basic problem is that the error output from 'ls kjfdjfkd' does
not come out in the 'right order' ... observe:

[0] user1/scripts/python> ./parallel7.py
ls: kjfdjfkd: No such file or directory
ls: kjfdjfkd: No such file or directory
ls: kjfdjfkd: No such file or directory
1:
-rwxr-xr-x    1 user1    user1         814 Aug  5 22:10 parallel7.py

2:
-rwxr-xr-x    1 user1    user1         814 Aug  5 22:10 parallel7.py

ls: kjfdjfkd: No such file or directory
4:
-rwxr-xr-x    1 user1    user1         814 Aug  5 22:10 parallel7.py

5:
-rwxr-xr-x    1 user1    user1         814 Aug  5 22:10 parallel7.py

ls: kjfdjfkd: No such file or directory
3:
-rwxr-xr-x    1 user1    user1         814 Aug  5 22:10 parallel7.py

[0] user1/scripts/python>
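
A plausible explanation for the output above, offered as a sketch rather than a confirmed diagnosis: os.popen connects only the child's stdout to the pipe, so the error text from ls goes straight to the terminal as soon as it is produced, while stdout is held until the parent reads the pipe. Redirecting stderr into stdout inside the shell command (2>&1) puts both streams on the same pipe. The example below is written for Python 3 rather than the Python 2 of the script; the helper name run_merged and the path /no/such/path_kjfdjfkd are hypothetical, chosen for illustration.

```python
import os

def run_merged(command):
    """Run a shell command; return its stdout and stderr as one string."""
    # The braces group the commands so the redirect covers all of them.
    with os.popen("{ %s ; } 2>&1" % command) as pipe:
        return pipe.read()

# Hypothetical command: the middle step fails and writes to stderr.
output = run_merged("echo 1: ; ls /no/such/path_kjfdjfkd ; echo done")
print(output, end="")
```

With the redirect in place, the error line should appear between the two echo lines, because all three commands now write to the same pipe in sequence.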

In fact the stdout of commands 1 to 5 doesn't necessarily come out in the
correct order either (note 3 arriving last above), but I'll tackle that
separately at another time (unless it's of direct relevance here?). I guess
my question is really: how do you handle the different elements, i.e.

readable,writable,errors=select.select(readPipes,[],[])

in order to get an ordered output of errors as well, like you'd obviously
get doing a loop in the shell like so (even though this is of course a
sequential / not parallel operation):

[0] user1/scripts/python> for i in `seq 1 5` ; do echo $i ; ls sdfdskjsdj ; ls -l parallel7.py ; done
1
ls: sdfdskjsdj: No such file or directory
-rwxr-xr-x    1 user1    user1         814 Aug  5 22:10 parallel7.py
2
ls: sdfdskjsdj: No such file or directory
-rwxr-xr-x    1 user1    user1         814 Aug  5 22:10 parallel7.py
3
ls: sdfdskjsdj: No such file or directory
-rwxr-xr-x    1 user1    user1         814 Aug  5 22:10 parallel7.py
4
ls: sdfdskjsdj: No such file or directory
-rwxr-xr-x    1 user1    user1         814 Aug  5 22:10 parallel7.py
5
ls: sdfdskjsdj: No such file or directory
-rwxr-xr-x    1 user1    user1         814 Aug  5 22:10 parallel7.py
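
One possible way to get the same ordering while still running in parallel, sketched under assumptions and not a definitive answer: note that the third list returned by select.select reports exceptional conditions on the descriptors, not the children's stderr, so it does not help here. Instead, each child's stderr can be merged into its stdout pipe, the pipes read as they become ready, and the collected text printed in launch order once everything has finished. The sketch is Python 3 and uses subprocess in place of os.popen; the function name run_parallel and the path /no/such/path_kjfdjfkd are hypothetical.

```python
import os
import select
import subprocess

def run_parallel(commands):
    """Run shell commands concurrently; return their outputs in launch order."""
    procs = []
    for command in commands:
        # stderr=subprocess.STDOUT sends error text down the same pipe as stdout.
        procs.append(subprocess.Popen(command, shell=True,
                                      stdout=subprocess.PIPE,
                                      stderr=subprocess.STDOUT))
    fds = [p.stdout.fileno() for p in procs]
    chunks = dict((fd, []) for fd in fds)   # bytes collected per pipe
    pending = dict(zip(fds, procs))         # pipes that are still open
    while pending:
        readable, _, _ = select.select(list(pending), [], [])
        for fd in readable:
            data = os.read(fd, 4096)
            if data:
                chunks[fd].append(data)
            else:
                # An empty read means end of file; reap the child (no zombies).
                pending.pop(fd).wait()
    results = [b"".join(chunks[fd]).decode() for fd in fds]
    for p in procs:
        p.stdout.close()
    return results

# Hypothetical commands modelled on getCommand() in the script.
commands = ["echo %d: ; ls /no/such/path_kjfdjfkd ; echo done" % n
            for n in range(1, 6)]
outputs = run_parallel(commands)
for text in outputs:
    print(text, end="")
```

The trade-off in this design is that nothing is printed until every command has finished; printing each result as soon as its pipe closes would be faster to show but would bring back the completion-order shuffling.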

Any ideas or comments on this or any related issues would be much
appreciated. Perhaps pexpect will help? I suspect it may well do ...

Thanks,

A.