[Chicago] capturing output from subprocesses

Jess Balint jbalint at gmail.com
Fri Nov 11 21:13:42 CET 2005


What you are seeing here is stdio library buffering. When the output
is a terminal, you get line buffering; since we are using a pipe, the
library switches to block buffering. To get around this in a Python
program, you can use the '-u' command-line option or set the
environment variable PYTHONUNBUFFERED (both documented in man python).
This is the equivalent of calling setbuf(stdout, NULL) in C.
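
For illustration, a minimal sketch of forcing unbuffered output with
'-u' (using the subprocess module; the inline child program here is
made up for the example):

```python
import subprocess
import sys

# Hypothetical child program: prints a few lines to stdout. With the
# default block buffering over a pipe, these lines would sit in the
# child's stdio buffer until it filled or the process exited.
child = "for i in range(3): print('tick', i)"

# "-u" (equivalently, PYTHONUNBUFFERED=1 in the environment) makes the
# child's stdout unbuffered, so each line hits the pipe immediately.
proc = subprocess.Popen(
    [sys.executable, "-u", "-c", child],
    stdout=subprocess.PIPE,
    text=True,
)
out, _ = proc.communicate()
print(out.splitlines())   # ['tick 0', 'tick 1', 'tick 2']
```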

The non-blocking I/O is not relevant to any of this.

Jess

On 11/11/05, Noel Thomas Taylor <nttaylor at uchicago.edu> wrote:
>
> Hi Jess,
>
> I've been experimenting with the code you sent, in which the child kills
> itself, and its output is captured by the parent.
>
> But I have not been successful in having the parent kill the child in
> mid-run while still capturing whatever output it had produced up to that
> point.
>
> I added a timeout parameter to the select call. If the timeout happens
> before the child appears in the lists returned by "select", the parent
> kills the child. When I look at the results afterward, however, nothing
> has been sent to the output stream.
>
> I tried making the output and error streams non-blocking with 'fcntl'.
> I've seen this done in a lot of code on the internet where people
> discuss this kind of problem, but it has never helped my cause so far.
> Am I barking up the wrong tree with this?
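
For what it's worth, making a pipe's read end non-blocking with fcntl
looks roughly like this; note it only changes whether read() blocks,
not whether the child's stdio layer ever flushes:

```python
import fcntl
import os

r, w = os.pipe()

# Turn on O_NONBLOCK for the read end: os.read() now raises
# BlockingIOError (EAGAIN) instead of blocking when the pipe is empty.
flags = fcntl.fcntl(r, fcntl.F_GETFL)
fcntl.fcntl(r, fcntl.F_SETFL, flags | os.O_NONBLOCK)

os.write(w, b"hello\n")
print(os.read(r, 1024))   # data already in the pipe: b'hello\n'

try:
    os.read(r, 1024)      # pipe now empty: raises instead of blocking
except BlockingIOError:
    print("no data yet")
```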
>
> Please take a look at my code. I've attached 'my_child_process.py' which
> is just the script you sent me before plus the above modifications, and
> 'ticker.py' which prints the word 'tick' once a second for 10 seconds.
>
> In an ideal world, I'd be able to set the timeout for say, five seconds,
> and run the program. The program would then timeout after the child had
> produced five "ticks", and I'd be able to capture those "ticks" and still
> kill the child.
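
What Noel describes (read while the child runs, then kill it and keep
whatever arrived) can be sketched roughly as below. The inline ticker
stands in for ticker.py with shortened delays, and the child is run
with '-u' — without that, the buffering problem discussed above keeps
the pipe empty until the child exits:

```python
import os
import select
import signal
import subprocess
import sys
import time

# Stand-in for ticker.py: fifty ticks, shortened to 0.1 s apart.
ticker = "import time\nfor i in range(50):\n    print('tick')\n    time.sleep(0.1)"

proc = subprocess.Popen([sys.executable, "-u", "-c", ticker],
                        stdout=subprocess.PIPE)
fd = proc.stdout.fileno()

captured = b""
deadline = time.monotonic() + 0.8   # stand-in for the five-second timeout
while time.monotonic() < deadline:
    ready, _, _ = select.select([fd], [], [], 0.05)
    if ready:
        captured += os.read(fd, 4096)

proc.send_signal(signal.SIGTERM)    # kill the child mid-run
proc.wait()
captured += proc.stdout.read()      # drain anything still in the pipe
print(captured.decode().count("tick"))   # a handful of the fifty ticks
```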
>
> What do you think? Can this be done? Does 'fcntl' have anything to do
> with this problem? You sounded optimistic about it before, so I'm still
> holding out hope.
>
> with great thanks,
>
> Noel Taylor
>
> On Thu, 10 Nov 2005, Jess Balint wrote:
>
> > That's a valid concern. Initially there will be an increase in memory.
> > On most systems this is pretty efficient: all parent-process memory
> > is copy-on-write in the child. However, since you exec() so soon, the
> > memory is replaced by the image of the other program. So the end
> > result when you run the other program is no more memory than with
> > system() (possibly even less, since there is no /bin/sh overhead if
> > it's not a shell script). How many processes do you predict you will
> > be running at one time?
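
In outline, the fork()-then-exec() sequence being discussed looks like
this (a sketch; /bin/echo is just a stand-in for the real program):

```python
import os

pid = os.fork()
if pid == 0:
    # Child: at this point its memory is a copy-on-write view of the
    # parent's. exec* then replaces the whole image with /bin/echo,
    # so the duplicated pages never turn into real extra memory.
    os.execv("/bin/echo", ["echo", "hello from the child"])
else:
    # Parent: memory untouched; reap the child.
    _, status = os.waitpid(pid, 0)
    print("child exit code:", os.waitstatus_to_exitcode(status))
```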
> >
> > Jess
> >
> > On 11/10/05, Noel Thomas Taylor <nttaylor at uchicago.edu> wrote:
> >>
> >> Hi Jess,
> >>
> >> Thanks for this great example. I've been experimenting with it, and it
> >> could be the answer to my prayers. One question about forking: I know
> >> that when you do a fork() call, the code gets duplicated in memory,
> >> the parent gets the child's pid as the return value of fork(), and
> >> the child gets zero.
> >>
> >> But how much code gets duplicated, and can a single fork call
> >> significantly impact your memory? In your "child_process.py" for
> >> example, does the whole module get duplicated? If this function were
> >> just one in a giant file thousands of lines long, would that whole file
> >> get duplicated? If your code has a call to fork() in it, does that mean
> >> you should isolate it into a smaller module which you then import, or does
> >> that make a difference?
> >>
> >> Or maybe the duplication is virtual and the two processes are really
> >> occupying the same memory space?
> >>
> >> I'm sorry I can't make the meeting tonight.
> >>
> >> with thanks,
> >>
> >> Noel Taylor
> >>
> >>
> >>
> >> On Tue, 8 Nov 2005, Jess Balint wrote:
> >>
> >>> I made a prototype you can use. It's a simple combination of creating
> >>> pipe()s for stdout and stderr, then dup2()ing them into the streams
> >>> after a fork(), before an exec(). I've attached the Python and a test
> >>> shell script (that kills itself). You should be able to see it capture
> >>> the output. (If there is a problem with the attachments, I will put
> >>> them on a web site or something.)
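
The prototype's core (one pipe per stream, dup2() after fork(), then
exec()) looks roughly like this; the /bin/sh command stands in for the
attached test script:

```python
import os

# One pipe each for the child's stdout and stderr.
out_r, out_w = os.pipe()
err_r, err_w = os.pipe()

pid = os.fork()
if pid == 0:
    # Child: route fds 1 and 2 into the pipes, close the originals,
    # then replace the process image with the target program.
    os.dup2(out_w, 1)
    os.dup2(err_w, 2)
    for fd in (out_r, out_w, err_r, err_w):
        os.close(fd)
    os.execv("/bin/sh", ["sh", "-c", "echo out-line; echo err-line 1>&2"])

# Parent: close the write ends so reads see EOF once the child exits.
os.close(out_w)
os.close(err_w)
os.waitpid(pid, 0)
print(os.read(out_r, 4096))   # b'out-line\n'
print(os.read(err_r, 4096))   # b'err-line\n'
```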
> >>>
> >>> Jess
> >>>
> >>> On 11/8/05, Ian Bicking <ianb at colorstudy.com> wrote:
> >>>> Noel Thomas Taylor wrote:
> >>>>> Hi Ian,
> >>>>>
> >>>>> I could try that, but in the case of the real application whose output I
> >>>>> want to capture, I have no control over how much output it produces.
> >>>>
> >>>> I thought it would be an interesting test to understand exactly what is
> >>>> going on, even if it isn't exactly what you are expecting to receive.
> >>>>
> >>>>> Do you have any thoughts about recapturing the output of an aborted child
> >>>>> process before the memory that is buffering its output gets blown away?
> >>>>
> >>>> Since it is OS buffering, shouldn't the OS handle that for you?  I don't
> >>>> know; I would suggest starting with a test, then finding something that
> >>>> passes that test.
> >>>>
> >>>> --
> >>>> Ian Bicking  /  ianb at colorstudy.com  /  http://blog.ianbicking.org
> >>>> _______________________________________________
> >>>> Chicago mailing list
> >>>> Chicago at python.org
> >>>> http://mail.python.org/mailman/listinfo/chicago
> >>>>
> >>>
>

