New line conversion with Popen attached to a pty

Peter Otten __peter__ at web.de
Thu Jun 20 18:04:39 EDT 2013


jfharden at gmail.com wrote:

> Hi,
> 
> Sorry if this appears twice, I sent it to the mailing list earlier and the
> mail seems to have been swallowed by the black hole of email vagaries.
> 
> We have a class which executes external processes in a controlled
> environment and does "things" specified by the client program with each
> line of output. To do this we have been attaching stdout from the
> subprocess.Popen to a pseudo terminal (pty) made with pty.openpty and
> opened with os.fdopen. I noticed that we kept getting a bunch of extra new
> line characters.

Mixing subprocess and explicit select() looks a bit odd to me. Perhaps you 
should do it completely without subprocess. Did you consider pexpect?

> This is all using python 2.6.4 in a centos6 environment.
> 
> After some investigation I realised we needed to use universal_newlines
> support so I enabled it for the Popen and specified the mode in the fdopen
> to be rU. Things still seemed to be coming out wrong so I wrote up a test
> program boiling it down to the simplest cases (which is at the end of this
> message). The output I was testing was this:
> 
> Fake\r\nData\r\n
> as seen through hexdump -C:
> 
>> hexdump -C output.txt
> 00000000  46 61 6b 65 0d 0a 44 61  74 61 0d 0a              |Fake..Data..|
> 0000000c
> 
> Now if I do a simple subprocess.Popen and set the stdout to
> subprocess.PIPE, then do p.stdout.read() I get the correct output of
> 
> Fake\nData\n
> 
> When I do the Popen attached to a pty I end up with
> 
> Fake\n\nData\n\n
> 
> Does anyone know why the newline conversion would be incorrect, and what I
> could do to fix it? In fact if anyone even has any pointers to where this
> might be going wrong I'd be very grateful, I've done hours of fiddling with
> this and googling to no avail.
> 
> One liner to generate the test data:
> 
> python -c 'f = open("output.txt", "w"); f.write("Fake\r\nData\r\n");
> f.close()'
> 
> Test script:
> 
> #!/usr/bin/env python2.6.4
> import os
> import pty
> import subprocess
> import select
> import fcntl
> 
> class TestRead(object):
> 
>     def __init__(self):
>         super(TestRead, self).__init__()
>         self.outputPipe()
>         self.outputPty()
> 
>     def outputPipe(self):
>         p1 = subprocess.Popen(
>             ("/bin/cat", "output.txt"),
>             stdout=subprocess.PIPE,
>             universal_newlines=True
>         )
>         print "1: %r" % p1.stdout.read()
> 
>     def outputPty(self):
>         outMaster, outSlave = pty.openpty()
>         fcntl.fcntl(outMaster, fcntl.F_SETFL, os.O_NONBLOCK)
> 
>         p2 = subprocess.Popen(
>             ("/bin/cat", "output.txt"),
>             stdout=outSlave,
>             universal_newlines=True
>         )
> 
>         with os.fdopen(outMaster, 'rU') as pty_stdout:
>             while True:
>                 try:
>                     rfds, _, _ = select.select([pty_stdout], [], [], 0.1)
>                     break
>                 except select.error:
>                     continue
> 
>             for fd in rfds:
>                 buf = pty_stdout.read()
>                 print "2: %r" % buf
> 
> if __name__ == "__main__":
>     t = TestRead()

The "universal newlines" translation happens at the Python level, whereas the
subprocess writes its output through OS means (pipes or, here, the pty). The
pty gets "\r\n", leaves "\r" as is and replaces "\n" with "\r\n" (the ONLCR
terminal option). You end up with "\r\r\n", which "universal newlines" mode
interprets as a Mac newline followed by a DOS newline, hence the doubled "\n".
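
The effect can be reproduced without any child process. The following is a
minimal sketch, assuming Python 3 on a POSIX system (the thread itself uses
Python 2.6); the helper names through_pty and universal_newlines are made up
for illustration. ONLCR is normally on by default, but the sketch sets it
explicitly so the result does not depend on the platform's initial terminal
settings.

```python
import io
import os
import pty
import termios


def through_pty(data):
    """Write data to the slave side of a pty and return what the master sees."""
    master, slave = pty.openpty()
    try:
        attrs = termios.tcgetattr(slave)
        # attrs[1] is the output-flag word; with output processing (OPOST)
        # enabled, ONLCR maps each newline to carriage return + newline.
        attrs[1] |= termios.OPOST | termios.ONLCR
        termios.tcsetattr(slave, termios.TCSANOW, attrs)
        os.write(slave, data)
        return os.read(master, 1024)
    finally:
        os.close(slave)
        os.close(master)


def universal_newlines(raw):
    """Decode bytes the way a universal-newlines text stream would."""
    stream = io.TextIOWrapper(io.BytesIO(raw), encoding="ascii", newline=None)
    return stream.read()


raw = through_pty(b"Fake\r\nData\r\n")
print(repr(raw))                      # bytes as they arrive on the master
print(repr(universal_newlines(raw)))  # after universal-newlines translation
```

The first line printed should show an extra "\r" in front of each "\r\n", and
the second the doubled "\n" reported in the question.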

I see two approaches to fix the problem:

(1) Add an intermediate step to change newlines explicitly:

    p = subprocess.Popen(
        ["/bin/cat", "output.txt"],
        stdout=subprocess.PIPE
    )
    q = subprocess.Popen(
        # ["dos2unix"],  # alternative filter
        ["python", "-c",
         "import sys, os; "
         "sys.stdout.writelines(os.fdopen(sys.stdin.fileno(), 'rU'))"],
        stdin=p.stdout,
        stdout=outSlave)

(2) Fiddle with terminal options, e.g.

    import termios

    # attrs[1] holds the output flags; clearing ONLCR stops "\n" -> "\r\n"
    attrs = termios.tcgetattr(outSlave)
    attrs[1] = attrs[1] & ~termios.ONLCR | termios.ONLRET
    termios.tcsetattr(outSlave, termios.TCSANOW, attrs)

    p = subprocess.Popen(
        ("/bin/cat", "output.txt"),
        stdout=outSlave,
    )
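
Put together, approach (2) might look like the sketch below. It assumes
Python 3 on a POSIX system, and to stay self-contained the child is a small
Python one-liner standing in for "/bin/cat output.txt"; run_with_raw_newlines
is a hypothetical helper name. The sketch only clears ONLCR, which is the flag
responsible for the "\n" to "\r\n" mapping, and reads the master until the
child has closed its end.

```python
import os
import pty
import subprocess
import sys
import termios

# Stand-in for "/bin/cat output.txt": emits the same bytes as the test file.
CHILD = [sys.executable, "-c",
         "import sys; sys.stdout.buffer.write(b'Fake\\r\\nData\\r\\n')"]


def run_with_raw_newlines(cmd):
    """Run cmd with stdout on a pty whose ONLCR flag is cleared."""
    master, slave = pty.openpty()
    try:
        attrs = termios.tcgetattr(slave)
        attrs[1] &= ~termios.ONLCR  # attrs[1] is the output-flag word
        termios.tcsetattr(slave, termios.TCSANOW, attrs)
        proc = subprocess.Popen(cmd, stdout=slave)
    finally:
        os.close(slave)  # the parent does not need the slave side itself
    chunks = []
    try:
        while True:
            try:
                chunk = os.read(master, 1024)
            except OSError:
                break  # Linux reports EIO once the child has closed the pty
            if not chunk:
                break  # other platforms signal end of file with b""
            chunks.append(chunk)
    finally:
        os.close(master)
        proc.wait()
    return b"".join(chunks)


output = run_with_raw_newlines(CHILD)
print(repr(output))
```

With ONLCR cleared the bytes should arrive unchanged, so a later
universal-newlines translation yields single line breaks.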


Disclaimer: I found this by trial and error, so it may not be the "proper"
way.




