Why does the first loop go wrong with Python3

Tue May 19 09:16:13 EDT 2015

On 19 May 2015 at 13:24, Cecil Westerhof <Cecil at decebal.nl> wrote:
> I have the following code:
>     from __future__     import division, print_function
>
>     import subprocess
>
>     p = subprocess.Popen('ls -l', shell = True, stdout = subprocess.PIPE)
>     for line in iter(p.stdout.readline, ''):
>         print(line.rstrip().decode('utf-8'))
>
>     p = subprocess.Popen('ls -l', shell = True, stdout = subprocess.PIPE)
>     for line in p.stdout.readlines():
>         print(line.rstrip().decode('utf-8'))
>
> This works in Python2. (Both give the same output.)
>
> But when I execute this in Python3, then the first loop is stuck in a
> loop where it continually prints an empty string. The second loop is
> executed correctly in Python3.
>
> In the current case it is not a problem for me, but when the output
> becomes big, the second solution will need more memory. How can I get
> the first version working in Python3?

The problem is that Python 3 carefully distinguishes between the bytes
that you get when reading from the stdout of a process and text, which
must be decoded from those bytes. You're using iter(callable, sentinel)
with a sentinel value of ''. However, in Python 3 readline() returns
b'' at end of file, which never compares equal to '', so the loop
never terminates.

Consider:
$ python3
Python 3.2.3 (default, Feb 27 2014, 21:31:18)
[GCC 4.6.3] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> '' == b''
False

If you change the sentinel from '' to b'' the first loop will work
(in Python 2 as well, where b'' == '' is True).
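
As a minimal sketch of that fix: the helper name read_lines_with_sentinel
and the child command below are made up for illustration (a small Python
child process stands in for 'ls -l' so the output is predictable). Only
the b'' sentinel is the actual change.

```python
import subprocess
import sys


def read_lines_with_sentinel(args):
    """Return the child's output lines, read one at a time via iter()."""
    p = subprocess.Popen(args, stdout=subprocess.PIPE)
    lines = []
    # readline() returns bytes; at end of file it returns b'', which
    # equals the bytes sentinel and stops the iteration.
    for line in iter(p.stdout.readline, b''):
        lines.append(line.rstrip().decode('utf-8'))
    p.stdout.close()
    p.wait()
    return lines


# Hypothetical stand-in for 'ls -l': a child process printing two lines.
child = [sys.executable, '-c', "print('first'); print('second')"]
for text in read_lines_with_sentinel(child):
    print(text)
```

Because only one line is held at a time inside the loop, this keeps the
memory advantage over readlines(); the list here is just for returning
the result.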

However, the normal way to do this is to iterate over stdout directly,
which also reads one line at a time:

     p = subprocess.Popen('ls -l', shell=True, stdout=subprocess.PIPE)
     # Each iteration yields one line as bytes; the loop ends at EOF.
     for line in p.stdout:
         print(line.rstrip().decode('utf-8'))
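
If you would rather not call decode() on every line, one possible
variation (Python 3 only, and not something the original code does) is
to wrap the pipe in io.TextIOWrapper so that iteration yields str
directly. The helper name read_text_lines and the child command are
again hypothetical, and UTF-8 output is assumed.

```python
import io
import subprocess
import sys


def read_text_lines(args, encoding='utf-8'):
    """Return the child's output lines as str, decoded by a text wrapper."""
    p = subprocess.Popen(args, stdout=subprocess.PIPE)
    lines = []
    # TextIOWrapper decodes the byte stream, so each line is already str.
    # Leaving the with-block closes the wrapper and the underlying pipe.
    with io.TextIOWrapper(p.stdout, encoding=encoding) as text_stdout:
        for line in text_stdout:
            lines.append(line.rstrip())
    p.wait()
    return lines


# Hypothetical stand-in for 'ls -l': a child process printing two lines.
child = [sys.executable, '-c', "print('alpha'); print('beta')"]
for text in read_text_lines(child):
    print(text)
```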

--
Oscar