[Python-Dev] Re: subprocess - updated popen5 module

Jason Lunz lunz at falooley.org
Mon Oct 11 16:57:07 CEST 2004


astrand at lysator.liu.se said:
> One goal with subprocess is being able to write cross-platform
> applications. For example, it should be possible to open up www.python.org
> in Mozilla. The best way to write this is:
>
> subprocess.call(["mozilla", "http://www.python.org"])
>
> In this case, the list form is translated to the string form when running
> on Windows.

understood. I personally have a definite need for this in my programs,
so I know what's involved.

> There's a risk that UNIX users might expect UNIX shell-like quoting
> support rather than the MSC one, though.

exactly. I just see it confusing people. If someone wants simple string
handling on unix, then shell=True. problem solved. If they need the type
of portability referred to with your mozilla example, then they should
use the list form and let windows do list2cmdline() on it.

The converse case of using a string for windows and using cmdline2list()
on it for unix is a good try, but there's no guarantee that the actual
windows subprocess will even do the MSVCRT-compatible-cmdline2list()
conversion. Which leaves you in a very confusing situation indeed.

>> I can see that it's trying to be symmetric and orthogonal, but I don't
>> think that the result is worth it in this case. In what scenario is the
>> use of cmdline2list() really useful?
>
> I don't really have a good example.
>
> If we should remove the cmdline2list stuff, what should happen if the
> users passes a string on UNIX? Do you prefer:
>
> 1) Run through the shell (automatic shell=True).
> or
> 2) A ValueError raised.
>
> I guess alternative 1 is most intuitive. That would line up with
> popen2 as well.

Use of the shell should be explicit, not automatic, because of the usual
shell metacharacter security concerns. unix programmers used to doing
os.system('ls -l') will quickly learn that the subprocess way of doing
the same is subprocess.call('ls -l', shell=True). This has the added
benifit of making it obvious exactly what's happening.

I don't think that the only alternative to number 1) is to raise a ValueError.

What do you think of the below patch? Just listify bare strings on unix.  This
does exactly what it should when the string actually references a binary, and
gives a meaningful error when it doesn't. Even if the filename has strange
characters of some kind.

  $ echo '#! /bin/sh' > 'ls -l'
  $ echo 'echo foo' >> 'ls -l'
  $ cat 'ls -l'
  #! /bin/sh
  echo foo
  $ python
  Python 2.3.4 (#2, Sep 24 2004, 08:39:09)
  >>> from subprocess import call
  >>> call('./ls -l')
  Traceback (most recent call last):
    File "<stdin>", line 1, in ?
    File "subprocess.py", line 441, in call
      return Popen(*args, **kwargs).wait()
    File "subprocess.py", line 524, in __init__
      errread, errwrite)
    File "subprocess.py", line 942, in _execute_child
      raise child_exception
  OSError: [Errno 13] Permission denied
  >>>
  $ chmod +x 'ls -l'
  $ python
  Python 2.3.4 (#2, Sep 24 2004, 08:39:09)
  >>> from subprocess import call
  >>> call('./ls -l')
  foo
  0
  >>> call('./ls -blah')
  Traceback (most recent call last):
    File "<stdin>", line 1, in ?
    File "subprocess.py", line 441, in call
      return Popen(*args, **kwargs).wait()
    File "subprocess.py", line 524, in __init__
      errread, errwrite)
    File "subprocess.py", line 942, in _execute_child
      raise child_exception
  OSError: [Errno 2] No such file or directory

Nice, straightforward, easy to understand. And look how much code is removed --
I haven't even cleaned up the corresponding docs yet.

Index: subprocess.py
===================================================================
RCS file: /cvsroot/python-popen5/popen5/subprocess.py,v
retrieving revision 1.15
diff -u -p -r1.15 subprocess.py
--- subprocess.py	9 Oct 2004 10:11:06 -0000	1.15
+++ subprocess.py	11 Oct 2004 14:56:26 -0000
@@ -854,11 +854,11 @@ class Popen:
                            errread, errwrite):
             """Execute program (POSIX version)"""
 
+            if isinstance(args, types.StringTypes):
+                args = [args]
+
             if shell:
                 args = ["/bin/sh", "-c"] + args
-            else:
-                if isinstance(args, types.StringTypes):
-                    args = self.cmdline2list(args)
 
             if executable == None:
                 executable = args[0]
@@ -1051,93 +1051,6 @@ class Popen:
 
             self.wait()
             return (stdout, stderr)
-
-
-    def cmdline2list(cmdline):
-        """
-        Translate a command line string into a list of arguments, using
-        using the same rules as the MS C runtime:
-
-        1) Arguments are delimited by white space, which is either a
-           space or a tab.
-        
-        2) A string surrounded by double quotation marks is
-           interpreted as a single argument, regardless of white space
-           contained within.  A quoted string can be embedded in an
-           argument.
-    
-        3) A double quotation mark preceded by a backslash is
-           interpreted as a literal double quotation mark.
-
-        4) Backslashes are interpreted literally, unless they
-           immediately precede a double quotation mark.
-
-        5) If backslashes immediately precede a double quotation mark,
-           every pair of backslashes is interpreted as a literal
-           backslash.  If the number of backslashes is odd, the last
-           backslash escapes the next double quotation mark as
-           described in rule 3.
-        """
-
-        # See
-        # http://msdn.microsoft.com/library/en-us/vccelng/htm/progs_12.asp
-
-        # Step 1: Translate all literal quotes into QUOTE.  Justify number
-        # of backspaces before quotes.
-        tokens = []
-        bs_buf = ""
-        QUOTE = 1 # \", literal quote
-        for c in cmdline:
-            if c == '\\':
-                bs_buf += c
-            elif c == '"' and bs_buf:
-                # A quote preceded by some number of backslashes.
-                num_bs = len(bs_buf)
-                tokens.extend(["\\"] * (num_bs//2))
-                bs_buf = ""
-                if num_bs % 2:
-                    # Odd.  Quote should be placed literally in array
-                    tokens.append(QUOTE)
-                else:
-                    # Even.  This quote serves as a string delimiter
-                    tokens.append('"')
-
-            else:
-                # Normal character (or quote without any preceding
-                # backslashes)
-                if bs_buf:
-                    # We have backspaces in buffer.  Output these.
-                    tokens.extend(list(bs_buf))
-                    bs_buf = ""
-
-                tokens.append(c)
-
-        # Step 2: split into arguments
-        result = [] # Array of strings
-        quoted = False
-        arg = [] # Current argument
-        tokens.append(" ")
-        for c in tokens:
-            if c == '"':
-                # Toggle quote status
-                quoted = not quoted
-            elif c == QUOTE:
-                arg.append('"')
-            elif c in (' ', '\t'):
-                if quoted:
-                    arg.append(c)
-                else:
-                    # End of argument.  Output, if anything.
-                    if arg:
-                        result.append(''.join(arg))
-                        arg = []
-            else:
-                # Normal character
-                arg.append(c)
-
-        return result
-
-    cmdline2list = staticmethod(cmdline2list)
 
 
     def list2cmdline(seq):



More information about the Python-Dev mailing list