Sanitising arguments to shell commands (was: Waiting for a subprocess to exit)

Fri Aug 21 05:08:08 EDT 2009

Miles Kaufmann <milesck at umich.edu> writes:

> I would recommend avoiding shell=True whenever possible. It's used in
> the examples, I suspect, to ease the transition from the functions
> being replaced, but all it takes is for a filename or some other input
> to unexpectedly contain whitespace or a metacharacter and your script
> will stop working--or worse, do damage (cf. the iTunes 2 installer
> debacle[1]).

Agreed, and that's my motivation for learning about ‘subprocess.Popen’.

> Leaving shell=False makes scripts more secure and robust; besides,
> when I'm putting together a command and its arguments, it's as
> convenient to build a list (['mycmd', 'myarg']) as it is a string (if
> not more so).

Which leads to another issue:

I'm modifying a program that gets its child process command arguments
from three places:

* hard-coded text within the program (e.g. the command name, and
  context-specific arguments for the specific operation to be performed)

* user-customised options to be added to the command line

* filenames from the program's own command line

The hard-coded arguments can obviously be written directly as list
elements::

    command_args = ["foo", "--bar"]

The filenames to be processed can also be appended one item per
filename.

However, the user-customised options are specified by the user in a
configuration file, as a single string argument::

    [fooprogram]
    additional_args = --baz 'crunch cronch' --wobble

This works fine if the command line is constructed by dumb string
concatenation; but it fails when I try to construct a list of command
line arguments, because the option is a single string whose quoting has
not yet been interpreted.
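
To illustrate the failure (a minimal sketch; the variable names are only
for illustration): splitting the configured string on whitespace ignores
the quoting, so the quoted value is broken into two arguments and the
quote characters are passed through literally.

```python
# Illustration only: naive whitespace splitting of the configured string.
user_configured_args = "--baz 'crunch cronch' --wobble"

naive_args = user_configured_args.split()
print(naive_args)
# The quoted argument is torn apart and keeps its quote characters:
# ['--baz', "'crunch", "cronch'", '--wobble']
```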

It's quite reasonable for the user to expect to be able to put any
partial shell command line in that option and have it processed the way
the shell would process it, including any quoting or other escaping.

How can I take a string that is intended to be part of a command line,
representing multiple arguments and the shell's own escape characters as
in the above example, and end up with a sane command argument list for
‘subprocess.Popen’?

E.g.::

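    import configparser
    import optparse
    import subprocess
    import sys

    parser = optparse.OptionParser()
    # parse_args returns an (options, positional_args) pair.
    (options, args) = parser.parse_args(sys.argv[1:])
    filenames = args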

    config = configparser.ConfigParser()
    config.read([system_config_file_path, user_config_file_path])
    user_configured_args = config.get('fooprogram', 'additional_args')

    command_args = ["foo", "--bar"]
    somehow_append_each_argument(command_args, user_configured_args)
    command_args.extend(filenames)

    command_process = subprocess.Popen(command_args, shell=False)

The resulting ‘command_args’ list should be::

    ["foo", "--bar",
     "--baz", "crunch cronch", "--wobble",
     "spam.txt", "beans.txt"]

How can I write the ‘somehow_append_each_argument’ step to get that
result?
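
One possible approach, offered as a sketch rather than a definitive
answer: the standard library's ‘shlex.split’ splits a string using
POSIX-shell-like quoting and escaping rules. It does not perform
variable expansion, globbing, or any other shell processing, so this
assumes the user only needs quoting and escaping in the option. The
function body below is a guess at what the step could look like, and
the filenames are the example values from above.

```python
import shlex

def somehow_append_each_argument(command_args, args_text):
    """Split args_text using shell-like quoting rules and append each
    resulting argument to command_args, modifying the list in place."""
    command_args.extend(shlex.split(args_text))

command_args = ["foo", "--bar"]
somehow_append_each_argument(
    command_args, "--baz 'crunch cronch' --wobble")
command_args.extend(["spam.txt", "beans.txt"])
print(command_args)
```

The resulting list can then be passed to ‘subprocess.Popen’ with
shell=False, so no shell ever re-interprets the arguments.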

-- 
 \                “Every sentence I utter must be understood not as an |
  `\                      affirmation, but as a question.” —Niels Bohr |
_o__)                                                                  |
Ben Finney