[Python-Dev] updated PEP 389: argparse

Steven Bethard steven.bethard at gmail.com
Sat Oct 24 21:10:34 CEST 2009


Sorry for the delay, but I've finally updated PEP 389, the argparse
PEP, based on all the feedback from python-dev. The full PEP is below,
but in short, the important changes are:

* The getopt module will *not* be deprecated.
* In Python 2, the -3 flag will cause deprecation warnings to be
issued for optparse.
* No casually visible deprecation warnings for optparse are expected
until Jun 2013.
* Support for {}-format strings will be added when one of the
converters is approved. This can be done at any point though, so it
shouldn't hold up inclusion of argparse.

Steve

----------------------------------------------------------------------
PEP: 389
Title: argparse - new command line parsing module
Version: $Revision: 75674 $
Last-Modified: $Date: 2009-10-24 12:01:49 -0700 (Sat, 24 Oct 2009) $
Author: Steven Bethard <steven.bethard at gmail.com>
Status: Draft
Type: Standards Track
Content-Type: text/x-rst
Created: 25-Sep-2009
Python-Version: 2.7 and 3.2
Post-History: 27-Sep-2009, 24-Oct-2009


Abstract
========
This PEP proposes inclusion of the argparse [1]_ module in the Python
standard library in Python 2.7 and 3.2.


Motivation
==========
The argparse module is a command line parsing library which provides
more functionality than the existing command line parsing modules in
the standard library, getopt [2]_ and optparse [3]_. It includes
support for positional arguments (not just options), subcommands,
required options, options syntaxes like "/f" and "+rgb", zero-or-more
and one-or-more style arguments, and many other features the other
two lack.

The argparse module is also already a popular third-party replacement
for these modules. It is used in projects like IPython (the Scipy
Python shell) [4]_, is included in Debian testing and unstable [5]_,
and since 2007 has had various requests for its inclusion in the
standard library [6]_ [7]_ [8]_. This popularity suggests it may be
a valuable addition to the Python libraries.


Why aren't getopt and optparse enough?
======================================
One argument against adding argparse is that thare are "already two
different option parsing modules in the standard library" [9]_. The
following is a list of features provided by argparse but not present
in getopt or optparse:

* While it is true there are two *option* parsing libraries, there
  are no full command line parsing libraries -- both getopt and
  optparse support only options and have no support for positional
  arguments. The argparse module handles both, and as a result, is
  able to generate better help messages, avoiding redundancies like
  the ``usage=`` string usually required by optparse.

* The argparse module values practicality over purity. Thus, argparse
  allows required options and customization of which characters are
  used to identify options, while optparse explicitly states "the
  phrase 'required option' is self-contradictory" and that the option
  syntaxes ``-pf``, ``-file``, ``+f``, ``+rgb``, ``/f`` and ``/file``
  "are not supported by optparse, and they never will be".

* The argparse module allows options to accept a variable number of
  arguments using ``nargs='?'``, ``nargs='*'`` or ``nargs='+'``. The
  optparse module provides an untested recipe for some part of this
  functionality [10]_ but admits that "things get hairy when you want
  an option to take a variable number of arguments."

* The argparse module supports subcommands, where a main command
  line parser dispatches to other command line parsers depending on
  the command line arguments. This is a common pattern in command
  line interfaces, e.g. ``svn co`` and ``svn up``.


Why isn't the functionality just being added to optparse?
=========================================================
Clearly all the above features offer improvements over what is
available through optparse. A reasonable question then is why these
features are not simply provided as patches to optparse, instead of
introducing an entirely new module. In fact, the original development
of argparse intended to do just that, but because of various fairly
constraining design decisions of optparse, this wasn't really
possible. Some of the problems included:

* The optparse module exposes the internals of its parsing algorithm.
  In particular, ``parser.largs`` and ``parser.rargs`` are guaranteed
  to be available to callbacks [11]_. This makes it extremely
  difficult to improve the parsing algorithm as was necessary in
  argparse for proper handling of positional arguments and variable
  length arguments. For example, ``nargs='+'`` in argparse is matched
  using regular expressions and thus has no notion of things like
  ``parser.largs``.

* The optparse extension APIs are extremely complex. For example,
  just to use a simple custom string-to-object conversion function,
  you have to subclass ``Option``, hack class attributes, and then
  specify your custom option type to the parser, like this::

    class MyOption(Option):
        TYPES = Option.TYPES + ("mytype",)
        TYPE_CHECKER = copy(Option.TYPE_CHECKER)
        TYPE_CHECKER["mytype"] = check_mytype
    parser = optparse.OptionParser(option_class=MyOption)
    parser.add_option("-m", type="mytype")

  For comparison, argparse simply allows conversion functions to be
  used as ``type=`` arguments directly, e.g.::

    parser = argparse.ArgumentParser()
    parser.add_option("-m", type=check_mytype)

  But given the baroque customization APIs of optparse, it is unclear
  how such a feature should interact with those APIs, and it is
  quite possible that introducing the simple argparse API would break
  existing custom Option code.

* Both optparse and argparse parse command line arguments and assign
  them as attributes to an object returned by ``parse_args``.
  However, the optparse module guarantees that the ``take_action``
  method of custom actions will always be passed a ``values`` object
  which provides an ``ensure_value`` method [12]_, while the argparse
  module allows attributes to be assigned to any object, e.g.::

    foo_object = ...
    parser.parse_args(namespace=foo_object)
    foo_object.some_attribute_parsed_from_command_line

  Modifying optparse to allow any object to be passed in would be
  difficult because simply passing the ``foo_object`` around instead
  of a ``Values`` instance will break existing custom actions that
  depend on the ``ensure_value`` method.

Because of issues like these, which made it unreasonably difficult
for argparse to stay compatible with the optparse APIs, argparse was
developed as an independent module. Given these issues, merging all
the argparse features into optparse with no backwards
incompatibilities seems unlikely.


Deprecation of optparse
=======================
Because all of optparse's features are available in argparse, the
optparse module will be deprecated. The following very conservative
deprecation strategy will be followed:

* Python 2.7+ and 3.2+ -- The following note will be added to the
  optparse documentation:

    The optparse module is deprecated, and has been replaced by the
    argparse module.

* Python 2.7+ -- If the Python 3 compatibility flag, ``-3``, is
  provided at the command line, then importing optparse will issue a
  DeprecationWarning. Otherwise no warnings will be issued.

* Python 3.2 (estimated Jun 2010) -- Importing optparse will issue
  a PendingDeprecationWarning, which is not displayed by default.

* Python 3.3 (estimated Jan 2012) -- Importing optparse will issue
  a PendingDeprecationWarning, which is not displayed by default.

* Python 3.4 (estimated Jun 2013) -- Importing optparse will issue
  a DeprecationWarning, which *is* displayed by default.

* Python 3.5 (estimated Jan 2015) -- The optparse module will be
  removed.

Though this is slower than the usual deprecation process, with two
releases of PendingDeprecationWarnings instead of the usual one, it
seems prudent to avoid producing any casually visible Deprecation
warnings until Python 3.X has had some additional time to attract
developers.


Updates to getopt documentation
===============================
The getopt module will not be deprecated. However, its documentation
will be updated to point to argparse in a couple of places. At the
top of the module, the following note will be added:

  The getopt module is a parser for command line options whose API
  is designed to be familiar to users of the C getopt function.
  Users who are unfamiliar with the C getopt function or who would
  like to write less code and get better help and error messages
  should consider using the argparse module instead.

Additionally, after the final getopt example, the following note will
be added:

  Note that an equivalent command line interface could be produced
  with less code by using the argparse module::

    import argparse

    if __name__ == '__main__':
        parser = argparse.ArgumentParser()
        parser.add_argument('-o', '--output')
        parser.add_argument('-v', dest='verbose', action='store_true')
        args = parser.parse_args()
        # ... do something with args.output ...
        # ... do something with args.verbose ..


Deferred: string formatting
===========================
The argparse module supports Python from 2.3 up through 3.2 and as a
result relies on traditional ``%(foo)s`` style string formatting. It
has been suggested that it might be better to use the new style
``{foo}`` string formatting [13]_. There was some discussion about
how best to do this for modules in the standard library [14]_ and
several people are developing functions for automatically converting
%-formatting to {}-formatting [15]_ [16]_. When one of these is added
to the standard library, argparse will use them to support both
formatting styles.


Rejected: getopt compatibility methods
======================================
Previously, when this PEP was suggesting the deprecation of getopt
as well as optparse, there was some talk of adding a method like::

  ArgumentParser.add_getopt_arguments(options[, long_options])

However, this method will not be added for a number of reasons:

* The getopt module is not being deprecated, so there is less need.
* This method would not actually ease the transition for any getopt
  users who were already maintaining usage messages, because the API
  above gives no way of adding help messages to the arguments.
* Some users of getopt consider it very important that only a single
  function call is necessary. The API above does not satisfy this
  requirement because both ``ArgumentParser()`` and ``parse_args()``
  must also be called.


Out of Scope: Various Feature Requests
======================================
Several feature requests for argparse were made in the discussion of
this PEP:

* Support argument defaults from environment variables
* Support argument defaults from configuration files
* Support "foo --help subcommand" in addition to the currently
  supported "foo subcommand --help"

These are all reasonable feature requests for the argparse module,
but are out of the scope of this PEP, and have been redirected to
the argparse issue tracker.


Discussion: sys.err and sys.exit
================================
There were some concerns that argparse by default always writes to
``sys.err`` and always calls ``sys.exit`` when invalid arguments are
provided. This is the desired behavior for the vast majority of
argparse use cases which revolve around simple command line
interfaces. However, in some cases, it may be desirable to keep
argparse from exiting, or to have it write its messages to something
other than ``sys.err``. These use cases can be supported by
subclassing ``ArgumentParser`` and overriding the ``exit`` or
``_print_message`` methods. The latter is an undocumented
implementation detail, but could be officially exposed if this turns
out to be a common need.


References
==========
.. [1] argparse
   (http://code.google.com/p/argparse/)

.. [2] getopt
   (http://docs.python.org/library/getopt.html)

.. [3] optparse
   (http://docs.python.org/library/optparse.html)

.. [4] argparse in IPython
   (http://mail.scipy.org/pipermail/ipython-dev/2009-April/005102.html)

.. [5] argparse in Debian
   (http://packages.debian.org/search?keywords=python-argparse)

.. [6] 2007-01-03 request for argparse in the standard library
   (http://mail.python.org/pipermail/python-list/2007-January/592646.html)

.. [7] 2009-06-09 request for argparse in the standard library
   (http://bugs.python.org/issue6247)

.. [8] 2009-09-10 request for argparse in the standard library
   (http://mail.python.org/pipermail/stdlib-sig/2009-September/000342.html)

.. [9] Fredrik Lundh response to [6]_
   (http://mail.python.org/pipermail/python-list/2007-January/592675.html)

.. [10] optparse variable args
   (http://docs.python.org/library/optparse.html#callback-example-6-variable-arguments)

.. [11] parser.largs and parser.rargs
   (http://docs.python.org/library/optparse.html#how-callbacks-are-called)

.. [12] take_action values argument
   (http://docs.python.org/library/optparse.html#adding-new-actions)

.. [13] use {}-formatting instead of %-formatting
   (http://bugs.python.org/msg89279)

.. [14] transitioning from % to {} formatting
   (http://mail.python.org/pipermail/python-dev/2009-September/092326.html)

.. [15] Vinay Sajip's %-to-{} converter
   (http://gist.github.com/200936)

.. [16] Benjamin Peterson's %-to-{} converter
   (http://bazaar.launchpad.net/~gutworth/+junk/mod2format/files)


Copyright
=========
This document has been placed in the public domain.



..
   Local Variables:
   mode: indented-text
   indent-tabs-mode: nil
   sentence-end-double-space: t
   fill-column: 70
   coding: utf-8
   End:


More information about the Python-Dev mailing list