[Import-SIG] Idea: Autorun functionality for Python modules (redux)

Mon May 22 22:02:34 EDT 2017

(Note: Posting to import-sig as this isn't something I'm actively
planning to pursue myself any time soon, but I want to ensure we don't
accidentally block this possibility while working on the proposal to
make it possible to run extension modules as Python scripts.)

PEP 299 is an old rejected PEP proposing a special "__main__()"
function for Python modules: https://www.python.org/dev/peps/pep-0299/

Three main points were cited in its rejection:

- the name clash with "import __main__"
- the lack of a clear strategy for supporting both newer versions of
Python that supported automatic execution of a suitably named function
as well as older versions that required an "if __name__ ==
'__main__':" block
- the status quo wasn't seen as particularly broken and "it would be
more familiar to C/C++ programmers" wasn't a compelling argument for
adding a second way to do it

In an email discussion with Brandon Rhodes a few months back, he
lamented the apparent intransigence of the core developers on this
front. I pointed out that nobody had ever actually made a follow-up
proposal that specifically addressed the rationale applied in
rejecting PEP 299, and put together a sketch of what such a proposal
might look like.

The first two technical points can be handled by:

1. Using `__run__` as the special function name
2. Setting "__main__.__autorun__ = True" prior to main module
execution, and allowing a module to delete it or set `__autorun__ =
False` to turn off the default autorun behaviour

With those two special attributes defined, the autorun protocol would be:

    if getattr(main_module, "__autorun__", False):
        try:
            runmain = main_module.__run__
        except AttributeError:
            pass
        else:
            import sys
            sys.exit(runmain(sys.argv))
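
To make the protocol concrete, here is a minimal sketch that simulates
it in pure Python. The helper name `simulate_autorun` and the three
sample sources are hypothetical. It returns the exit status instead of
calling sys.exit purely so it can be tried at the REPL; a real
implementation would presumably live in the interpreter's startup
machinery rather than in user code.

```python
import types


def simulate_autorun(source, argv):
    """Execute `source` as a stand-in main module, then apply autorun.

    Returns the value that would be handed to sys.exit(), or None if
    the autorun step did not trigger. (Hypothetical helper, for
    illustration only.)
    """
    main_module = types.ModuleType("__main__")
    # Assumption: the interpreter sets the flag *before* running the module,
    # so the module's own code can delete it or set it to False.
    main_module.__autorun__ = True
    exec(source, main_module.__dict__)

    if getattr(main_module, "__autorun__", False):
        try:
            runmain = main_module.__run__
        except AttributeError:
            return None  # no __run__ defined: nothing to do
        return runmain(argv)
    return None  # module opted out of autorun


# A module that defines __run__ and leaves autorun enabled.
WITH_RUN = "def __run__(argv):\n    return len(argv)\n"
# A module that defines __run__ but turns autorun off.
OPTED_OUT = "__autorun__ = False\ndef __run__(argv):\n    return 1\n"
# A module with no __run__ at all.
NO_RUN = "x = 1\n"

for src in (WITH_RUN, OPTED_OUT, NO_RUN):
    print(simulate_autorun(src, ["prog", "arg"]))
```

Only the first sample triggers a call to `__run__`; the other two fall
through, which matches the "delete it or set it to False" opt-out
described above.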

Scripts that want to optionally invoke "__run__" explicitly for
compatibility with older Python versions can then check "__autorun__"
to see whether or not they need to start the application themselves:

    if __name__ == "__main__" and not globals().get("__autorun__"):
        import sys
        sys.exit(__run__(sys.argv))

With the technical objections handled, we can then ask what concrete
benefits a `def __run__(argv):` model might offer over the existing
`if __name__ == "__main__":` model:

    def __run__(argv):
        """CLI documentation goes here"""
        return 0

1. __run__ can go at the *top* of the script, rather than at the end,
giving a conventional "CLI function followed by supporting
definitions" structure
2. you gain access to sys.argv without having to import sys (testing &
REPL friendly!)
3. you can set the process return code without having to call sys.exit
(testing & REPL friendly!)
4. you can attach CLI docs to __run__, rather than forcing them into
the module level docstring
5. introspection tools can more readily discover modules that expose a
command line interface
6. command line scripts written this way are automatically easier to
test, since they're written in a functional style (argv goes in,
return code comes out)
7. the call-and-response functional structure is also likely to
provide a better long term base building block for interoperable CLI
frameworks

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia