Friday Finking: 'main-lines' are best kept short

Fri Sep 13 19:02:22 EDT 2019

On 13Sep2019 15:58, DL Neil <PythonList at DancesWithMice.info> wrote:
>Is it a good idea to keep a system's main-line* code as short as 
>possible, essentially consigning all of 'the action' to application and 
>external packages and modules?

Generally yes.

>* my choice of term: "main-line", may be taken to mean:
>- the contents of main(),
>- the 'then clause' of an if __name__ == __main__: construct,
>- a __main__.py script.

Taking these out of order:

I don't like "if __name__ == '__main__':" to be more than a few lines.  
If it gets past about 4 or 5 then I rip it out into a main() function 
and use:

  if __name__ == '__main__':
    sys.exit(main(sys.argv))

and put "def main(argv):" at the top of the module (where it is 
glaringly obvious).
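
A minimal sketch of that module shape, assuming a hypothetical greet()
helper as the reusable part (neither name comes from any real project).
main() takes argv and returns an exit status, so the two-line idiom above
is all that ever needs to sit under the "if":

```python
"""Hypothetical module: main() at the top, reusable pieces below it."""

import sys


def main(argv):
    """Command line entry point; returns an exit status for sys.exit()."""
    cmd = argv[0]
    args = argv[1:]
    if not args:
        # No names supplied: complain on stderr and report a usage error.
        print(f"Usage: {cmd} name...", file=sys.stderr)
        return 2
    for name in args:
        print(greet(name))
    return 0


def greet(name):
    """Reusable piece: importing the module gets you this with no side effects."""
    return f"Hello, {name}!"


# The "if __name__ == '__main__':" idiom shown above goes at the very
# bottom of the module, after everything main() depends on is defined.
```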

Once at that stage, whether you have a __main__.py or a "def main()" is 
based _entirely_ on whether this is a module or a package. There is no 
other criterion for me.

[... snip ...]
>Doesn't the author thus suggest that the script (main-line of the 
>program) should be seen as non-importable?

__main__.py is generally something you would never import, any more than 
you would want to import the _body_ of a main() function.  Particularly 
because it will run things that have side effects; a normal import 
should not.

>Doesn't he also suggest that the script not contain anything that 
>might be re-usable?

That is a very similar statement, or at least tightly tied in. If you 
can't import __main__.py because it actually runs the main programme, 
then you can't import it to make use of reusable things. Therefore 
reusable things should not have their definitions in __main__.py.

>Accordingly, the script calls packages/modules which are both 
>importable and re-usable.
>
>None of which discounts the possibility of having other 'main-lines' 
>to execute sub-components of the (total) application, should that be 
>appropriate.
>
>An issue with 'main-line' scripts is that they can become difficult to 
>test - or to build, using TDD and pytest (speaking personally). Pytest 
>is great for unit tests, and can be used for integration testing, but 
>the 'higher up' the testing pyramid we go, the less effectual it 
>becomes (please don't shoot me, pytest is still an indispensable 
>tool!) Accordingly, if 'the action' is pushed up/out to modules, this 
>will ease the testing, by access and by context!

Yes. So ideally your "main" should be fairly skeletal, calling out to 
components defined elsewhere.
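
To illustrate why that helps testing, here is a sketch in which a
hypothetical run_report() stands in for 'the action' (in a real project
it would be imported from a module). Because main() accepts argv and
returns a status, a test can call it directly without starting a process:

```python
import contextlib
import io


def run_report(names):
    """Stand-in for 'the action'; assumed to live in an importable module."""
    return [f"processed {name}" for name in names]


def main(argv):
    """Skeletal main-line: check arguments, delegate, return an exit status."""
    names = argv[1:]
    if not names:
        return 2
    for line in run_report(names):
        print(line)
    return 0


def test_main_without_arguments():
    # A usage error is reported through the return value.
    assert main(["prog"]) == 2


def test_main_delegates():
    # Capture stdout so the test can inspect what main() printed.
    out = io.StringIO()
    with contextlib.redirect_stdout(out):
        status = main(["prog", "a", "b"])
    assert status == 0
    assert out.getvalue().splitlines() == ["processed a", "processed b"]
```

The test functions follow the naming pytest collects by default, but they
are plain functions and need nothing from pytest itself.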

>To round things out, I seem to be structuring projects as:
>
>.projectV2
>-- README
>-- LICENSE
>-- docs (sub-directory)
>-- .git (sub-directory)
>-- etc
>-- __main__.py
[...]

I don't have a top level __main__.py in the project source tree; I 
_hate_ having python scripts in the top level because they leak into the 
import namespace courtesy of Python's sys.path including the current 
directory. __main__.py belongs in the package, and that is down a level 
(or so) from the main source tree.

[...]
>Part of making the top-level "projectV2" directory almost-irrelevant in 
>day-to-day dev-work is that __main__.py contains very little, typically 
>three stages:
>	1 config (including start logging, etc, as appropriate)
>	2 create the applications central/action object
>	3 terminate
>
>Nary an if __name__ == __main__ in sight (per my last "Wednesday 
>Wondering"), because "the plan" says there is zero likelihood of the 
>"main-line" being treated as a (re-usable) module! (and any 
>refactoring would, in any case, involve pushing such code out to a 
>(re-usable) module!

As alluded to earlier, the "if __name__ == '__main__':" is entirely an 
idiom to support main-programme semantics in a module. In a package you 
have a __main__.py and no need for the idiom.
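
A sketch of the package form, under the assumption of a throwaway
package named demo_pkg (hypothetical). The script below writes the two
files into a temporary directory and runs the package with "python -m",
pointing $PYTHONPATH at that directory. Note that __main__.py contains
no "if" at all; it just calls the importable main():

```python
import os
import subprocess
import sys
import tempfile
from pathlib import Path

# demo_pkg/__init__.py: the reusable, importable part.
INIT_PY = '''\
def main(argv):
    """Reusable entry point, importable as demo_pkg.main."""
    print("Hello, " + (argv[1] if len(argv) > 1 else "nobody") + "!")
    return 0
'''

# demo_pkg/__main__.py: the main-line, never imported, no idiom needed.
MAIN_PY = '''\
import sys
from . import main

sys.exit(main(sys.argv))
'''


def run_demo():
    """Build the throwaway package and run it with "python -m demo_pkg"."""
    with tempfile.TemporaryDirectory() as tmp:
        pkg = Path(tmp) / "demo_pkg"
        pkg.mkdir()
        (pkg / "__init__.py").write_text(INIT_PY)
        (pkg / "__main__.py").write_text(MAIN_PY)
        # Let the child interpreter find the package via its sys.path.
        env = dict(os.environ, PYTHONPATH=tmp)
        return subprocess.run(
            [sys.executable, "-m", "demo_pkg", "world"],
            env=env, capture_output=True, text=True,
        )
```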

>When it comes to execution, the command (excluding any 
>switches/options) becomes:
>
>	[~/Projects]$ python3 projectV2

And there's your use case for the top level __main__.py. I prefer:

  python3 -m projectv2

where the projectv2 package is found via the sys.path.

In production projectv2 would be installed somewhere sensible, and in 
development I've a little "dev" shell function which presumes it is in 
the project top level and sets $PATH, $PYTHONPATH etc to allow "python3 -m 
projectv2" to find the package. So in dev I go:

  dev python3 -m projectv2

The advantage here is that if I don't prefix things with "dev" I get the 
official installed projectv2 (whatever that means - it could easily be my 
personal ~/bin etc), and with the "dev" prefix I get the version in my 
development tree, so that I don't run the dev stuff by accident (which is 
one reason I eschew the virtualenv "activate" script - my command line 
environment should not be using the dev environment unless I say so, 
because "dev" might be broken^Wincompatible).
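
The "dev" function itself is a shell function and is not shown here. As
a rough Python rendering of the same idea, the sketch below prepends the
project top level to $PYTHONPATH and a "bin" subdirectory to $PATH for
one command only. The names dev_env() and dev(), the "bin" directory and
the exact variables touched are all assumptions for illustration:

```python
import os
import subprocess


def dev_env(top, environ=None):
    """Return a copy of the environment with the dev tree placed first."""
    env = dict(os.environ if environ is None else environ)
    # Assumption: the package sits directly under the project top level
    # and any dev scripts sit in a "bin" subdirectory of it.
    for var, path in (("PYTHONPATH", top), ("PATH", os.path.join(top, "bin"))):
        old = env.get(var)
        env[var] = path + os.pathsep + old if old else path
    return env


def dev(argv, top=None):
    """Run argv with the dev environment; return the command's exit status."""
    if top is None:
        # Like the shell function, presume we are in the project top level.
        top = os.getcwd()
    return subprocess.call(argv, env=dev_env(top))
```

Because the modified environment is passed to a single child process, the
calling shell's own environment is left alone, which is the point of the
prefix approach.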

>Which would also distinguish between project-versions, if relevant. 
>More importantly, changes to application version numbers do not 
>require any changes to import statements! (and when users don't wish 
>to be expected to remember version numbers "as well", use symlinks - 
>just as we do with python/python2/python3/python3.7...
>
>Note that it has become unnecessary to add the -m switch!

The -m switch is my friend. It says "obey the sys.path", so that I can 
control things courtesy of the sys.path/$PYTHONPATH.
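
One way to see which copy of a package the sys.path would select is to
ask the import machinery directly. This is a small sketch; where() is a
hypothetical helper name, and it reports on the sys.path of the process
it runs in, which may differ from that of a separate "python3 -m" run:

```python
import importlib.util


def where(name):
    """Return the file that sys.path resolves top-level `name` to, or None."""
    spec = importlib.util.find_spec(name)
    return None if spec is None else spec.origin
```

Called as where("projectv2") with and without the dev environment in
effect, it should show the development tree or the installed copy
respectively.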

Cheers,
Cameron Simpson <cs at cskk.id.au>