Retrieving the full command line

Steven D'Aprano steve+comp.lang.python at pearwood.info
Wed Jan 23 23:49:13 EST 2013


On Wed, 23 Jan 2013 10:01:24 +0000, Oscar Benjamin wrote:

> On 23 January 2013 03:58, Steven D'Aprano
> <steve+comp.lang.python at pearwood.info> wrote:
>> On Wed, 23 Jan 2013 00:53:21 +0000, Oscar Benjamin wrote:
>>
>>> On 22 January 2013 23:46, Steven D'Aprano
>>> <steve+comp.lang.python at pearwood.info> wrote: [SNIP]
>>>>
>>> The purpose of the -m option is that you can run a script that is
>>> located via the Python import path instead of an explicit file path.
>>> The idea is that if '/path/to/somewhere' is in sys.path then:
>>>     python -m script arg1 arg2
>>> is equivalent to
>>>     python /path/to/somewhere/script.py arg1 arg2
>>>
>>> If Python didn't modify sys.argv then 'script.py' would need to be
>>> rewritten to understand that sys.argv would be in a different format
>>> when it was invoked using the -m option.
>>
>> I don't think that it would be in a different format. Normally people
>> only care about sys.argv[1:], the actual arguments. argv[0], the name
>> of the script, already comes in multiple formats: absolute or relative
>> paths.
>>
>> Currently, if I have a package __main__.py that prints sys.argv, I get
>> results like this:
>>
>> steve at runes:~$ python3.3 /home/steve/python/testpackage/__main__.py ham spam eggs
>> ['/home/steve/python/testpackage/__main__.py', 'ham', 'spam', 'eggs']
>>
>>
>> which is correct, that's what I gave on the command line. But:
>>
>> steve at runes:~$ python3.3 -m testpackage ham spam eggs
>> ['/home/steve/python/testpackage/__main__.py', 'ham', 'spam', 'eggs']
>>
>>
>> The second example is lying. It should say:
>>
>> ['-m testpackage', 'ham', 'spam', 'eggs']
> 
> I don't know why you would expect this. I imagined that you would want
> 
> ['-m', 'testpackage', 'ham', 'spam', 'eggs']


No. argv[0] is intended to be the script being called, argv[1:] for the 
arguments to the script. Given the two choices:

1) Break every Python script that expects argv[1:] to be the arguments
   to the script, forcing them to decide whether they should look at
   argv[1:] or argv[2:] according to whether or not argv[0] == '-m'; 

or 

2) don't break anything, but make a very small addition to the semantics
   of argv[0] (was: "the path to the script", add "or -m and the name of
   the module/package") that won't break anyone's code;


there's practically no choice in the matter.
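
To make the difference concrete, here is a minimal sketch. The helper
script_args is a made-up name standing in for what most scripts do, and
the two "option" lists are the proposals from this thread, not anything
Python actually produces today.

```python
def script_args(argv):
    """Return the script's own arguments, the way most scripts slice them."""
    return argv[1:]


# What sys.argv currently holds for both invocations shown earlier.
current = ['/home/steve/python/testpackage/__main__.py', 'ham', 'spam', 'eggs']

# Option 1 (hypothetical): '-m' and the module name as separate entries.
option1 = ['-m', 'testpackage', 'ham', 'spam', 'eggs']

# Option 2 (hypothetical): only argv[0] changes.
option2 = ['-m testpackage', 'ham', 'spam', 'eggs']

print(script_args(current))  # ['ham', 'spam', 'eggs']
print(script_args(option1))  # ['testpackage', 'ham', 'spam', 'eggs']
print(script_args(option2))  # ['ham', 'spam', 'eggs']
```

Under option 1 the module name leaks into the script's arguments unless
the script special-cases argv[0] == '-m'. Under option 2 the usual
argv[1:] slice keeps working unchanged.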



> If the two were combined into one string I would expect it to at least
> be a valid argument list:
> 
> ['-mtestpackage', 'ham', 'spam', 'eggs']

Okay, fair point. I didn't consider that.

Note however that there is an ambiguity between calling "python -mspam" 
and calling a script literally named "-mspam". But that same ambiguity 
exists in the shell, so I don't consider it a problem. You cannot call a 
script named -mspam unless you use something like "python ./-mspam".


>> If you are one of the few people who care about argv[0], then you are
>> already dealing with the fact that the name of the executable script is
>> not always an absolute path and therefore can vary greatly from one
>> call to another. Hell, if you are on a system with soft links, the name
>> of the script in the command line is not even necessarily the name of
>> the module. So there's not much more effort involved in dealing with
>> one extra case:
> 
> Unless I've missed something sys.argv[0] is always a valid path to the
> script. Whether it is absolute or not shouldn't matter. 

Sure. But if you care about argv[0] (say, you want to pull out the name 
of the script at runtime, instead of hard-coding it), then you need to be 
aware that you could be given an absolute path, a relative path, a bare 
script name, or the path of a softlink to the file you actually care 
about. Adding one more trivially simple case is not a large burden.
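
As a rough sketch of what that handling might look like: the function
name script_name is invented for illustration, and the '-m' branch
covers only the hypothetical argv[0] format proposed in this thread,
which Python does not actually produce.

```python
import os


def script_name(argv0):
    """Best-effort script name, whatever form argv[0] arrived in."""
    if argv0.startswith('-m'):
        # Hypothetical '-m testpackage' / '-mtestpackage' form from this
        # thread. A file literally named '-mspam' would be misread here,
        # which is the same ambiguity the shell has.
        return argv0[2:].strip()
    # realpath makes the path absolute and resolves symlinks, so absolute
    # paths, relative paths, bare names and softlinks all end up here.
    return os.path.basename(os.path.realpath(argv0))


print(script_name('/home/steve/python/testpackage/__main__.py'))
print(script_name('./script.py'))
print(script_name('-m testpackage'))
```

The existing cases already need the realpath/basename dance; the extra
case is a single prefix check.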

People hardly ever care about argv[0]. At least, I don't think I ever 
have. But the OP does, and Python mangling argv[0] is causing him grief 
because it lies, claiming to have called the __main__.py of his package 
directly when in fact he called it with -m.


> For imported
> modules the path is available from __name__. For a script that is
> executed rather than imported __name__ == "__main__" but the path is
> accessible from sys.argv[0]. If you are one of those people who cares
> about sys.argv[0] then this is probably the value that you wanted it to
> contain.

I'm wary about guessing what people "probably" want, and therefore lying 
about what they actually got. That's DWIM coding, and that almost always 
ends in tears.


> If it were important for sys.argv to show how exactly the script was
> located and executed, then why not also include the 'python3.3' command
> line argument (the real argv[0])? sys.argv emulates the argv that e.g. a
> C program would get. The real command line used is not exactly the same
> since a Python script is not a directly executable binary, so Python
> processes the argument list before passing it through.

Also a good point. To some degree, we're constrained by backwards 
compatibility -- there's only so much we can change without breaking 
code, and setting argv[0] to the python executable instead of the script 
is too big a change.

In any case, you can get at least the path of the actual Python binary 
from sys.executable, and you can use sys.version (or equivalent) to 
determine which version of Python you're using. 
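
For example, a small sketch that collects those values in one place.
The function name interpreter_info is made up for illustration; the sys
attributes themselves are standard.

```python
import sys


def interpreter_info():
    """Collect what sys reports about the running interpreter."""
    return {
        # Path of the Python binary; may be empty or None if unknown.
        'executable': sys.executable,
        # (major, minor, micro) as a tuple of ints.
        'version': tuple(sys.version_info[:3]),
        # The script name as Python chose to present it.
        'argv0': sys.argv[0] if sys.argv else None,
    }


if __name__ == '__main__':
    for key, value in interpreter_info().items():
        print(key, '=', value)
```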

This does mean that you can't play dirty hacks like some C binaries do, 
where they change their behaviour depending on whether you call them via 
one path or another path. vi does that. But that's hardly a big loss.

Contrariwise, I don't believe that there is currently *any* way to 
distinguish between running a script with or without -m. That should be 
fixed.
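
For readers on later interpreters: this does not apply to the 3.3 used
above, but from Python 3.4 onwards the __main__ module carries a
__spec__ attribute that is None when a source file is run directly and
is set when the module was located through the import system, as -m
does. A minimal sketch follows; the function name is invented, and note
that running a directory or zipfile also populates __spec__, so this is
a hint rather than proof of -m.

```python
import sys


def located_via_import_system(main_module=None):
    """True if __main__ has a module spec (e.g. it was started with -m)."""
    if main_module is None:
        main_module = sys.modules['__main__']
    # Plain "python script.py", -c and the interactive prompt leave
    # __spec__ as None; "python -m pkg" sets it to the module's spec.
    return getattr(main_module, '__spec__', None) is not None


if __name__ == '__main__':
    print(located_via_import_system())
```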


-- 
Steven


