Newbie questions on import & cmd line run

Thu May 17 00:33:15 EDT 2012

On Wed, 16 May 2012 18:45:39 -0700, gwhite wrote:

> #! <what is supposed to go here?>
> # Filename: newbie00.py

"Supposed to"? Nothing -- it is completely optional.

#! ("hash-bang") lines currently do nothing on Windows machines; they are 
just comments. However, on Unix and Linux machines (and Mac OS X) the 
operating system reads that first line to decide which interpreter to use 
to execute the program if you run it directly. It is common to use 
something like:

#!/usr/bin/env python

but I stress that this is completely optional, and doesn't do anything on 
Windows.

> if __name__ == '__main__':
>     print 'This program was called from the \
> system command line.'
>     print __name__ + '.py'
> else:
>     print 'This program was imported on the \
> Python command line.'
>     print __name__ + '.py'
> 
> -----------------
> 
> If I run from the system (win cmd) command, I get:
> 
> C:\engineer\engruser\python>python  newbie00.py
> 
> This program was called from the system command line. __main__.py

The magic variable "__name__" is special in Python. When you run a Python 
module as a script from the command line, it gets set to "__main__". Note 
that there is no such file "__main__.py" (unless you have happened to 
create one yourself).

When you import a module, rather than run it, __name__ gets set to the 
actual filename of the module, minus the file extension.
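
If you want to see both cases side by side, here is a minimal sketch. It 
assumes Python 3.7 or later, and the module name "newbie_name_demo" is made 
up for the example. It writes a one-line file containing print(__name__), 
runs it as a script in a fresh interpreter, then imports the same file:

```python
import contextlib
import importlib
import io
import os
import subprocess
import sys
import tempfile


def name_when_run_and_imported(module_name="newbie_name_demo"):
    """Return what print(__name__) shows when a file is run vs. imported."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, module_name + ".py")
        with open(path, "w") as f:
            f.write("print(__name__)\n")

        # Case 1: run the file as a script in a fresh interpreter.
        result = subprocess.run(
            [sys.executable, path], capture_output=True, text=True, check=True
        )
        as_script = result.stdout.strip()

        # Case 2: import the same file as a module in this interpreter,
        # capturing whatever the module body prints.
        sys.path.insert(0, tmp)
        try:
            importlib.invalidate_caches()
            buffer = io.StringIO()
            with contextlib.redirect_stdout(buffer):
                importlib.import_module(module_name)
            as_import = buffer.getvalue().strip()
        finally:
            # Undo the changes made for the demonstration.
            sys.path.remove(tmp)
            sys.modules.pop(module_name, None)
    return as_script, as_import


print(name_when_run_and_imported())  # ('__main__', 'newbie_name_demo')
```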

> -----------------
> If I hit the run button in Sypder, I get (in the iPython command
> console):
> 
> In [70]: runfile(r'C:\engineer\engruser\python\newbie00.py', wdir=r'C:
> \engineer\engruser\python')
> This program was called from the system command line. __main__.py

I'm not sure what Spyder is. Is it part of IPython? You may need to 
consult the IPython or Spyder docs to find out exactly what tricks it 
plays in its interactive console.

> -----------------
> If I import on the iPython command, I get:
> 
> In [71]: import newbie00
> This program was imported on the Python command line. newbie00.py

In this case, __name__ is set to the module name (the file name less the 
file extension). Your script adds the .py at the end.

> -----------------
> If I import *again* on the iPython command, I get:
> 
> In [72]: import newbie00
> 
> In [73]:
> 
> <nothing that I can see>

That is correct. Python modules are only executed *once*, the first time 
they are imported. From then on, additional imports refer back to a 
cached module object.

When you say "import newbie00", the (highly simplified!) process is this:

* Python looks in sys.modules for the name "newbie00". If it finds
  something in the cache (usually a module object), it fetches that thing
  and assigns it to the variable "newbie00", and the import process is
  complete.

* But if it doesn't find anything in the cache, Python searches the
  locations listed in sys.path for a module, package, or library. That
  could mean any of:

  - newbie00.py   (source code)
  - newbie00.pyc  (compiled byte-code)
  - newbie00.pyo  (compiled optimized byte-code)
  - newbie00.pyw  (Windows only)
  - newbie00.dll  (Windows only C library)
  - newbie00.so   (Linux and Unix C library)

  as well as others (e.g. packages).

* If no module is found, Python raises ImportError. Otherwise the first
  module found is used.

* If a compiled module (e.g. newbie00.pyc) is found, and is no older than
  the source code (newbie00.py), then Python uses the pre-compiled file.

  (If the compiled module is older than the source module, it is ignored.)

* Otherwise Python parses the newbie00.py source code, compiles it to
  byte-code, and writes it to the file newbie00.pyc so that the next time
  the import will be faster.

* At this point, Python has a compiled chunk of byte-code. It then sets
  the special global variable __name__ to the module name (the file name
  less extension), and executes that code.

* If no fatal error occurs, Python now bundles the results of the
  executed code (any functions, classes, variables, etc.) into a module
  object, stores the module object in the cache sys.modules for next time,
  and finally assigns it to the name "newbie00".

There is a *lot* that goes on the first time you import a module, which 
is why Python tries really hard to avoid running modules unless you 
explicitly ask it to. So "import newbie00" only *executes* the code once 
per Python session. Subsequent imports use the cached version.
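
You can watch the cache at work with a small experiment. This is only a 
sketch: the module name "newbie_cache_demo" is invented for the example, 
and the file is written to a temporary directory. It imports the same 
module twice, records what each import prints, and checks that both 
imports return the object stored in sys.modules:

```python
import contextlib
import importlib
import io
import os
import sys
import tempfile


def import_twice(module_name="newbie_cache_demo"):
    """Import a throwaway module twice and capture what each import prints."""
    with tempfile.TemporaryDirectory() as tmp:
        with open(os.path.join(tmp, module_name + ".py"), "w") as f:
            f.write("print('module body is running')\n")

        sys.path.insert(0, tmp)
        try:
            importlib.invalidate_caches()
            outputs = []
            modules = []
            for _ in range(2):
                buffer = io.StringIO()
                with contextlib.redirect_stdout(buffer):
                    modules.append(importlib.import_module(module_name))
                outputs.append(buffer.getvalue())
            # Both imports should hand back the one cached module object.
            cached = sys.modules[module_name] is modules[0] is modules[1]
        finally:
            # Undo the changes made for the demonstration.
            sys.path.remove(tmp)
            sys.modules.pop(module_name, None)
    return outputs, cached


outputs, cached = import_twice()
print(outputs)  # ['module body is running\n', '']
print(cached)   # True
```

The second import prints nothing because the module body is not executed 
again; Python just looks the name up in sys.modules.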

The process is quite different when you run a Python module as a script. 
In this case, the .pyc file (if any) is ignored, the script is parsed and 
compiled from scratch every single time, the magic variable __name__ is 
set to "__main__", and the script is executed every single time.

[...]
> Some questions:
> 
> 1.  If running from the system command line, or the Sypder "run" button,
> "__name__" is "__main__" rather than "newbie00", as seen above.
> 
> So, how would I get the file name newbie00.py in these two noted cases? 

The special variable "__file__" is set to the filename:

>>> import string
>>> string.__file__
'/usr/lib/python2.6/string.pyc'

Note that __file__ will usually be set to the full path of the file. To 
extract just the file name:

>>> import os
>>> os.path.basename(string.__file__)
'string.pyc'

To split off the file extension, use this (the first item of the result 
is the bare name):

>>> os.path.splitext(os.path.basename(string.__file__))
('string', '.pyc')
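
Putting those two steps together, a small helper might look like this. 
The function name bare_module_name is my own invention, not something 
from the standard library:

```python
import os


def bare_module_name(path):
    """Strip the directory and the extension from a file path."""
    filename = os.path.basename(path)              # e.g. 'string.pyc'
    stem, _extension = os.path.splitext(filename)  # ('string', '.pyc')
    return stem


print(bare_module_name("/usr/lib/python2.6/string.pyc"))  # string
```

Inside your own script you would pass __file__ instead of a literal path. 
Note that __file__ is normally defined when a file is run or imported, but 
not when you are typing at the plain interactive prompt.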

[...]
> 2.  In python, there seems to be a distinction between running something
> as if it is a system command of "C:\...>python myPyFile.py" compared to
> simply entering that same ".py" file name directly on the python console
> command line.  In fact, the latter does not work unless the somewhat
> lengthy ">>> runfile(r'C:\... wdir=r'C:\...) stuff is entered (in
> iPython).  (I mean, my old MATLAB habit of simply entering ">>
> mfilename" on the command line seems to be quite wrong in python.)

That is correct. The Python interactive interpreter is not intended to be 
a full-blown shell. IPython is, so I wouldn't be surprised if there is a 
shorter version of the runfile(full pathname) stuff, but I don't know 
what it is.

> Is there a shortened syntax of running a .py from the python command
> prompt, if not using a Spyder "run" button?  Or should I always run as
> if from the system prompt?  That is, dispense with the MATLAB-like "run
> from MATLAB/python command line" bias I may be holding.

Normally I would say "always run it from the system prompt", but that may 
be because I don't know IPython and/or Spyder.

Another alternative is to write a "main function" in your script:

# Filename newbie01.py
def main():
    print("Doing stuff here...")

if __name__ == '__main__':
    main()
else:
    print("importing here")

Then you can do this:

>>> import newbie01
importing here
>>> newbie01.main()
Doing stuff here...
>>> newbie01.main()
Doing stuff here...

That would be my preference. Your mileage may vary.

> 3.  In injecting my old MATLAB bias of running via the command line ">>
> mfilename", I tried a tweak of  ">>>import newbie00".  That "sort of"
> worked, but only the first time.
> 
> Why did the subsequent run of ">>>import newbie00" print nothing?  I'm
> just trying to understand how python works.

Because modules are executed only the first time they are imported. See 
above.

> 4.  The final case shown of hitting the Spyder run button included this:
> 
> UMD has deleted: newbie00
> 
> What does that mean?  I noted that after this "automatic" deletion, I
> could do the ">>>import newbie00" once again and get the print.  (I did
> not show that above.)

That looks like a trick specific to Spyder and/or IPython.

> 5.  I think #4 implies an import can be removed.  (Yes/No?)  I am not
> sure why that would be desired, but I will ask how to remove an import,
> or to "refresh" the run, of that is the appropriate question.

Hmmm. Well, yes, technically they can, but my recommendation is that you 
don't, because doing so can lead to some ... interesting ... hard-to-
debug problems.

But now that you have chosen to ignore me *wink*, you can delete the 
module so as to allow it to be refreshed like this:

del newbie00  # delete the version you are using
import sys  # only needed once 
del sys.modules['newbie00']  # delete it from the cache

Now, as far as Python is concerned, newbie00 has never been imported. 
Probably.
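
Here is a rough sketch of what that deletion does, and of one reason for 
the "Probably". The module name "newbie_delete_demo" is made up, and the 
file lives in a temporary directory. After the cache entry is removed, the 
next import runs the module body again and builds a *new* module object, 
while anything still holding the old object keeps the old one:

```python
import contextlib
import importlib
import io
import os
import sys
import tempfile


def reimport_after_cache_delete(module_name="newbie_delete_demo"):
    """Import, drop the sys.modules entry, import again; report what happened."""
    with tempfile.TemporaryDirectory() as tmp:
        with open(os.path.join(tmp, module_name + ".py"), "w") as f:
            f.write("print('module body is running')\n")

        sys.path.insert(0, tmp)
        try:
            importlib.invalidate_caches()
            buffer = io.StringIO()
            with contextlib.redirect_stdout(buffer):
                first = importlib.import_module(module_name)
                del sys.modules[module_name]  # delete it from the cache
                second = importlib.import_module(module_name)
            runs = buffer.getvalue().count("module body is running")
            same_object = first is second
        finally:
            # Undo the changes made for the demonstration.
            sys.path.remove(tmp)
            sys.modules.pop(module_name, None)
    return runs, same_object


print(reimport_after_cache_delete())  # (2, False)
```

The False is the trap: other modules that imported the old version still 
refer to the old module object, so two different copies can be in use at 
the same time.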

Another way is with the built-in reload() function:

import newbie00
  # ... make some edits to the newbie00 source file
reload(newbie00)
  # ... now the changes will show up

but be warned that there are some traps to using reload(), so much so 
that in Python 3 it has been relegated to the "imp" (short for "import") 
module, where it is less tempting to newbies.
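
For readers on a newer Python 3: since Python 3.4 the same function is 
available as importlib.reload(), and the imp module itself was removed in 
Python 3.12. A minimal sketch follows. The module name 
"newbie_reload_demo" is invented for the example, and writing of .pyc 
files is switched off so that the edited source is always recompiled:

```python
import importlib
import os
import sys
import tempfile


def demo_reload(module_name="newbie_reload_demo"):
    """Import a throwaway module, edit its source, and reload it."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, module_name + ".py")
        with open(path, "w") as f:
            f.write("VALUE = 1\n")

        sys.path.insert(0, tmp)
        old_flag = sys.dont_write_bytecode
        sys.dont_write_bytecode = True  # avoid a stale .pyc in this demo
        try:
            importlib.invalidate_caches()
            module = importlib.import_module(module_name)
            before = module.VALUE

            # ... make some edits to the source file
            with open(path, "w") as f:
                f.write("VALUE = 22\n")

            module = importlib.reload(module)
            # ... now the changes show up
            after = module.VALUE
        finally:
            # Undo the changes made for the demonstration.
            sys.dont_write_bytecode = old_flag
            sys.path.remove(tmp)
            sys.modules.pop(module_name, None)
    return before, after


print(demo_reload())  # (1, 22)
```

reload() re-executes the module body in the *existing* module object, so 
names that other code copied out of the module beforehand (for example 
with "from module import VALUE") still hold the old values.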

> I think I saw someplace where a .pyc file is created on an initial run
> and subsequently run instead of the .py.  I'm not sure if that applies
> here, but if related, I guess an auxiliary question is how to easily
> force the .py to run rather than the .pyc?

No, see above: .pyc files are only created when you *import* a module, 
and they are never used when you *run* a module.

> 6.  Perhaps peripherally related to getting a running script/function/
> module name, is getting a "call listing" of all the functions (and
> modules) called by a .py program.  How would I get that?  I only ask as
> it comes in handy if one distributes a program.  I mean, you only give
> people what they actually need.

I'm not quite sure I understand what you mean. Can you explain in more 
detail?

Welcome on board with Python. I hope you have fun!

-- 
Steven