Newbie questions on import & cmd line run

Fri May 18 17:26:49 EDT 2012

On May 16, 9:33 pm, Steven D'Aprano <steve+comp.lang.pyt... at pearwood.info> wrote:
> On Wed, 16 May 2012 18:45:39 -0700, gwhite wrote:
> > #! <what is supposed to go here?>
> > # Filename: newbie00.py
>
> "Supposed to"? Nothing -- it is completely optional.
>
> #! ("hash-bang") lines currently do nothing on Windows machines, they are
> just comments. However, on Unix and Linux machines (and Macintosh?) they
> are interpreted by the shell (equivalent to cmd.exe or command.com), in
> order to tell the shell what interpreter to use to execute the program if
> you run it directly. It is common to use something like:
>
> #!/usr/bin/env python
>
> but I stress that this is completely optional, and doesn't do anything on
> Windows.
>
> > if __name__ == '__main__':
> >     print 'This program was called from the \
> > system command line.'
> >     print __name__ + '.py'
> > else:
> >     print 'This program was imported on the \
> > Python command line.'
> >     print __name__ + '.py'
>
> > -----------------
>
> > If I run from the system (win cmd) command, I get:
>
> > C:\engineer\engruser\python>python  newbie00.py
>
> > This program was called from the system command line. __main__.py
>
> The magic variable "__name__" is special in Python. When you run a Python
> module as a script from the command line, it gets set to "__main__". Note
> that there is no such file "__main__.py" (unless you have happened to
> create one yourself).
>
> When you import a module, rather than run it, __name__ gets set to the
> actual filename of the module, minus the file extension.
>
> > -----------------
> > If I hit the run button in Spyder, I get (in the iPython command
> > console):
>
> > In [70]: runfile(r'C:\engineer\engruser\python\newbie00.py', wdir=r'C:\engineer\engruser\python')
> > This program was called from the system command line. __main__.py
>
> I'm not sure what Spyder is. Is it part of iPython? You may need to
> consult the iPython or Spyder docs to find out exactly what tricks it
> plays in its interactive console.

Spyder is the IDE for pythonxy.  Since some other newbies here at my
office decided to go down the pythonxy route, I wanted to be on the
same page with them.  All of us were MATLAB users.  We're engineers,
not programmers.

> > -----------------
> > If I import on the iPython command, I get:
>
> > In [71]: import newbie00
> > This program was imported on the Python command line. newbie00.py
>
> In this case, __name__ is set to the module name (the file name less the
> file extension). Your script adds the .py at the end.
>
> > -----------------
> > If I import *again* on the iPython command, I get:
>
> > In [72]: import newbie00
>
> > In [73]:
>
> > <nothing that I can see>
>
> That is correct. Python modules are only executed *once*, the first time
> they are imported. From then on, additional imports refer back to a
> cached module object.
>
> When you say "import newbie00", the (highly simplified!) process is this:
>
> * Python looks in sys.modules for the name "newbie00". If it finds
>   something in the cache (usually a module object), it fetches that thing
>   and assigns it to the variable "newbie00", and the import process is
>   complete.
>
> * But if it doesn't find anything in the cache, Python searches the
>   locations listed in sys.path for a module, package, or library. That
>   could mean any of:
>
>   - newbie00.py   (source code)
>   - newbie00.pyc  (compiled byte-code)
>   - newbie00.pyo  (compiled optimized byte-code)
>   - newbie00.pyw  (Windows only)
>   - newbie00.dll  (Windows only C library)
>   - newbie00.so   (Linux and Unix C library)
>
>   as well as others (e.g. packages).
>
> * If no module is found, Python raises an error, otherwise the first
>   found module is used.
>
> * If a compiled module (e.g. newbie00.pyc) is found, and is no older than
>   the source code (newbie00.py), then Python uses the pre-compiled file.
>
>   (If the compiled module is older than the source module, it is ignored.)
>
> * Otherwise Python parses the newbie00.py source code, compiles it to
>   byte-code, and writes it to the file newbie00.pyc so that the next time
>   the import will be faster.
>
> * At this point, Python now has a compiled chunk of Python byte-code.
>   It then sets the special global variable __name__ to the file name (less
>   extension), and executes that code.
>
> * If no fatal error occurs, Python now bundles the results of the
>   executed code (any functions, classes, variables, etc.) into a module
>   object, stores the module object in the cache sys.modules for next time,
>   and finally assigns it to the name "newbie00".
>
> There is a *lot* that goes on the first time you import a module, which
> is why Python tries really hard to avoid running modules unless you
> explicitly ask it to. So "import newbie00" only *executes* the code once
> per Python session. Subsequent imports use the cached version.

okay.
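
The caching behaviour described above can be checked directly.  The
following is a minimal sketch, assuming Python 3; the module name
cache_demo_mod is made up for illustration, and the file is written to
a temporary directory so nothing in the working directory is touched.

```python
import importlib
import os
import sys
import tempfile

# Write a throwaway module (hypothetical name) that announces when its body runs.
module_dir = tempfile.mkdtemp()
with open(os.path.join(module_dir, "cache_demo_mod.py"), "w") as handle:
    handle.write("print('module body executed')\n")

sys.path.insert(0, module_dir)   # make the temporary directory importable
importlib.invalidate_caches()    # make sure the freshly written file is noticed

import cache_demo_mod            # first import: the module body runs and prints
first = sys.modules["cache_demo_mod"]

import cache_demo_mod            # second import: served from sys.modules, prints nothing
second = sys.modules["cache_demo_mod"]

print("cached in sys.modules:", "cache_demo_mod" in sys.modules)
print("same module object both times:", first is second)
```

The message from the module body should appear only once, even though
the import statement is executed twice.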

> The process is quite different when you run a Python module as a script.
> In this case, the .pyc file (if any) is ignored, the script is parsed and
> compiled from scratch every single time, the magic variable __name__ is
> set to "__main__", and the script is executed every single time.
>
> [...]
>
> > Some questions:
>
> > 1.  If running from the system command line, or the Spyder "run" button,
> > "__name__" is "__main__" rather than "newbie00", as seen above.
>
> > So, how would I get the file name newbie00.py in these two noted cases?
>
> The special variable "__file__" is set to the filename:
>
> >>> import string
> >>> string.__file__
> '/usr/lib/python2.6/string.pyc'
>
> Note that __file__ will usually be set to the full path of the file. To
> extract just the file name:
>
> >>> import os
> >>> os.path.basename(string.__file__)
> 'string.pyc'
>
> To ignore the file extension, use this:
>
> >>> os.path.splitext(os.path.basename(string.__file__))
> ('string', '.pyc')
>
> [...]

Thanks.
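
Putting those pieces together, a small helper can return just the
script name.  This is only a sketch: script_stem is a hypothetical
helper name and the example path is invented.  Inside a script run from
the system prompt, __file__ (or sys.argv[0]) would normally be passed
in instead.

```python
import os


def script_stem(path):
    """Return the file name of `path` without its directory or extension."""
    base = os.path.basename(path)              # e.g. 'newbie00.py'
    stem, _extension = os.path.splitext(base)  # ('newbie00', '.py')
    return stem


# Invented example path, built with os.path.join so it works on any platform.
example_path = os.path.join("some", "folder", "newbie00.py")
print(script_stem(example_path))  # newbie00
```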

> > 2.  In python, there seems to be a distinction between running something
> > as if it is a system command of "C:\...>python myPyFile.py" compared to
> > simply entering that same ".py" file name directly on the python console
> > command line.  In fact, the latter does not work unless the somewhat
> > lengthy ">>> runfile(r'C:\... wdir=r'C:\...) stuff is entered (in
> > iPython).  (I mean, my old MATLAB habit of simply entering ">>
> > mfilename" on the command line seems to be quite wrong in python.)
>
> That is correct. The Python interactive interpreter is not intended to be
> a full-blown shell. iPython is, so I wouldn't be surprised if there is a
> shorter version of the runfile(full pathname) stuff, but I don't know
> what it is.

Thanks for the note that iPython is a "full-blown shell."

> > Is there a shortened syntax of running a .py from the python command
> > prompt, if not using a Spyder "run" button?  Or should I always run as
> > if from the system prompt?  That is, dispense with the MATLAB-like "run
> > from MATLAB/python command line" bias I may be holding.
>
> Normally I would say "always run it from the system prompt", but that may
> be because I don't know iPython and/or Spyder.

Sure.  I actually think it is better upon some reflection.  I need to
un-MATLAB myself.
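
For running a file from the interactive prompt without importing it,
the standard library's runpy module is one option, and IPython has a
%run magic (as in "%run newbie00.py") for the same purpose.  A minimal
sketch with runpy, assuming a throwaway script named newbie_demo.py
written to a temporary directory:

```python
import os
import runpy
import tempfile

# Create a small script (hypothetical name) that records the __name__ it sees.
script_dir = tempfile.mkdtemp()
script_path = os.path.join(script_dir, "newbie_demo.py")
with open(script_path, "w") as handle:
    handle.write(
        "seen_name = __name__\n"
        "print('running with __name__ =', __name__)\n"
    )

# run_name="__main__" makes the script behave as if launched from the system prompt.
first_run = runpy.run_path(script_path, run_name="__main__")

# Unlike import, nothing is cached: calling run_path again executes the file again.
second_run = runpy.run_path(script_path, run_name="__main__")

# run_path returns the script's global namespace as a dictionary.
print(first_run["seen_name"])
```

Because the file is executed afresh on every call, edits to the script
show up on the next run without any reload step.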

> Another alternative is to write a "main function" in your script:
>
> # Filename newbie01.py
> def main():
>     print("Doing stuff here...")
>
> if __name__ == '__main__':
>     main()
> else:
>     print("importing here")
>
> Then you can do this:
>
> >>> import newbie01
> importing here
> >>> newbie01.main()
> Doing stuff here...
> >>> newbie01.main()
> Doing stuff here...
>
> That would be my preference. Your mileage may vary.

Neat.  I'll try that for cases where I think it makes sense.  I'll play
with it some.

> > 3.  In injecting my old MATLAB bias of running via the command line ">>
> > mfilename", I tried a tweak of  ">>>import newbie00".  That "sort of"
> > worked, but only the first time.
>
> > Why did the subsequent run of ">>>import newbie00" print nothing?  I'm
> > just trying to understand how python works.
>
> Because modules are executed only the first time they are imported. See
> above.
>
> > 4.  The final case shown of hitting the Spyder run button included this:
>
> > UMD has deleted: newbie00
>
> > What does that mean?  I noted that after this "automatic" deletion, I
> > could do the ">>>import newbie00" once again and get the print.  (I did
> > not show that above.)
>
> That looks like a trick specific to Spyder and/or iPython.

It happens when I have previously done the import on the iPython cmd
line.

After that is cleared out, iPython does not give that message again.

It must be doing this to make sure the version in the Spyder editor
window is the one being executed, or something like that.

Incidentally, it makes no difference if I delete the .pyc file made by
the import newbie00 command before running via Spyder/iPython.  Maybe
that means the imported version is compiled and held in memory (maybe
that is what you called the "cached module object").  I just don't
know.  But I think it makes some kind of sense to clear things out.

> > 5.  I think #4 implies an import can be removed.  (Yes/No?)  I am not
> > sure why that would be desired, but I will ask how to remove an import,
> > or to "refresh" the run, if that is the appropriate question.
>
> Hmmm. Well, yes, technically they can, but my recommendation is that you
> don't, because doing so can lead to some ... interesting ... hard-to-
> debug problems.
>
> But now that you have chosen to ignore me *wink*, you can delete the
> module so as to allow it to be refreshed like this:
>
> del newbie00  # delete the version you are using
> import sys  # only needed once
> del sys.modules['newbie00']  # delete it from the cache
>
> Now, as far as Python is concerned, newbie00 has never been imported.
> Probably.
>
> Another way is with the built-in reload() function:
>
> import newbie00
>   # ... make some edits to the newbie00 source file
> reload(newbie00)
>   # ... now the changes will show up
>
> but be warned that there are some traps to using reload(), so much so
> that in Python 3 it has been relegated to the "imp" (short for "import")
> module, where it is less tempting to newbies.
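
Both approaches can be tried side by side.  This is a minimal sketch,
assuming Python 3.4 or later, where reload() is available as
importlib.reload.  The module name reload_demo_mod and the helper
write_module are made up for illustration.

```python
import importlib
import os
import sys
import tempfile

# Avoid a stale .pyc being picked up when the source is rewritten within the same second.
sys.dont_write_bytecode = True

module_dir = tempfile.mkdtemp()
module_path = os.path.join(module_dir, "reload_demo_mod.py")


def write_module(value):
    """(Re)write the demo module so that it defines VALUE = value."""
    with open(module_path, "w") as handle:
        handle.write("VALUE = %d\n" % value)


write_module(1)
sys.path.insert(0, module_dir)
importlib.invalidate_caches()

import reload_demo_mod
value_before = reload_demo_mod.VALUE        # 1

# Approach 1: edit the source, then reload the existing module object in place.
write_module(22)
importlib.reload(reload_demo_mod)
value_after_reload = reload_demo_mod.VALUE  # 22

# Approach 2: drop the name and the cache entry, then import from scratch.
write_module(333)
del reload_demo_mod
del sys.modules["reload_demo_mod"]
import reload_demo_mod
value_after_reimport = reload_demo_mod.VALUE  # 333

print(value_before, value_after_reload, value_after_reimport)
```

Note that any other code still holding a reference to the old module
object, or to names imported from it, keeps the old values, which is
one source of the hard-to-debug problems mentioned above.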
>
> > I think I saw someplace where a .pyc file is created on an initial run
> > and subsequently run instead of the .py.  I'm not sure if that applies
> > here, but if related, I guess an auxiliary question is how to easily
> > force the .py to run rather than the .pyc?
>
> No, see above: .pyc files are only created when you *import* a module,
> and they are never used when you *run* a module.
>
> > 6.  Perhaps peripherally related to getting a running script/function/
> > module name, is getting a "call listing" of all the functions (and
> > modules) called by a .py program.  How would I get that?  I only ask as
> > it comes in handy if one distributes a program.  I mean, you only give
> > people what they actually need.
>
> I'm not quite sure I understand what you mean. Can you explain in more
> detail?

Oh, on reflection, I think I am MATLAB biasing again.  The rudimentary
method is that there is an "m-file per function."  Those are
then called as inline commands.  They are not "imported."  Basically,
it might be a pain to know for sure every one of the functions you
called.  In MATLAB, if I want to send someone a "program," I also need
to know all the functions it calls outside the "standard package."  It
isn't obvious given that they don't need to be "imported."

But I think the import method with python may make things much more
obvious.  So I can relax a bit.

I was meaning like this:

http://www.mathworks.com/help/techdoc/ref/depfun.html
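
For what it's worth, the Python standard library has a rough
counterpart to depfun in the modulefinder module.  It lists the modules
a script imports (modules only, not individual functions, and it can
miss imports that are done dynamically).  A minimal sketch, assuming
two made-up files, main_script.py and helper_mod.py, in a temporary
directory:

```python
import os
import sys
import tempfile
from modulefinder import ModuleFinder

# Build a tiny two-file "program" (hypothetical names) to analyse.
work_dir = tempfile.mkdtemp()
with open(os.path.join(work_dir, "helper_mod.py"), "w") as handle:
    handle.write("def helper():\n    return 42\n")

script_path = os.path.join(work_dir, "main_script.py")
with open(script_path, "w") as handle:
    handle.write("import helper_mod\nprint(helper_mod.helper())\n")

# The script's own directory has to be put on the search path explicitly.
finder = ModuleFinder(path=[work_dir] + sys.path)
finder.run_script(script_path)  # analyses the imports; the script itself is not executed

# finder.modules maps module names to details; '__main__' is the script itself.
found = sorted(name for name in finder.modules if name != "__main__")
print(found)
```

For a real program the list also includes every standard-library module
that gets pulled in, so it usually needs filtering before it is useful
as a "what do I have to ship" list.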

> Welcome on board with Python, I hope you have fun!

Thanks, Steven!  Thank you for spending your valuable time explaining
things to me.