try/except KeyError vs "if name in ..."

Sat Oct 6 07:36:11 EDT 2012

On 10/06/2012 02:27 AM, Manuel Pégourié-Gonnard wrote:
> Hi,
>
> I was looking at the example found here [1] which begins with:
>
> [1] http://docs.python.org/py3k/library/imp.html#examples
>
> def __import__(name, globals=None, locals=None, fromlist=None):
>     # Fast path: see if the module has already been imported.
>     try:
>         return sys.modules[name]
>     except KeyError:
>         pass
>
> I was wondering if the formulation
>
>     if name in sys.modules:
>         return sys.modules[name]
>
> would be equivalent. IOW, is using try/except here only a matter of
> style or a necessity?
>
> I'm suspecting that maybe, in multithreaded environments, the second
> option may be subject to a race condition, if another thread removes
> name from sys.modules between the if and the return, but as I'm not
> very familiar (yet) with Python threads, I'm not sure it is a real
> concern here.
>
> And maybe there are other reasons I'm completely missing for preferring
> EAFP over LBYL here?
>
> Thanks in advance for your comments.
>

Guidelines for writing library code may very well be different than for
writing your application.  And if your application is trying to do
something similar with *import*, chances are that it's calling a library
function that already starts with the test against sys.modules.  So if
this is an application question, the answer is probably "don't do either
one, just do the import, checking for the exceptions that it may throw."
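
As a rough sketch of "just do the import" in application code (the
helper name load_optional and the missing module name are made up for
illustration, not anything from the docs example):

```python
import importlib

def load_optional(name):
    """Return the named module, or None if it cannot be imported."""
    try:
        # The import machinery checks sys.modules itself before doing
        # any real work, so no explicit test is needed here.
        return importlib.import_module(name)
    except ImportError:
        return None

json_mod = load_optional("json")                      # a stdlib module
missing = load_optional("no_such_module_hopefully_xyz")  # hypothetical name
```

The caller only has to decide what a failed import means for the
application; the fast path is already handled underneath.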

The difference in cost between the success and failure paths of
try/except isn't nearly as large as one of the other responses might
lead you to believe.  For example, a for loop normally terminates when
its iterator raises StopIteration, and that doesn't convince us to
replace it with a while loop.  Besides, in this case, the except path
effectively includes the entire import, which would completely swamp the
overhead of the raise.
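
To make the for loop remark concrete, here is roughly what a for loop
does, spelled out by hand.  The function name manual_for is only an
illustration; the real loop is implemented inside the interpreter, not
like this.

```python
def manual_for(iterable):
    """Collect the items of an iterable the way a for loop walks it."""
    result = []
    it = iter(iterable)
    while True:
        try:
            item = next(it)
        except StopIteration:
            # This exception is how every iterator signals exhaustion.
            break
        result.append(item)
    return result

items = manual_for("abc")
```

Every completed loop ends with exactly one such exception, and nobody
treats that as a reason to avoid for loops.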

If we assume the question was more generally about EAFP vs. LBYL, and
not just about the case of accessing the system data structure
sys.modules, then the issues change somewhat.

If we do a LBYL, we have to know that we've covered all interesting
cases with our test.  Multithreading is one case where we can get a race
condition.  There are times when we might be able to know either that
there are no other threads, or that the other threads don't mess with
the stuff we're testing.  For example, there are enough problems with
import and threads that we might just have a development policy that (in
this program) we will do all our imports before starting any additional
threads, and that we will never try to unload an import, single threaded
or not.  But for other conditions, we might be affected either by the
system or by other processes within it.  Or even affected by other
asynchronous events over a network.
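
The race can be simulated without real threads.  In this sketch the
dict cache, the function names, and the interfere hook are all
hypothetical; the hook simply stands in for another thread that happens
to run between the test and the lookup.

```python
cache = {"spam": 1}

def lbyl_lookup(key, interfere=None):
    """Look before you leap: test membership, then index."""
    if key in cache:
        if interfere is not None:
            interfere()          # stands in for another thread running here
        return cache[key]        # may raise KeyError despite the test
    return None

def eafp_lookup(key):
    """Ask forgiveness: a single lookup, with the failure handled."""
    try:
        return cache[key]
    except KeyError:
        return None

def remove_spam():
    cache.pop("spam", None)

try:
    lbyl_lookup("spam", interfere=remove_spam)
    lbyl_raised = False
except KeyError:
    lbyl_raised = True           # the test passed, yet the lookup failed

eafp_result = eafp_lookup("spam")   # key is gone by now; handled quietly
```

The LBYL version has a window between the test and the use; the EAFP
version does one lookup and deals with whatever it finds.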

If we do an EAFP, then we have to figure out what exceptions are
possible.  However, adding more exceptions with different treatments is
quite easy, and they don't all have to be done at the same level.  Some
may be left for our caller to deal with.  I think the major thing that
people mind about try/except is that it seems to break up the program
flow.  However, that paradigm grows on you as you get accustomed to it.
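
A small sketch of exceptions handled at different levels.  The
functions read_setting and safe_read and the "port" key are invented
for the example; only json.loads and int are real library calls.

```python
import json

def read_setting(text, key):
    """Parse JSON text and return one setting as an int."""
    data = json.loads(text)      # a ValueError here is left to the caller
    try:
        return int(data[key])
    except KeyError:
        return 0                 # missing key: handled here with a default

def safe_read(text, key):
    """Caller-level wrapper: treats unparseable input as 'no value'."""
    try:
        return read_setting(text, key)
    except ValueError:
        return None

present = safe_read('{"port": "80"}', "port")
absent = safe_read('{}', "port")
garbage = safe_read('not json', "port")
```

The inner function handles the one failure it knows how to recover
from, and lets the other propagate to a level that can decide what to
do about it.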

-- 

DaveA