[Python-Dev] PEP 3103: A Switch/Case Statement

Nick Coghlan ncoghlan at gmail.com
Wed Jun 28 09:56:45 CEST 2006


Guido van Rossum wrote:
> I think we all agree
> that side effects of case expressions is one way how we can deduce the
> compiler's behind-the-scenes tricks (even School Ib is okay with
> this). So I don't accept this as proof that Option 2 is better.

OK, I worked out a side-effect-free example of why I don't like Option 3:

   def outer(cases=None):
       def inner(option, force_default=False):
           if cases is not None and not force_default:
               switch option:
                   case in cases[0]:
                       # case 0 handling
                   case in cases[1]:
                       # case 1 handling
                   case in cases[2]:
                       # case 2 handling
           # Default handling
       return inner

I believe it's reasonable to expect this to work fine - the case expressions 
don't refer to any local variables, and the subscript operations on the 
closure variable are protected by a sanity check to ensure that variable isn't 
None.

There certainly isn't anything in the code above to suggest to a reader that 
the condition attempting to guard evaluation of the switch statement might not 
do its job.

With first-time-execution jump table evaluation, there's no problem - when the 
closure variable is None, there's no way to enter the body of the if
statement, so the switch statement is never executed and the case expressions
are never evaluated. Such functions will still be storing a cell object for
the switch's jump table, but it will always be empty because the code to
populate it never gets a chance to run.

With the out-of-order execution involved in def-time evaluation, however, the
case expressions would always be executed, even though the inner function is
trying to protect them with a sanity check on the value of the closure variable.

Using Option 3 semantics would mean that calling "outer()" given the above 
function definition will give you the rather surprising result "TypeError: 
'NoneType' object is unsubscriptable", with a traceback pointing to the line 
"case in cases[0]:" in the body of a function that hasn't been called, and that 
includes an if statement preventing that line from being reached when 'cases' 
is None.
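
The difference can be approximated in ordinary Python. This is only a sketch
under assumptions: a default argument stands in for def-time evaluation (it is
evaluated when the def statement runs), and a plain list named 'table' stands
in for the hidden jump table cell. Neither is how a real switch statement
would be implemented, and the names outer_def_time and outer_first_use are
made up for the illustration.

```python
def outer_def_time(cases=None):
    # Stand-in for Option 3: the default argument expression is evaluated
    # when this def statement executes, before inner() is ever called.
    def inner(option, force_default=False,
              _table=(cases[0], cases[1], cases[2])):
        if cases is not None and not force_default:
            for index, group in enumerate(_table):
                if option in group:
                    return "case %d" % index
        return "default"
    return inner


def outer_first_use(cases=None):
    table = []  # hypothetical stand-in for the hidden jump table cell

    def inner(option, force_default=False):
        if cases is not None and not force_default:
            if not table:
                # Populated the first time the guarded block is executed.
                table.extend([cases[0], cases[1], cases[2]])
            for index, group in enumerate(table):
                if option in group:
                    return "case %d" % index
        return "default"
    return inner
```

Calling outer_def_time() with no arguments raises TypeError while the def
statement for inner is being executed, so the guard inside inner never gets a
chance to help. outer_first_use() returns a working function whose guard
behaves the way a reader would expect.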

>> When it comes to the question of "where do we store the result?" for the
>> first-execution calculation of the jump table, my proposal is "a hidden
>> cell in the current namespace".
> 
> Um, what do you mean by the current namespace? You can't mean the
> locals of the function containing the switch. There aren't always
> outer functions so I must conclude you mean the module globals. But
> I've never seen those referred to as "the current namespace".

By 'current namespace' I really do mean locals() - the cell objects themselves
would be local variables from the point of view of the currently executing code.

For functions, the cell objects would be created at function definition time;
for code handled via exec-style execution, they'd be created just before 
execution of the first statement begins. In either case, the cell objects 
would already be in locals() before any bytecode gets executed.

It's only the calculation of the cell *contents* that gets deferred until
first execution of the switch statement.
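
Existing closure cells already behave in a similar "created early, filled
later" way, which gives a rough analogy. The sketch below relies on current
CPython behaviour (reading cell_contents on an empty cell raises ValueError);
make_switcher and the dictionary contents are hypothetical names chosen for
the example, and a dict lookup stands in for the jump table.

```python
def make_switcher():
    def switcher(option):
        # 'table' is a free variable, so switcher gets a cell for it.
        return table.get(option, "default")

    cell = switcher.__closure__[0]
    try:
        cell.cell_contents
        was_empty = False
    except ValueError:
        # The cell already exists, but nothing has been stored in it yet.
        was_empty = True

    table = {1: "one", 2: "two"}  # the contents are filled in later
    return switcher, was_empty
```

The cell object is in place as soon as switcher is defined; only its contents
arrive afterwards.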

> So do I understand that the switch gets re-initialized whenever a new
> function object is created? That seems a violation of the "first time
> executed" rule, or at least a modification ("first time executed per
> defined function"). Or am I misunderstanding?

I took it as a given that 'first time execution' had to be per function
and/or invocation of exec - tying caching of expressions that rely on module
globals or closure variables to code objects doesn't make any sense, because
the code object may have different globals and/or closure variables next time
it gets executed.
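
A small illustration of why the code object is the wrong place for the cache,
using plain closures instead of a switch statement (the function names and the
tuples passed in are made up for the example):

```python
def outer(cases):
    def inner(option):
        return option in cases
    return inner


first = outer((1, 2))
second = outer((3, 4))

# Both functions share one code object but see different closure variables.
same_code = first.__code__ is second.__code__
answers = (first(1), second(1))
```

Here same_code is True while the two functions give different answers for the
same argument, so anything cached on the code object would be wrong for at
least one of them.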

I may not have explained my opinion about that very well though, because the 
alternative didn't even seem to be an option.

> But if I have a code object c containing a switch statement (not
> inside a def) with a side effect in one of its cases, the side effect
> is activated each time through the following loop, IIUC:
> 
>  d = {}
>  for i in range(10):
>    exec c in d

Yep. For module and class level code, the caching really only has any
speed benefit if the switch statement is inside a loop.

The rationale for doing it that way becomes clearer if you consider what would 
happen if you created a new dictionary each time through the loop:

   for i in range(10):
       d = {}
       exec c in d
       print d["result"]
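
A runnable variant of that loop, written with the function-call form of exec
and without a switch statement. The names 'base' and 'result' and the compiled
source string are assumptions made purely for illustration:

```python
# 'base' and 'result' are hypothetical names used only for this sketch.
code = compile("result = base * 2", "<demo>", "exec")

results = []
for i in range(3):
    d = {"base": i}   # a fresh namespace on every pass
    exec(code, d)     # function-call spelling of "exec c in d"
    results.append(d["result"])
```

The same code object produces a different result on each pass because its
globals differ each time, so a value computed on one pass cannot safely be
reused on the next.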

> I'm confused how you can first argue that tying things to the function
> definition is one of the main drawbacks of Option 3, and then proceed
> to tie Option 2 to the function definition as well. This sounds like
> by far the most convoluted specification I have seen so far. I hope
> I'm misunderstanding what you mean by namespace.

It's not the link to function definitions that I object to in Option 3, it's
the idea of evaluating the cases at function definition *time*. I believe the
out-of-order execution involved will result in too many surprises when you
start considering surrounding control flow statements that lead to the switch 
statement not being executed at all.

If a switch statement is inside a class statement, a function definition
statement, or an exec statement then I still expect the jump table to be
recalculated every time the containing statement is executed, regardless of
whether Option 2 or Option 3 determines when the case expressions get
evaluated (similarly, reloading a module would recalculate any module level 
jump tables).

And I agree my suggestions are the most involved so far, but I think that's 
because the current description of Option 3 is hand-waving away a couple of 
important issues:
   - how does it deal with module and class level code?
   - how does it deal with switch statements that are inside conditional logic
     where that conditional logic determines whether or not the case
     expressions can be safely evaluated?

(I guess the fact that I'm refining the idea while writing about it doesn't 
really help, either. . .)

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia
---------------------------------------------------------------
             http://www.boredomandlaziness.org

