[Python-Dev] Simple Switch statement

Raymond Hettinger raymond.hettinger at verizon.net
Sun Jun 25 00:49:20 CEST 2006


From what I can see, almost everyone wants a switch statement, though perhaps 
for different reasons.

The main points of contention are 1) a non-ambiguous syntax for assigning 
multiple cases to a single block of code, 2) how to compile variables as 
constants in a case statement, and 3) handling overlapping cases.

Here's a simple approach that will provide most of the benefit without trying to 
overdo it:


    switch f(x):          # any expression is allowed here, but an exception
                          # is raised if the result is not hashable
    case 1: g()           # matches when f(x)==1
    case 2,3 : h()        # matches when f(x) in (2,3)
    case 1: i()           # won't ever match because the first case 1 wins
    case (4,5), 6: j()    # matches when f(x) in ((4,5), 6)
    case "bingo": k()     # matches when f(x) in ("bingo",)
    default:   l()        # matches if nothing else does

Though implemented as a hash table, this would execute as if written:

    fx = f(x)
    hash(fx)
    if fx in (1,):
        g()
    elif fx in (2,3):
        h()
    elif fx in (1,):
        i()
    elif fx in ((4,5), 6):
        j()
    elif fx in ("bingo",):
        k()
    else:
        l()

The result of f(x) should be hashable or an exception is raised.
Case values must be ints, strings, or tuples of ints or strings.
No expressions are allowed in cases.
Since a hash table is used, the fx value must support __hash__ and __eq__,
but should not expect multiple __eq__ tests as in the elif version.
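
For illustration only, the semantics above can be approximated in today's
Python with a dict of handlers. This is a minimal sketch, not the proposed
implementation: build_dispatch and the lambda handlers are hypothetical names
standing in for the compiled switch and for g(), h(), and so on.

```python
def build_dispatch(cases, default):
    """Build a hash table from (values, handler) pairs; the first case wins."""
    table = {}
    for values, handler in cases:
        for value in values:
            # setdefault keeps the earliest handler, so a repeated case
            # value never replaces the one that was listed first.
            table.setdefault(value, handler)

    def dispatch(key):
        hash(key)  # raises TypeError up front if the key is not hashable
        return table.get(key, default)()

    return dispatch


dispatch = build_dispatch(
    [
        ((1,), lambda: "g"),
        ((2, 3), lambda: "h"),
        ((1,), lambda: "i"),         # shadowed by the first case 1
        (((4, 5), 6), lambda: "j"),
        (("bingo",), lambda: "k"),
    ],
    default=lambda: "l",
)
```

Each lookup hashes the key once and goes straight to its handler, rather than
walking the chain of "in" tests shown in the elif version.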

I've bypassed the constantification issue.  That comes up throughout Python
and is not unique to the switch statement.  If someone wants a "static" or
"const" declaration, it should be evaluated separately on its own merits.

At first, I was bothered by not supporting sre style use cases with imported
codes; however, I noticed that sre's imported constants already have values that
correspond to their variable names and that this commonplace approach
makes it easy to write fast switch-case suites:

    def _compile(code, pattern, flags):
        # internal: compile a (sub)pattern
        for op, av in pattern:
            switch op:
                case 'literal', 'not_literal':
                    if flags & SRE_FLAG_IGNORECASE:
                        emit(OPCODES[OP_IGNORE[op]])
                        emit(_sre.getlower(av, flags))
                    else:
                        emit(OPCODES[op])
                        emit(av)
                case 'in':
                    if flags & SRE_FLAG_IGNORECASE:
                        emit(OPCODES[OP_IGNORE[op]])
                        def fixup(literal, flags=flags):
                            return _sre.getlower(literal, flags)
                    else:
                        emit(OPCODES[op])
                        fixup = _identityfunction
                    skip = _len(code); emit(0)
                    _compile_charset(av, flags, code, fixup)
                    code[skip] = _len(code) - skip
                case 'any':
                    if flags & SRE_FLAG_DOTALL:
                        emit(OPCODES[ANY_ALL])
                    else:
                        emit(OPCODES[ANY])
                case 'repeat', 'min_repeat', 'max_repeat':
                    . . .

When the constants are mapped to integers instead of strings, it is no
burden to supply a reverse mapping like we already do in opcode.py.
This commonplace setup also makes it easy to write fast switch-case suites:

    from opcode import opname    # reverse mapping: opcode number -> name

    def calc_jump_statistics(f):
        reljumps = absjumps = 0
        for opcode, oparg in gencodes(f.func_code.co_code):
            switch opname[opcode]:
                case 'JUMP_FORWARD', 'JUMP_IF_FALSE', 'JUMP_IF_TRUE':
                    reljumps += 1
                case 'JUMP_ABSOLUTE', 'CONTINUE_LOOP':
                    absjumps += 1
                . . .
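
For reference, the opcode module exposes both directions of the mapping:
opmap goes from name to number, and opname goes from number back to name.
The sketch below shows the round trip and a name-keyed lookup. GROUPS and
classify are hypothetical names, and 'NOP' is used only because it is a
long-standing opcode name in CPython.

```python
from opcode import opmap, opname

# Hypothetical grouping table keyed by opcode *name* rather than number.
GROUPS = {"NOP": "filler"}


def classify(opcode_number, default="other"):
    """Translate a numeric opcode to its name, then dispatch on the name."""
    return GROUPS.get(opname[opcode_number], default)


nop_number = opmap["NOP"]          # forward mapping: name -> number
nop_name = opname[nop_number]      # reverse mapping: number -> name
```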

So, that is it: my proposal for simple switch statements with a straightforward
implementation, fast execution, simply explained behavior, and applicability to
the most important use cases.



Raymond


P.S. For the sre case, we get a great benefit from using strings.  Since they are
all interned at compile time and have their hash values computed no more than
once, the dispatch table will never have to actually calculate a hash and the
full string comparison will be bypassed because "identity implies equality".
That's nice.  The code will execute clean and fast.  AND we get readability
improvements too.  Not bad.
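
The "identity implies equality" shortcut can be observed in CPython with a
small experiment. This is a sketch under stated assumptions: Key is a
hypothetical class that counts __eq__ calls (strings offer no such hook), and
the behavior shown is that of CPython's dict lookup, which checks object
identity before falling back to __eq__.

```python
class Key:
    """Hypothetical key type that counts how often __eq__ runs."""

    eq_calls = 0

    def __init__(self, name):
        self.name = name

    def __hash__(self):
        return hash(self.name)

    def __eq__(self, other):
        Key.eq_calls += 1
        return isinstance(other, Key) and self.name == other.name


stored = Key("literal")
table = {stored: "handler"}

Key.eq_calls = 0
same_object_result = table[stored]            # same object: identity suffices
calls_for_identical = Key.eq_calls

Key.eq_calls = 0
equal_object_result = table[Key("literal")]   # equal but distinct object
calls_for_distinct = Key.eq_calls
```

Looking up the very same object never reaches __eq__, while an equal but
distinct key has to be compared. Interned strings behave like the first case.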


