[Python-ideas] Yet More Unpacking Generalizations (or, Dictionary Literals as lvalues)

Wed Aug 12 15:57:03 CEST 2015

Hi All,

Occasionally I find myself wanting to unpack the values of a dictionary
into local variables of a function.  This most often occurs when
marshalling values to/from some serialization format.

For example:

import json

def do_stuff_from_json(json_dict):
    actual_dict = json.loads(json_dict)
    foo = actual_dict['foo']
    bar = actual_dict['bar']
    # Do stuff with foo and bar.

In the same spirit as allowing sequence unpacking into tuples or lists,
what I'd really like to be able to write is something like:

def do_stuff_from_json(json_dict):
    # Assigns variables in the **values** of the lefthand side by doing
    # lookups of the corresponding keys in the result of the righthand
    # side expression.
    {'foo': foo, 'bar': bar} = json.loads(json_dict)
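For comparison, the closest spelling I know of in current Python uses
operator.itemgetter.  This is only a sketch of a workaround, not the
proposed syntax; the function name is reused from the example above and
the sample JSON string is made up for illustration:

```python
import json
from operator import itemgetter

def do_stuff_from_json(json_dict):
    # itemgetter called with several keys returns a tuple of the
    # looked-up values, which ordinary tuple unpacking then binds.
    foo, bar = itemgetter('foo', 'bar')(json.loads(json_dict))
    return foo, bar

# Sample input invented for this sketch; the extra key is simply ignored.
result = do_stuff_from_json('{"foo": 1, "bar": [2, 3], "extra": null}')
```

This works, but the keys and the target names end up in two separate
lists, which is the readability problem the proposed form avoids.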

Nearly all the arguments in favor of tuple/list unpacking also apply to
this construct.  In particular:

1. It makes the code more self-documenting, in that the left side of the
expression looks more like the expected output of the right side.
2. The construct could be implemented more efficiently by the interpreter
than the equivalent series of subscript assignments, by using a dictionary
analog of the UNPACK_SEQUENCE opcode (e.g. UNPACK_MAP).

An interesting question that falls out of this idea is whether/how we
should handle nested structures. I'd expect the rule to be that something
like:

{'toplevel': {'key1': key1, 'key2': key2}} = value

would desugar into something equivalent to:

TEMP = value['toplevel']
key1 = TEMP['key1']
key2 = TEMP['key2']
del TEMP

while something like

{'toplevel': (x, y)} = value

would desugar into something like:

(x, y) = value['toplevel']
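
To make the nesting rule concrete, here is a rough sketch of the same
recursion as a plain function.  The helper name `unpack` and the
convention of writing target names as strings are my own assumptions for
illustration; the real construct would bind local variables directly
instead of returning a dict:

```python
def unpack(pattern, value):
    """Return {target_name: bound_value} for a nested pattern.

    Hypothetical helper: a str is a target name, a dict pattern does
    key lookups, and a tuple/list pattern does positional unpacking.
    """
    if isinstance(pattern, str):
        return {pattern: value}

    bindings = {}
    if isinstance(pattern, dict):
        for key, subpattern in pattern.items():
            # A missing key raises KeyError, like value[key] would.
            bindings.update(unpack(subpattern, value[key]))
    elif isinstance(pattern, (tuple, list)):
        items = list(value)
        if len(items) != len(pattern):
            raise ValueError('wrong number of values to unpack')
        for subpattern, item in zip(pattern, items):
            bindings.update(unpack(subpattern, item))
    else:
        raise TypeError('unsupported pattern: %r' % (pattern,))
    return bindings

nested = unpack({'toplevel': {'key1': 'key1', 'key2': 'key2'}},
                {'toplevel': {'key1': 1, 'key2': 2}})
mixed = unpack({'toplevel': ('x', 'y')}, {'toplevel': (3, 4)})
```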

At the bytecode level, I'd expect this to be implemented with a new
instruction, analogous to the current UNPACK_SEQUENCE, which would pop N
keys and a map from the stack, and push map[key] onto the stack for each
popped key.  We'd then recurse through the values left on the stack,
storing them as we would store the sub-lvalues if they were in a
standard assignment.  Thus the code for something like:

{'name': name, 'tuple': (x, y), 'dict': {'subkey': subvalue}} = values

would translate into the following "pseudo-bytecode":

LOAD_NAME 'values'  # Push rvalue onto the stack.
LOAD_CONST 'dict'   # Push top-level keys onto the stack.
LOAD_CONST 'tuple'
LOAD_CONST 'name'
UNPACK_MAP 3        # Unpack keys. Pops values and all keys from the stack.
                    # TOS  = values['name']
                    # TOS1 = values['tuple']
                    # TOS2 = values['dict']

STORE_FAST name     # Terminal names are simply stored.

UNPACK_SEQUENCE 2   # Push the two entries in values['tuple'] onto
                    # the stack.
                    # TOS  = values['tuple'][0]
                    # TOS1 = values['tuple'][1]
                    # TOS2 = values['dict']
STORE_FAST x
STORE_FAST y

LOAD_CONST 'subkey' # TOS  = 'subkey'
                    # TOS1 = values['dict']

UNPACK_MAP 1        # TOS = values['dict']['subkey']
STORE_FAST subvalue
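
The stack effects above can be checked with a toy simulation.  This is
only a sketch under my own assumptions: the stack is a Python list whose
last element is TOS, `unpack_map` models the hypothetical UNPACK_MAP
instruction, `unpack_sequence` models the existing UNPACK_SEQUENCE, and
a dict named `names` stands in for the STORE_FAST targets:

```python
def unpack_map(stack, n):
    """Hypothetical UNPACK_MAP n: pop n keys and a map, push map[key]."""
    keys = [stack.pop() for _ in range(n)]  # first popped key was TOS
    mapping = stack.pop()
    # Push in reverse pop order so the value for the key that was on
    # top of the stack ends up on top again.
    for key in reversed(keys):
        stack.append(mapping[key])

def unpack_sequence(stack, n):
    """Model of UNPACK_SEQUENCE n: first item ends up at TOS."""
    items = list(stack.pop())
    if len(items) != n:
        raise ValueError('wrong number of values to unpack')
    for item in reversed(items):
        stack.append(item)

values = {'name': 'n', 'tuple': (1, 2), 'dict': {'subkey': 'sv'}}
names = {}

stack = [values, 'dict', 'tuple', 'name']  # LOAD_NAME + 3x LOAD_CONST
unpack_map(stack, 3)                       # UNPACK_MAP 3
names['name'] = stack.pop()                # STORE_FAST name
unpack_sequence(stack, 2)                  # UNPACK_SEQUENCE 2
names['x'] = stack.pop()                   # STORE_FAST x
names['y'] = stack.pop()                   # STORE_FAST y
stack.append('subkey')                     # LOAD_CONST 'subkey'
unpack_map(stack, 1)                       # UNPACK_MAP 1
names['subvalue'] = stack.pop()            # STORE_FAST subvalue
```

The stack is empty once the last store has run, so the sequence is
balanced.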

I'd be curious to hear others' thoughts on whether this seems like a
reasonable idea.  One open question is whether non-literals should be
allowed as keys in dictionaries (the above still works as expected if the
keys are allowed to be names or expressions; the LOAD_CONSTs would turn
into whatever expression or LOAD_* is necessary to put the necessary value
on the stack). Another question is if/how we should handle extra keys in
the right-hand side of the assignment (my guess is that we shouldn't do
anything special with that case).
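
If the construct desugars to plain subscripting as sketched earlier in
this message, then the behaviour for extra and missing keys falls out of
the existing lookup rules.  A small illustration with made-up data,
assuming that desugaring:

```python
value = {'foo': 1, 'bar': 2, 'extra': 3}

# Equivalent of {'foo': foo, 'bar': bar} = value under the desugaring
# above; the 'extra' key is never looked up, so it is silently ignored.
foo = value['foo']
bar = value['bar']

# A key named on the left but absent on the right fails the same way
# an ordinary subscript does.
try:
    baz = value['baz']
except KeyError as exc:
    missing_key = exc.args[0]
```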

-Scott

P.S. I attempted to post this last night, but it seems to have not gone
through.  Apologies for the double post if I'm mistaken about that.