[Python-ideas] Preserving **kwargs order (was: Re: OrderedDict literals)

Tue Apr 15 20:56:47 CEST 2014

On Apr 14, 2014, at 21:12, "Franklin? Lee" <leewangzhong+python at gmail.com> wrote:

> > Also, I think you've got packing and unpacking backward. There's no way **unpack can be smart about the type of thing being unpacked, because it explicitly allows for any kind of mapping, supplied by the user, and unpacks its keys as if they were keyword arguments.
> 
> No, I mean that the interpreter can inspect the type of the kwargs during unpacking. Primarily, it can pass along the same type (I know it shouldn't be the same *object*).

So if the object is some hypothetical FrozenDict type like you mentioned below, it should create an instance of that type and then copy the values into it? Or what if someone unpacks a dbm?

For that matter, I've got a custom mapping object that maps a handful of predefined keys to true or false by storing a bitmap; I'm not sure whether I've unpacked this into **kwargs, but it makes sense, it works today, and would work with the OrderedDict proposal, but obviously wouldn't work if you tried to create the same type for kwargs.
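
To make that concrete, here is a minimal sketch of such a mapping. The class name BitmapFlags and its three keys are invented for illustration and are not the actual code described above. It shows that any Mapping can be unpacked with ** today, and that the callee gets a plain dict rather than an instance of the caller's type:

```python
from collections.abc import Mapping

class BitmapFlags(Mapping):
    """Hypothetical mapping: fixed keys -> bool, stored as bits of one int."""
    _KEYS = ("verbose", "debug", "strict")

    def __init__(self, bits=0):
        self._bits = bits

    def __getitem__(self, key):
        try:
            index = self._KEYS.index(key)
        except ValueError:
            raise KeyError(key) from None
        # Bit number `index` of the bitmap holds the flag for this key.
        return bool((self._bits >> index) & 1)

    def __iter__(self):
        return iter(self._KEYS)

    def __len__(self):
        return len(self._KEYS)

def show(**kw):
    # kw is built fresh by the call machinery; it is not a BitmapFlags.
    return kw

flags = BitmapFlags(0b101)
result = show(**flags)
```

There is no sensible way to "create the same type" here: a BitmapFlags cannot hold arbitrary extra keywords, so the call machinery has nothing to build from.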

More importantly, I think you're still missing the point that keyword unpacking and extra-keyword parameters are not directly connected. A **kwargs parameter is not generally used to accept a **mapping argument pack from the caller; if you want that, you just pass the dict as a normal argument. A **kwargs parameter is for collecting extra keyword arguments, and a **mapping is for supplying keyword arguments from an existing mapping. In the special case that both exist, there are no keyword arguments that don't match normal parameters, and there are no keys in the unpacked dict that do match normal parameters. In any other case, they have different contents.
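
A small sketch of the distinction, with made-up function and variable names. The mapping unpacked by the caller and the dict collected by the callee overlap only partly:

```python
def spam(a, b=0, **kw):
    # kw collects only the keywords that matched neither a nor b.
    return kw

d = {"b": 2, "c": 3}

# a comes from an explicit keyword, b from the unpacked mapping;
# kw receives x (explicit keyword) and c (from the mapping).
collected = spam(a=1, x=9, **d)

# No mapping is unpacked at all, yet kw still has contents.
no_unpack = spam(1, x=9)

def eggs(a, b):
    # No **kw parameter at all, yet the caller can still unpack a mapping.
    return a + b

total = eggs(**{"a": 1, "b": 2})
```

Here collected holds x and c, which is neither the caller's d nor a subset of it.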

Look at the examples I gave before. For example, how is this going to preserve keyword order in cases where nobody unpacks a mapping along with the keywords? There is no object to create a new object of the same type.

Or, putting both problems together:

    def spam(**kw):
        pass
    d = HypotheticalFrozenDict(b=1)
    spam(a=0, **d)

What is kw going to receive?
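
For comparison, under today's rules the answer is unambiguous: a new plain dict with both keys. The sketch below uses types.MappingProxyType as a stand-in for the hypothetical frozen type; that substitution is an assumption made only so the example runs, since there is no FrozenDict in the stdlib:

```python
from types import MappingProxyType

def spam(**kw):
    return kw

# Stand-in for HypotheticalFrozenDict(b=1): a read-only mapping.
d = MappingProxyType({"b": 1})

kw = spam(a=0, **d)
kw["c"] = 2  # the callee's dict is mutable and independent of d
```

A "same type as the unpacked mapping" rule would have to produce a read-only mapping that nonetheless contains a, which was never in d.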

Also, even if this worked, you're putting the burden of ordering the keywords on the caller, rather than on the function that actually cares about the keyword order. That's the wrong place to put it. You've solved the case of forwarding delegate methods, partial, etc., but not the everyday case.

> It's also possible that, in the future, it can pass a *view* of the kwargs dictionary *if* the callee would receive the exact same keys (or maybe even more generally), and then only do a deeper copy of the item pointers and order array if either the view or the original is changed.

How would this work?

Once again, what you receive in your extra keywords parameter is a mapping that contains all the keyword arguments that weren't matched to normal parameters and all the items of the unpacked mapping (if any) that weren't matched, but none of the keywords or mapping keys that were matched. A view into the unpacked mapping doesn't help unless you can somehow shadow it to hide some keys (the ones that matched parameters) and add others (the unmatched keywords). In fact, to get the ordered mapping we're looking for, you have to be able to add them _before_ the unpacked keys, but still preserving their own order.
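
As a toy model of that rule (the helper build_extra_kwargs is hypothetical and is not how CPython implements calls), the ordered result would be assembled roughly like this:

```python
from collections import OrderedDict

def build_extra_kwargs(param_names, keywords, unpacked):
    """Toy model of collecting **kw in order; not CPython's actual code.

    param_names: names of the function's normal parameters
    keywords:    list of (name, value) pairs in call order
    unpacked:    the mapping passed with ** (may be empty)
    """
    extra = OrderedDict()
    seen = set()
    # Unmatched explicit keywords come first, in call order.
    for name, value in keywords:
        seen.add(name)
        if name not in param_names:
            extra[name] = value
    # Unmatched items of the unpacked mapping follow, in iteration order.
    for name in unpacked:
        if name in seen:
            raise TypeError("got multiple values for keyword argument %r" % name)
        if name not in param_names:
            extra[name] = unpacked[name]
    return extra

extra = build_extra_kwargs(
    {"a"},
    [("z", 1), ("a", 2)],
    OrderedDict([("y", 3), ("w", 4)]),
)
```

The result has to be a new container in every case where anything is filtered out or added, which is why a view onto the unpacked mapping does not get you there.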

> > Anyway, even if things did work this way, one of Guido's objections is that he writes lots of code that keeps the kwargs dict around and adds to it later. (Presumably he has a method that takes **kw and stores it in an instance attribute, although he didn't give examples.) Using something which is not a dict, but only fakes it just well enough for passing keyword arguments, surely isn't going to satisfy him.
> 
> Well, what exactly do you mean "not a dict"? It won't be a literal dict type, no, but it might be a subclass. It would be better if the InternedDict could be as fast as possible (for the usual case), but I don't think that it would absolutely cripple the idea to make it Mutable 

If it's just a subclass without actually being a subtype (meaning it can do anything a dict can do, including adding arbitrary keys), then all this does is trick Guido's code and prevent it from even having a way to test or document its preconditions. (This is what the Liskov Substitution Principle is all about--and it's even more important in duck-typed languages than static-typed, because isinstance in Python is used only to check subtyping, unlike dynamic_cast in C++.)

And if it actually is a subtype, I don't see how you're going to accelerate it; if it can assign to keys of any type, then it can't assume all its keys are strings.

> ... A-and on the flip side, I put forth that maybe Guido shouldn't be doing that kind of thing in the first place (*gasp*)! If he wanted a copy, he should've explicitly copied it with dict(), instead of relying on behavior that I've never even seen a doc reference for!

I actually agree here. This is my argument against the OrderedDict kwargs PEP having a decorator that gives you a dict instead of an OrderedDict: it's just as easy, and clearer and more explicit, for Guido to save dict(kwargs) as for him to add @unordered_kwargs to the function.
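
A sketch of what the explicit copy looks like, using an invented Widget class (Guido gave no examples, so this is only a guess at the general shape of such code):

```python
class Widget:
    def __init__(self, **kwargs):
        # Explicit copy: whatever mapping type kwargs turns out to be,
        # self.options is a plain dict owned by this instance.
        self.options = dict(kwargs)

    def configure(self, key, value):
        # Adding keys later is safe because options is an ordinary dict.
        self.options[key] = value

w = Widget(color="red")
w.configure("size", 3)
```

This works the same whether kwargs arrives as a dict, an OrderedDict, or something else that supports the mapping protocol.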

> (Not sure if I actually believe that **kwargs should be some subclass of FrozenDict, but maybe someone else does, and they can convince us.)
> 
> > For example, there's plenty of Python 2.x code that fakes keyword-only arguments by calling kwargs.pop; that code still runs in Python 3.4, and I don't think you want to break it in 3.5. So you're effectively going to have to build a full MutableMapping.
> 
> So... you're saying there's a chance we can break it in 3.6, right? No reason not to look to the future if backwards compatibility is what's keeping it from happening now.

Well, I think it would have to go through the normal 3-version deprecation cycle, not just be dropped into the next version.

But, more importantly, a change that breaks backward compatibility takes a much more compelling argument than one that doesn't.
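
For anyone who hasn't seen the idiom at stake, the Python 2 style of faking keyword-only arguments looks roughly like this (the function and option names are made up):

```python
def connect(host, **kwargs):
    # Emulate keyword-only arguments: pop the known options off the
    # kwargs dict, then reject anything that is left over.
    timeout = kwargs.pop("timeout", 10)
    retries = kwargs.pop("retries", 3)
    if kwargs:
        raise TypeError(
            "unexpected keyword arguments: %s" % ", ".join(sorted(kwargs))
        )
    return host, timeout, retries

settings = connect("example.org", timeout=5)
```

Code like this mutates kwargs, so it needs pop and a truth test at minimum, and would break if kwargs were immutable.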

It looks like everyone believes that the simpler OrderedDict proposal (using the already-planned C implementation of OrderedDict and changing nothing else about the code except to copy an unpacked mapping's items in iteration order after the keywords rather than before) will probably be fast enough and small enough except in certain edge cases. A more complicated alternative proposal that's also probably fast enough and small enough except in those edge cases, but breaks those edge cases and others as well in a backward-incompatible way, doesn't seem like a win.

> > You really think so?
> 
> Yes, I believe that it is not impossible that someone knows how to make pointer lookups faster than general hash lookups (a function call) of general dicts. This may be because I haven't looked at the dict implementation, or because the dict implementation could be better (see https://mail.python.org/pipermail/python-dev/2012-December/123028.html), but I'm putting forth the idea because maybe it's not impossible, and someone can make it possible.

The post you're referring to doesn't make lookup any faster; it makes most dicts smaller, and it makes iteration faster, but it has no effect on lookup. Also, notice that it explicitly says it has no effect on the existing optimization for lookup in string-only dicts. So, if you're hoping that it could somehow make string-only dicts faster, you need to read it again.

> > But, if it is, hashing the strings to put them into an interned string table, and then hashing those interned pointers to put into the mini-dict, isn't going to be faster.
> 
> Why not? If an InternedDict is passing args to a newly-created IDict, the hashing could be a single instruction: modulo by 63 (or something). (And then maybe multiple instructions to crawl through a linked list.) And knowing you don't HAVE to call a str.hash is surely better than calling even a cached str.hash, no?

But you have to hash the keywords in the first place, or do something equivalent, to intern them.

Unless you intend to intern another copy of the same strings on every function call?
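
A quick illustration with sys.intern, the existing interning API. Interning means looking the string up by its contents in a table, which as far as I know is itself a dict in CPython, so the string still has to be hashed once to get its canonical pointer:

```python
import sys

# Two equal strings built at runtime; they are not guaranteed to be
# the same object before interning.
a = "".join(["time", "out"])
b = "".join(["time", "out"])

# sys.intern returns the one canonical object for equal strings.
ia = sys.intern(a)
ib = sys.intern(b)
same_object = ia is ib
```

Pointer comparison only becomes available after that lookup has been paid for.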

Also, you have to keep in mind that, on top of any lookups the function body does, the function call machinery also has to effectively look up every parameter and (to catch duplicate-keyword errors) every keyword argument in the unpacked mapping. If those lookups weren't fast enough, then your change doesn't help. If they were, then your change is probably unnecessary.
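
The duplicate-keyword check is easy to see from Python. In this sketch (names are illustrative), the same name arrives both as an explicit keyword and from the unpacked mapping, and the call fails before the function body runs:

```python
def spam(a, **kw):
    return a, kw

duplicate_error = None
try:
    # 'a' is supplied twice: once explicitly, once via the mapping.
    spam(a=1, **{"a": 2})
except TypeError as exc:
    duplicate_error = exc

# Without the clash, the unpacked key simply fills parameter a.
ok = spam(**{"a": 2, "b": 3})
```

Detecting the clash requires checking each unpacked key against the keywords already seen, which is one of the lookups referred to above.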

> Point is, again, I don't know that it won't be faster. You seem to know that it won't be faster, but I don't see you saying that there CAN'T be such cleverness when kwargs are nice and forwarded nicely, even with deletions or additions allowed.

Again, kwargs parameters aren't forwarded from anything, they're built from the unmatched keyword arguments, both normal and unpacked. And unpacking arguments aren't forwarded to anything, they're unpacked into keyword arguments, some of which may be gathered into a kwargs parameter.

If you're only trying to optimize the case of forwarding for delegation, then your idea makes more sense, but I don't think that's the case anyone is looking to optimize.