[Python-ideas] Preserving **kwargs order (was: Re: OrderedDict literals)

Tue Apr 15 23:32:20 CEST 2014

>
> but obviously wouldn't work if you tried to create the same type for
> kwargs.
>

Sorry, I said that completely wrong. I mean that if it's already a special
dict (one that can be optimized, for some definition of "can"), the
interpreter can use a special algorithm for copying from such a dict.
Otherwise, it'd just have to do regular iteration.

More importantly, I think you're still missing the point that keyword
> unpacking and extra-keyword parameters are not directly connected.
>

Rather than missing the point, I don't see any reason why they shouldn't be
made connected.

For example, how is this going to preserve keyword order in cases where
> nobody unpacks a mapping along with the keywords?
>

(and)

> Again, kwargs parameters aren't forwarded from anything, they're built
> from the unmatched keyword arguments, both normal and unpacked.
>

An algorithm-ish: At callpoint, the caller would put explicit keywords into
a new kwdict, then look for **somethingToUnpack. If it finds it, and it's a
special dict, it packs "smartly". Otherwise, it iterates through the pack
and adds it to the kwdict.

The callee gets the kwdict and pops the entries that it makes explicit
(allowing deletion isn't the biggest problem). The leftover kwdict is named
kwargs or whatever.

How would this work?
>

"it can pass a *view* of the kwargs dictionary *if* the callee would
receive the exact same keys"

Your counter is that it wouldn't work if the callee doesn't receive the
exact same keys. In that case, it would do the more general thing I stated
above, whereby it creates a new dict like it does currently, but in a
"smarter" way. It's a separate idea.

then all this does is trick Guido's code and prevent it from even having a
> way to test or document its preconditions.

I didn't understand this. What preconditions?

And if it actually is a subtype, I don't see how you're going to accelerate
> it; it it can assign to keys of any type, then it can't assume all its keys
> are strings.
>

What I suggested was a second array of k-v pairs for newly-added items,
still in order, but with more general keys. If done with ALL new keys, this
might slow down lookup due to looking up the ID (in case it's interned) and
then hashing and looking up the string itself. But maybe this kind of case
(modifying incoming kwargs) is uncommon enough (as in, doesn't happen much
relative to the general case) that the overall program is faster. I don't
know. Again, I'm putting forth the idea for someone who knows better about
the internals to say for sure.

Also, even if this worked, you're putting the burden of ordering the
> keywords on the caller, rather than on the function that actually cares
> about the keyword order. That's the wrong place to put it. You've solved
> the case of forwarding delegate methods, partial, etc., but not the
> everyday case.
>

The everyday case only cares if the actual performance goes down. I don't
see a philosophical reason to ban the possibility of caller-burden
altogether. That's why I'm asking the question: if the ** mechanic is
clever by using the restraints about what is legal to pack/unpack, would it
improve performance to the point that order can be thrown in and overall
performance is STILL better than the status quo?

A more complicated alternative proposal that's also probably fast enough
> and small enough except in those edge cases, but breaks those edge cases
> and others as well in a backward-incompatible way, doesn't seem like a win.
>

It may be confusing, but I'm proposing multiple exclusive possibilities.
- Break those edge cases, because I'm not convinced they should be allowed.
- Don't break those edge cases, but it's okay to slow them down because
everything else will be faster and make up for them.

Breaking the edge cases isn't integral, but it might be desirable for the
language, far off into the future.

And I'm not so much fighting for this to happen (instead of the C
OrderedDict) as trying to flesh out the idea, because I don't *know enough*
to say that it *won't* be better.

The post you're referring to doesn't make lookup any faster; it makes most
> dicts smaller, and it makes iteration faster, but it has no effect on
> lookup. Also, notice that it explicitly says it has no effect on the
> existing optimization for lookup in string-only dicts. So, if you're hoping
> that it could somehow make string-only dicts faster, you need to read it
> again.
>

No. That line you quoted was about why my proposal WOULDN'T help. What I
said was that I thought it could be better than the current use of dict,
and it's possible that it only has that potential because dict isn't as
good as it can be. In other words, maybe it CAN be better than the current
implementation with dict, but instead of putting in the effort to make
this, put the effort into making dict better.

Note, though, that
1. Raymond Hettinger's dict changes WOULD make it ordered, giving us
ordered kwargs as a byproduct.
2. It makes iteration faster, which, as far as I know, should make kwargs
packing and unpacking faster.

But rereading it, I see that it means dicts already have a str-only
optimization. Rather than, "This dict can't be used for your suggested
kwargs optimization," it's, "The current dict is probably already doing
what you're suggesting, so it wouldn't be much of a performance
improvement."

But you have to hash the keywords in the first place, or do something
> equivalent, to intern them.
>

Yes. It'd be hashed once per keyword per function, and once per explicit
keyword in a call (so if a line calls multiple times, it would have to hash
that, I suppose, until the interpreter gets smarter, or if it's already
smart about it), and once per keyword in an expansion from a regular dict.

Also, you have to keep in mind that, on top of any lookups the function
> body does, the function call machinery also has to effectively look up
> every parameter and (to catch duplicate-keyword errors) every keyword
> argument in the unpacked mapping. If those lookups weren't fast enough,
> then your change doesn't help. If they were, then your change is probably
> unnecessary.
>

I don't see the relevance of needing to look up every parameter. What is
the objection?
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/python-ideas/attachments/20140415/d05c454f/attachment-0001.html>