[Python-Dev] Arbitrary non-identifier string keys when using **kwargs

Serhiy Storchaka storchaka at gmail.com
Wed Oct 10 02:48:42 EDT 2018


10.10.18 05:12, Benjamin Peterson wrote:
> On Tue, Oct 9, 2018, at 17:14, Barry Warsaw wrote:
>> On Oct 9, 2018, at 16:21, Steven D'Aprano <steve at pearwood.info> wrote:
>>>
>>> On Tue, Oct 09, 2018 at 10:26:50AM -0700, Guido van Rossum wrote:
>>>> My feeling is that limiting it to strings is fine, but checking those
>>>> strings for resembling identifiers is pointless and wasteful.
>>>
>>> Sure. The question is, do we have to support uses where people
>>> intentionally smuggle non-identifier strings as keys via **kwargs?
>>
>> I would not be in favor of that.  I think it doesn’t make sense to be
>> able to smuggle those in via **kwargs when it’s not supported by
>> Python’s grammar/syntax.
> 
> Can anyone think of a situation where it would be advantageous for an implementation to reject non-identifier string kwargs? I can't.

I can. The space of identifiers is smaller than the space of all 
strings. We need just 6 bits per character for ASCII identifiers and 16 
bits per character for Unicode identifiers. We could use a special kind 
of string for a more compact representation of identifiers. It may even 
be possible to encode all identifiers used in the stdlib and in the 
program as tagged 64-bit pointers. Currently dict has specialized code 
for string keys; it could have a specialization for identifiers (used only 
for keyword arguments, instance dicts, etc). Argument parsing code could 
also exploit the fact that a special hash for short identifiers has no 
collisions, and compare just the hashes.
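
To make the 6-bit idea concrete, here is a rough sketch in pure Python. It is only an illustration and not something CPython does: the alphabet ordering, the 4-bit tag, the 10-character limit and the names pack_identifier and unpack_identifier are all assumptions made up for this example. The 63 characters allowed in ASCII identifiers get codes 1..63, and code 0 is left unused so that names of different lengths never map to the same value.

```python
import string

# Hypothetical 6-bit alphabet: codes 1..63 cover the 63 characters that
# may appear in ASCII identifiers; code 0 is deliberately left unused.
ALPHABET = "_" + string.ascii_letters + string.digits
CODE = {ch: i + 1 for i, ch in enumerate(ALPHABET)}

MAX_CHARS = 10   # 10 characters * 6 bits = 60 payload bits
TAG_BITS = 4     # 60 payload bits + 4 tag bits = 64 bits
TAG = 0b0001     # hypothetical tag value meaning "packed identifier"


def pack_identifier(name):
    """Pack a short ASCII identifier into a 64-bit int, or return None.

    None means the string is empty, too long, starts with a digit, or
    contains a character outside the 63-character alphabet.
    """
    if not name or len(name) > MAX_CHARS or name[0].isdigit():
        return None
    payload = 0
    for ch in name:
        code = CODE.get(ch)
        if code is None:
            return None
        payload = (payload << 6) | code
    return (payload << TAG_BITS) | TAG


def unpack_identifier(word):
    """Recover the identifier from a value produced by pack_identifier."""
    payload = word >> TAG_BITS
    chars = []
    while payload:
        chars.append(ALPHABET[(payload & 0x3F) - 1])
        payload >>= 6
    return "".join(reversed(chars))
```

Because the packing is injective for names of up to 10 characters, two packed values are equal exactly when the names are equal, so such a value could serve as a collision-free hash and keyword matching could compare integers only. A string like "not an identifier" simply fails to pack and would have to take the slow path.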

All this may look far-fetched, but I would not close the door on future 
optimizations.


