[Python-ideas] Add dict.getkey() and set.get()

Oscar Benjamin oscar.j.benjamin at gmail.com
Mon Sep 16 11:16:10 CEST 2013


On 16 September 2013 08:25, Andrew Barnert <abarnert at yahoo.com> wrote:
> On Sep 15, 2013, at 18:39, Terry Reedy <tjreedy at udel.edu> wrote:
>
>> On 9/15/2013 8:30 PM, Steven D'Aprano wrote:
>>> On Sun, Sep 15, 2013 at 09:02:33PM +0100, Oscar Benjamin wrote:
>>>
>>>> I don't know whether this is relying on undefined behaviour but the
>>>> following is O(1) and seems to work:
>>>>
>>>>>>> def canonical_key(d, k):
>>>> ...     k, = {k} & d.keys()
>>>> ...     return k
>>>> ...
>>>
>>> I'm pretty sure that (1) it relies on implementation-specific behaviour,
>>> and (2) it's O(N), not O(1).
>>>
>>> The implementation-specific part is whether & takes the key from the
>>> left-hand or right-hand operand when the keys are equal. For example, in
>>> Python 3.3:
>>>
>>> py> {1} & {1.0}
>>> {1.0}
>>> py> {1} & {1.0, 2.0}
>>> {1}
>>>
>>> And surely it's O(N) -- to be precise, O(M+N) -- because & has to walk
>>> all the keys in both operands?
>>
>> No, just the keys in one of the operands. That can be chosen to be the smaller of the two, making it O(min(M,N)). I believe that is what is happening above: the operands are switched when the first is smaller. I know there was a tracker issue that added this optimization.
>
> But it doesn't happen for dict_keys, because that ends up calling intersection_update on a copy of the left operand, which means it always walks the right operand (in this case the dict_keys itself), instead of calling set_intersection, as it would on two sets.
>
> At any rate, whenever this does what's desired, it does a linear walk of all keys in the dict; conversely, any variant that only walks the single key in {k} ends up returning {k} instead of what you were looking for.

Ah, right you are. That's unfortunate since it means that set
intersection semantics are determined by an optimisation and dict_key
intersection lacks an obvious optimisation (to iterate over the
smaller set).

Note that the optimisation is not incompatible with having a defined
"take keys from the left (or from the right)" semantic since the
set_contains_entry function that performs the lookup can access the
matching key from the same lookup used for the containment test:
http://hg.python.org/cpython/file/95b3efe3d7b7/Objects/setobject.c#l689


Oscar


More information about the Python-ideas mailing list