[Python-ideas] Add dict.getkey() and set.get()
Andrew Barnert
abarnert at yahoo.com
Mon Sep 16 03:34:53 CEST 2013
From: Steven D'Aprano <steve at pearwood.info>
Sent: Sunday, September 15, 2013 5:30 PM
> On Sun, Sep 15, 2013 at 09:02:33PM +0100, Oscar Benjamin wrote:
>
>> I don't know whether this is relying on undefined behaviour but the
>> following is O(1) and seems to work:
>>
>> >>> def canonical_key(d, k):
>> ... k, = {k} & d.keys()
>> ... return k
>> ...
>
> I'm pretty sure that (1) it relies on implementation-specific behaviour,
> and (2) it's O(N), not O(1).
>
> The implementation-specific part is whether & takes the key from the
> left-hand or right-hand operand when the keys are equal. For example, in
> Python 3.3:
>
> py> {1} & {1.0}
> {1.0}
> py> {1} & {1.0, 2.0}
> {1}
Actually, this isn't strictly relevant, because he's calling dict_keys.__rand__, not set.__and__. There's no reason they have to do the same thing—and, in fact, they don't, which is why it works in the first place.
But the larger point is valid: both methods are implementation-specific—and the fact that they produce opposite results is a nice illustration of that.
> And surely it's O(N) -- to be precise, O(M+N) -- because & has to walk
> all the keys in both operands? I suppose technically & could special
> case "one of the operands has length 1" and optimize it, but that too
> would be an implementation detail.
No it doesn't. In fact, it _can't_ walk all the keys in both operands; unless they were sorted (and they aren't), there's no way to make that work. Instead, it does an O(N) or O(M) walk over one operand, calling the other operand's __contains__ method for each.
But again, the larger point is valid: whichever one it returns the values from is obviously the one it's walking. So, in his case, it's calling {k}.__contains__(n) for each key in d.keys(), which is O(N).
More information about the Python-ideas
mailing list