Builtin dict should be callable, since a dict defines a function

Bengt Richter bokr at oz.net
Sun Dec 22 10:56:32 EST 2002


On Sat, 21 Dec 2002 12:27:50 -0500, "Terry Reedy" <tjreedy at udel.edu> wrote:

>
>"Bengt Richter" <bokr at oz.net> wrote in message
>news:au22dd$jpi$0 at 216.39.172.122...
>> If I know that a dict will have a _default_ function-style interface
>> as if its class were defined with __call__ = __getitem__, then I can
>> make an anonymous collection of function and dict objects, e.g., a
>> list
>>
>>   fdmix = [f1,f2,d1,d2]
>>
>> and depend on the callable assumption for all elements.
>>
>> Now what power does this give me over the following list?
>>
>>   fdmix2 = [f1,f2,d1.__getitem__, d2.__getitem__]
>>
>> Well, this list does not carry the same information. I have been
>> forced to throw away the possibility of distinguishing and acting
>> differently based on type _if_ that is useful for another use of
>> the list (e.g., passing the collection to different places).
>>
>> I admit that a subclass will do what I want, but there is a
>> performance hit for some reason (see below).
>
>Since I disagree that you made the proper comparison (see below), I
>disagree with the conclusion.
>
>>
>> Note that speed could make a lot of difference if you were passing a
>> function that processed pixels, for example:
>
>For speed, pixel processing should be done in C.
I can do that, but if I am interactively experimenting with visual effects,
it can be faster to use Python than to set up an automatic build
that would let me cycle experiments as quickly in C. Plus Python is more
pleasant to code in ;-)
>
>>  >>> import timefuns
>>  >>> def noop(k): pass
>>  ...
>>  >>> d={'xx':123}
>>  >>> class D(dict): __call__ = dict.__getitem__
>>  ...
>>  >>> dc = D(d)
>>  >>> df = d.__getitem__
>>  >>> dc.__name__ = 'D inst'
>>  >>> calls = [(noop,'xx'),(dc,'xx'),(df,'xx')]
>>  >>> timefuns.timefuns(100000, calls)
>>             timing oh:  0.000015  ratio
>>                  noop:  0.000009   1.00
>>                D inst:  0.000012   1.29
>>           __getitem__:  0.000004   0.48
>>
>> Timing overhead is time measured by t1-t0 in
>>    t0=time.clock()
>>    t1=time.clock()
>> where otherwise there is a call to the function as a single line
>> in between, in the function timing loop.
>>
>> This overhead is subtracted out to get a better relative time
>> between the functions. The ratios are with respect to the first
>> function in the list.
>>
>> Note that d.__getitem__ is almost twice as fast as even just
>> calling a noop function with the same single parameter.
>>
>> So I think there is more than one argument in favor of this simple
>> backwards-compatible change ;-)
>
>I think there is a mis-comparison here.  You are asking that dicts be
>augmented to look like your class D, so that d['a'] == d('a').
I'm not sure what you mean by 'augmented', but the point is to be
more efficient than the class D, not adopt its inefficiencies.
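To spell out the two things being compared (a sketch, reusing the names
from the session above): the subclass D makes a dict instance callable by
aliasing __call__ to __getitem__, while df is just the bound
method-wrapper with no subclass involved.

```python
# Sketch of the two alternatives under discussion (names as in the
# session above): a dict subclass whose __call__ is dict.__getitem__,
# versus the bound method-wrapper d.__getitem__.
class D(dict):
    __call__ = dict.__getitem__

d = {'xx': 123}
dc = D(d)           # callable subclass instance: dc('xx') == dc['xx']
df = d.__getitem__  # bound method-wrapper, no subclass involved

print(dc('xx'), df('xx'))  # both perform the same key lookup
```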

>However, your 'test' above compares d('a') to d.__getitem__('a'), not
>d['a'], avoiding most overhead, and finds the latter faster.  So what?
>If you got the augmentation, it would run at the slower speed, due to
My idea for the change must be different from what you are thinking.

>the need to look up .__call__().  So this result does not support your
>proposal.  I could even see it as an argument against.
>
The part of the timing code that does the calling is

        240 SET_LINENO              41
        243 LOAD_FAST                4 (clock)
        246 CALL_FUNCTION            0
        249 STORE_FAST              12 (t0)

        252 SET_LINENO              42
        255 LOAD_FAST                7 (f)
        258 LOAD_FAST                3 (args)
        261 CALL_FUNCTION_VAR        0
        264 STORE_FAST              13 (dummy)

        267 SET_LINENO              43
        270 LOAD_FAST                4 (clock)
        273 CALL_FUNCTION            0
        276 STORE_FAST              10 (t1)

where f is bound to whatever we are testing and args is the arg tuple; I also do
a dummy store. Are you saying the C interpretation of CALL_FUNCTION_VAR
would be slower to find __call__ than __getitem__ if neither were overridden
and both pointed to the same place in C? I would think it would be hard to measure
a difference.
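Since timefuns itself never appears in the thread, here is a minimal
reconstruction of the technique described above (my own names; it averages
over a loop of n calls with time.perf_counter rather than timing each call
with a time.clock pair, but the overhead-subtraction and ratio reporting
are the same idea):

```python
# Minimal sketch of a timefuns-style harness: measure the bare
# clock-to-clock gap as timing overhead, subtract it from each
# measured per-call time, and report ratios against the first entry.
import time

def timefuns(n, calls):
    # clock-to-clock gap, taken as the per-measurement overhead
    t0 = time.perf_counter()
    t1 = time.perf_counter()
    oh = t1 - t0
    results = []
    for f, arg in calls:
        t0 = time.perf_counter()
        for _ in range(n):
            f(arg)
        t1 = time.perf_counter()
        # average per-call time, overhead subtracted out
        results.append((getattr(f, '__name__', repr(f)),
                        (t1 - t0) / n - oh))
    base = results[0][1]
    for name, t in results:
        ratio = t / base if base else 0.0
        print('%20s: %10.8f  %6.2f' % (name, t, ratio))
    return results
```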

>What you have shown (thank you) is that the exposure of internal
>methods that came with new-style classes lets us speed up dict lookups
>by bypassing the overhead of interpreting  the d[key] syntax, which
Whoa, the syntax isn't interpreted in the timing loop unless we are
compiling source every time. There seems to be more to it (see below).

>has to start with determining the type(d) (seq or dict?).  Replacing
>d[key] with d.__getitem__(key) (where d__getitem__ is looked up just
>once) implicitly 'declares' the type of d within the loop by bypassing
>repeated lookup of its type.

I think that's a wrong conclusion from what I did. I just showed that
d.__getitem__ as a dict method wrapper was faster than getting the same
by dynamically looking up __call__ on the subclass instance every time.

The fair comparison would have been d.__getitem__(k) vs dc.__call__(k),
not d.__getitem__(k) vs dc(k). But d[k] is still faster than either one
in the current (2.2.2 Windows) implementation, it seems to me.
I tried to test it:

 >>> f1 = lambda d={1:2}:d[1]
 >>> f2 = lambda d={1:2}.__getitem__:d(1)
 >>> import dis
 >>> dis.dis(f1)
           0 SET_LINENO               1
           3 LOAD_FAST                0 (d)
           6 LOAD_CONST               1 (1)
           9 BINARY_SUBSCR
          10 RETURN_VALUE
 >>> dis.dis(f2)
           0 SET_LINENO               1
           3 LOAD_FAST                0 (d)
           6 LOAD_CONST               1 (1)
           9 CALL_FUNCTION            1
          12 RETURN_VALUE
 >>> f1.func_defaults
 ({1: 2},)
 >>> f2.func_defaults
 (<method-wrapper object at 0x007DBB50>,)
 >>> ^Z

 [ 7:50] C:\pywk\clp>timefuns.py -L "lambda d=1:d" -L "lambda d={1:2}:d[1]" -L "lambda d={1:2}.__getitem__:d(1)" -n 100000
            timing oh:  0.000013  ratio
             <lambda>:  0.000006   1.00
             <lambda>:  0.000009   1.53
             <lambda>:  0.000012   1.96

For the 1.00 reference base, I added a lambda that has a default arg d and only returns it,
instead of returning d[1] or d(1):

 >>> f = lambda d=1:d
 >>> import dis
 >>> dis.dis(f)
           0 SET_LINENO               1
           3 LOAD_FAST                0 (d)
           6 RETURN_VALUE

so subtracting that out should show the d[1] vs d(1) times without the lambda
call overhead. So d[1] to d(1) seems to be about 1:2 ((9-6):(12-6), or 53:96 from the ratios).

I.e., it appears that when you are dealing with a locally bound dictionary, d[1] is faster than
using the __getitem__ method wrapper, even if you pre-bind that to a local variable (a default arg
in both cases).

It seems that BINARY_SUBSCR gets very quickly to the C PyObject_GetItem subroutine, whereas
the other route is much more involved, judging from a cursory look at ceval.c and friends.
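The same three-way comparison can also be run with the stdlib timeit
module (my own setup; absolute numbers will of course differ from the
2.2.2 figures above, but d[1] typically still comes out fastest of the
three on CPython):

```python
# Compare d[1] (BINARY_SUBSCR path), a pre-bound __getitem__
# method-wrapper, and a callable dict subclass, using timeit.
import timeit

setup = """
d = {1: 2}
g = d.__getitem__
class D(dict):
    __call__ = dict.__getitem__
dc = D(d)
"""

for stmt in ('d[1]', 'g(1)', 'dc(1)'):
    t = timeit.timeit(stmt, setup=setup, number=100_000)
    print('%6s: %.6f' % (stmt, t))
```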

Regards,
Bengt Richter
