Builtin dict should be callable, since a dict defines a function
Bengt Richter
bokr at oz.net
Sun Dec 22 10:56:32 EST 2002
On Sat, 21 Dec 2002 12:27:50 -0500, "Terry Reedy" <tjreedy at udel.edu> wrote:
>
>"Bengt Richter" <bokr at oz.net> wrote in message
>news:au22dd$jpi$0 at 216.39.172.122...
>> If I know that a dict will have a _default_ function-style interface
>> as if its class were defined with __call__ = __getitem__, then I can
>> make an anonymous collection of function and dict objects, e.g., a
>list
>>
>> fdmix = [f1,f2,d1,d2]
>>
>> and depend on the callable assumption for all elements.
>>
>> Now what power does this give me over the following list?
>>
>> fdmix2 = [f1,f2,d1.__getitem__, d2.__getitem__]
>>
>> Well, this list does not carry the same information. I have been
>> forced to throw away the possibility of distinguishing and acting
>> differently based on type _if_ that is useful for another use of
>> the list (e.g., passing the collection to different places).
>>
>> I admit that a subclass will do what I want, but there is a
>performance
>> hit for some reason (see below).
>
>Since I disagree that you made the proper comparison (see below), I
>disagree with the conclusion.
>
>>
>> Note that speed could make a lot of difference if you were passing a
>function
>> that processed pixels for example:
>
>For speed, pixel processing should be done in C.
I can do that, but if I am interactively experimenting with visual effects,
it can be faster to use Python than to set up an automatic build
that would let me cycle experiments as quickly in C. Plus Python is more
pleasant to code in ;-)
>
>> >>> import timefuns
>> >>> def noop(k): pass
>> ...
>> >>> d={'xx':123}
>> >>> class D(dict): __call__ = dict.__getitem__
>> ...
>> >>> dc = D(d)
>> >>> df = d.__getitem__
>> >>> dc.__name__ = 'D inst'
>> >>> calls = [(noop,'xx'),(dc,'xx'),(df,'xx')]
>> >>> timefuns.timefuns(100000, calls)
>> timing oh: 0.000015 ratio
>> noop: 0.000009 1.00
>> D inst: 0.000012 1.29
>> __getitem__: 0.000004 0.48
>>
>> Timing overhead is time measured by t1-t0 in
>> t0=time.clock()
>> t1=time.clock()
>> where otherwise there is a call to the function as a single line
>> in between, in the function timing loop.
>>
>> This overhead is subtracted out to get a better relative time
>between
>> the functions. The ratios are wrt the first function in the list.
>>
>> Note that that d.__getitem__ is almost twice as fast as even just
>> calling a noop function with the same single parameter.
>>
>> So I think there is more than one argument in favor of this simple
>> backwards-compatible change ;-)
>
>I think there is a mis-comparison here. You are asking that dicts be
>augmented to look like your class D, so that d['a'] == d('a').
I'm not sure what you mean by 'augmented', but the point is to be
more efficient than the class D, not adopt its inefficiencies.
>However, your 'test' above compares d('a') to d.__getitem__('a'), not
>d['a'], avoiding most overhead, and finds the latter faster. So what?
>If you got the augmentation, it would run at the slower speed, due to
My idea for the change must be different from what you are thinking.
>the need to look up .__call__(). So this result does not support your
>proposal. I could even see it as an argument against.
>
The part of the timing code that does the calling is
240 SET_LINENO 41
243 LOAD_FAST 4 (clock)
246 CALL_FUNCTION 0
249 STORE_FAST 12 (t0)
252 SET_LINENO 42
255 LOAD_FAST 7 (f)
258 LOAD_FAST 3 (args)
261 CALL_FUNCTION_VAR 0
264 STORE_FAST 13 (dummy)
267 SET_LINENO 43
270 LOAD_FAST 4 (clock)
273 CALL_FUNCTION 0
276 STORE_FAST 10 (t1)
where f is bound to whatever we are testing and args is the arg tuple; I also do
a dummy store. Are you saying the C interpretation of CALL_FUNCTION_VAR
would be slower to find __call__ than __getitem__ if neither were overridden
and both pointed to the same place in C? I would think it would be hard to measure
a difference.
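As a sanity check on that question (my own sketch using the modern timeit module rather than the timefuns helper quoted above; the class mirrors D from the transcript, and globals=globals() is a Python 3.5+ convenience):

```python
import timeit

# Mirrors the class D from the transcript: instances become callable,
# with __call__ delegating straight to dict.__getitem__.
class D(dict):
    __call__ = dict.__getitem__

d = {'xx': 123}
dc = D(d)
df = d.__getitem__  # bound method-wrapper, looked up once

# Time each access style; globals=globals() makes the names above
# visible to the timed statements.
for label, stmt in [("d['xx']", "d['xx']"),
                    ("dc('xx')", "dc('xx')"),
                    ("df('xx')", "df('xx')")]:
    t = timeit.timeit(stmt, globals=globals(), number=100000)
    print("%-9s %.6f" % (label, t))
```

On a current CPython the relative ordering may well differ from the 2.2.2 numbers quoted above, which is part of why re-measuring is worthwhile.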
>What you have shown (thank you) is that the exposure of internal
>methods that came with new-style classes lets us speed up dict lookups
>by bypassing the overhead of interpreting the d[key] syntax, which
Whoa, the syntax isn't interpreted in the timing loop unless we are
compiling source every time. There seems to be more to it (see below).
>has to start with determining the type(d) (seq or dict?). Replacing
>d[key] with d.__getitem__(key) (where d__getitem__ is looked up just
>once) implicitly 'declares' the type of d within the loop by bypassing
>repeated lookup of its type.
I think that's a wrong conclusion from what I did. I just showed that
d.__getitem__ as a dict method wrapper was faster than getting the same
by dynamically looking up __call__ on the subclass instance every time.
The fair comparison would have been d.__getitem__(k) vs dc.__call__(k),
not d.__getitem__(k) vs dc(k). But d[k] is still faster than either one
for the current (2.2.2 Windows) implementation, it looks to me.
I tried to test it:
>>> f1 = lambda d={1:2}:d[1]
>>> f2 = lambda d={1:2}.__getitem__:d(1)
>>> import dis
>>> dis.dis(f1)
0 SET_LINENO 1
3 LOAD_FAST 0 (d)
6 LOAD_CONST 1 (1)
9 BINARY_SUBSCR
10 RETURN_VALUE
>>> dis.dis(f2)
0 SET_LINENO 1
3 LOAD_FAST 0 (d)
6 LOAD_CONST 1 (1)
9 CALL_FUNCTION 1
12 RETURN_VALUE
>>> f1.func_defaults
({1: 2},)
>>> f2.func_defaults
(<method-wrapper object at 0x007DBB50>,)
>>> ^Z
[ 7:50] C:\pywk\clp>timefuns.py -L "lambda d=1:d" -L "lambda d={1:2}:d[1]" -L "lambda d={1:2}.__
getitem__:d(1)" -n 100000
timing oh: 0.000013 ratio
<lambda>: 0.000006 1.00
<lambda>: 0.000009 1.53
<lambda>: 0.000012 1.96
For the 1.00 reference base, I added a lambda that has a default arg d and only returns it
instead of returning d[1] or d(1):
>>> f = lambda d=1:d
>>> import dis
>>> dis.dis(f)
0 SET_LINENO 1
3 LOAD_FAST 0 (d)
6 RETURN_VALUE
so subtracting that out should show the d[1] vs d(1) times without the lambda
call overhead. So d[1] to d(1) seems to be about 1:2 (9-6 : 12-6, or 53:96).
I.e., it appears that when you are dealing with a locally bound dictionary, d[1] is faster than
using the __getitem__ method wrapper, even if you pre-bind that to a local variable (a default
arg in both cases).
It seems that BINARY_SUBSCR gets very quickly to the C PyObject_GetItem subroutine, whereas
the other route is much more involved, judging from a cursory look at ceval.c and friends.
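For what it's worth, that d[1]-vs-wrapper comparison is easy to re-check on a modern Python with timeit (a sketch of my own; the pre-bound getter plays the role of f2's default argument):

```python
import timeit

d = {1: 2}
getter = d.__getitem__  # pre-bound method-wrapper, like f2's default arg

# BINARY_SUBSCR path vs. the call path through the method-wrapper.
t_subscr = timeit.timeit("d[1]", globals=globals(), number=1000000)
t_call = timeit.timeit("getter(1)", globals=globals(), number=1000000)
print("d[1]      %.4f" % t_subscr)
print("getter(1) %.4f" % t_call)
```

The absolute numbers and even the ratio will vary by interpreter version, since the subscript opcode and the call machinery have both changed considerably since 2.2.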
Regards,
Bengt Richter