[Types-sig] Low-hanging fruit: recognizing builtins
Guido van Rossum
guido@CNRI.Reston.VA.US
Thu, 16 Dec 1999 10:44:04 -0500
[MAL]
> In the long run it would be better to wrap those module
> globals with write access functions (the write action would
> then be recognized by the optimizer).
Yes, but we need to deal with the current idiom or we'd break too much
code. (When I have to break some valid code, I'd rather do it in an
explicit way, e.g. by adding a keyword, rather than silently changing
working code into non-working code for an obscure reason.)
> I haven't followed the thread too closely, but isn't there
> some way to tell the optimizer which modules to treat at
> what optimization level ? Old modules should only use the
> "safe" caching strategy then while modules compiled with
> full optimization would be caching all read-only globals.
That hasn't been discussed this time around. I think you have
proposed more optimization control in the past; that's still a good
idea.
> BTW, instead of adding oodles of new byte code, how about
> grouping them... e.g. instead of UNARY_LEN, BUILD_RANGE, etc.
> why not have a CALL_BUILTIN which takes an index into
> a predefined set of builtin functions.
>
> The same could be done with some often used constants
> such as None, '', 1, 0: LOAD_SYSTEM_CONST with an index
> into a constants array.
>
> The advantage is that you can easily extend both sets
> of prefetched constants while not adding too many
> new new byte codes to the inner loop.
Good ideas.
> Note that the loop as it is built now is already too large
> for common Intel+compatible based CPUs. Adding even more byte
> codes to the huge single loop would probably result in a
> decrease of CPU cache hits. (I split the Great Switch
> in two switch statements and got some good results out of
> this: the first switch handles often used byte codes while the
> second takes care of the more exotic ones.)
Sigh -- I wish C compilers took care of this. I like a single switch
because it's so simple.
> A note on range() and for: the common usage of
>
> for i in range(const):
> ...
>
> could be compiled into a completely different set of opcodes
> not creating any list or tuple at all. Since the FOR_LOOP
> opcode generates loop integers on each iteration the creation
> of a range tuple or list is not needed. The loop opcode would only
> have to check for the upper bound "const".
Yes, this is what I had in mind.
> I've added a new
> counter type (basically a mutable integer type that allows
> for fast increment and decrement) to simplify this even more.
> For the curious, it's in the old patch:
>
> http://starship.skyport.net/~lemburg/mxPython-1.5.patch.gz
Or there could be something even more ad-hoc (and faster).
--Guido van Rossum (home page: http://www.python.org/~guido/)