[Python-ideas] Exposing flat bytecode representation to optimizers

Sat Feb 6 14:16:05 EST 2016

On 2016-02-06 11:05:06, "Serhiy Storchaka" <storchaka at gmail.com> wrote:

>On 05.02.16 23:03, Andrew Barnert via Python-ideas wrote:
>>But why not make it even simpler and just have all unpacked
>>instructions be 32-bit? Sure, that means unpacked code arrays are
>>bigger, but it's not like the optimizers are going to be looping over
>>the same array a zillion times and worrying about cache spill (or
>>that the optimizers will be in a hotspot in the first place). Then
>>we've just got an int32*, and a jump to offset 76 is a jump to the 4
>>bytes at bytecode[76] (or, in Python, where we may still have to use
>>a bytes object, it's at worst a jump to bytecode[76<<2]).
>
>My idea was to not add new opcodes for unpacked form and keep unpacked 
>form executable. Thus we have 16-bit LOAD_CONST and 32-bit 
>LONG_LOAD_CONST, but only 16-bit POP_TOP and COMPARE_OP since POP_TOP 
>has no argument and the argument of COMPARE_OP always fits in 8 bits. 
>Unpacked form always uses long variant if it exists.
>
>Alternative variant - always use 32-bit instructions and don't pack 
>them to 8 or 16 bits. This will increase bytecode size by 4/2.73 ≈ 1.5 
>times, but will make some parts of compiler, optimizer and interpreter 
>simpler.
>
If, at some point, we find that 32 bits aren't enough (because we're 
starting to get code objects containing more than 4 million 
instructions), we could then add a 'wide' form with 64-bit instructions. 
Just a thought...
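
To make the fixed-width idea concrete, here is a rough sketch in Python. The layout is my own assumption, since nobody in the thread has pinned one down: the low 8 bits of each little-endian 32-bit word hold the opcode and the upper 24 bits hold the argument. The opcode numbers and the helper names (pack_instruction, instruction_at) are made up for illustration and are not CPython's.

```python
import struct

# Assumed layout (not specified anywhere in this thread): each instruction
# is one little-endian 32-bit word, opcode in the low 8 bits, argument in
# the upper 24 bits.
OPCODE_BITS = 8
ARG_BITS = 24

# Hypothetical opcode number, for illustration only.
LOAD_CONST = 100


def pack_instruction(opcode, arg=0):
    """Encode one instruction as 4 bytes."""
    if not 0 <= opcode < (1 << OPCODE_BITS):
        raise ValueError("opcode must fit in 8 bits")
    if not 0 <= arg < (1 << ARG_BITS):
        raise ValueError("argument must fit in 24 bits")
    return struct.pack("<I", opcode | (arg << OPCODE_BITS))


def instruction_at(bytecode, index):
    """Decode instruction number `index` from a bytes object."""
    # Instruction i starts at byte offset i << 2, as in the quoted text.
    (word,) = struct.unpack_from("<I", bytecode, index << 2)
    return word & 0xFF, word >> OPCODE_BITS


# 100 instructions, where instruction i is LOAD_CONST i.
code = b"".join(pack_instruction(LOAD_CONST, i) for i in range(100))

# A jump to instruction 76 lands on the 4 bytes starting at code[76 << 2].
opcode, arg = instruction_at(code, 76)

# Largest jump target under this assumed layout.
max_target_if_instruction_index = 1 << ARG_BITS    # 16777216
max_target_if_byte_offset = (1 << ARG_BITS) >> 2   # 4194304
```

Under these assumptions, a jump argument counted in instructions can reach about 16 million instructions, while one counted in bytes can reach only about 4 million. Which of the two applies depends on how jump arguments end up being defined.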