The Cost of Dynamism (was Re: Pyhon 2.x or 3.x, which is faster?)

Fri Mar 11 16:36:07 EST 2016

On Sat, Mar 12, 2016 at 5:57 AM, BartC <bc at freeuk.com> wrote:
> Anyway, I've listed some of the stumbling blocks I think that Python has in
> making it bit faster: http://pastebin.com/WfUfK3bc
>

Much easier to discuss text that's inline, so I'm taking this out of pastebin.

> Functions are dynamic. That is, you don't know the F in F() is actually a function, even if it was defined a few lines previously, because it could have been rebound to something else. That means a runtime check to ensure that it is one.
>
> * Dynamic functions mean the compiler doesn't know how many parameters a function takes. This needs checking at runtime, and something done about too many or too few arguments.
>
> * The compiler doesn't know the names of the parameters in the function it's calling. So keyword arguments need to be dealt with at runtime.
>
> * The compiler doesn't know about parameter default values, so again this needs to be checked and dealt with at runtime
>

All of these are really one thing: When you compile some code, you
don't know whether F() is still going to be the same object when you
run it. FAT Python is intended to pick this up; it recognizes
constants, and can then perform all the other optimizations you
describe (since functions are themselves immutable, all you need is an
identity check on the function object itself, and everything else you
describe can be done safely).

> * Top-level names (eg. the 'A' in A.B.C), which generally refer to a defined function, class or module, or to a variable, can additionally be deleted or added at runtime. This means a look-up for such names (LOAD_GLOBAL), if they are not local to a function.
>
> * Module names are dynamic. So for a call such as M.F(), M needs to be looked up (M could be a number of things), and then an attribute lookup for F needs to be done before a call can be made. Then you have those other matters above.
 >
> (Static modules and functions would mean that M.F() can be resolved at compile-time. No name or attribute lookup needed, and in fact the function can be an operand of a Call byte-code with no need to load to the stack first.)
>
> * Dynamic modules mean it is not possible to optimise expressions using math.sqrt() for example, as both math and sqrt could have been changed.
>

Not sure what the significance of this is. Since they can be rebound
at run-time, you need a lookup anyway. The only way to avoid that
lookup is to constant-fold them, which would break anything that
*ever* rebinds globals. But again, FAT Python is looking into this
(since the rebinding of globals is unusual). The same thing applies to
module attributes, since your globals are simply the attributes of
your module, and "math.sqrt()" is just a special case of M.F().

> * Variables are dynamic. So you can't for example declare (with a new const feature) 'const rows=50', which would allow you to code 'LOAD_CONST' instead of 'LOAD_GLOBAL', or even to eliminate it completely as some expressions could be folded.
>
> * Enum names are dynamic. There are a number of ways of having enumerations, but they are all going to suffer from the same problem of names possibly being rebound to something else. That means executing LOAD_GLOBAL or doing attribute lookups, instead of LOAD_CONST.
>

Definitely agree with this. Having a way to declare that a name is
"truly constant" would be extremely handy; there currently isn't a
way, and I'm not sure whether FAT Python is looking into this or not.
I think it's more focusing on the situation where something simply has
never been rebound, which is probably the better optimization; but
sometimes it'd be nice to be able to assert that something *will not*
be rebound, to help with self-documenting code. (Note that "constant"
implies both "never rebound" and "immutable". I actually don't care
about the latter; you could, for instance, declare that "sys.path" is
a constant list, which means you're not allowed to replace it with a
different list, but are allowed to append to it or remove from it.)

>  * Without a way of defining compile-time constants, very useful language features such as 'switch' are not practical.

Maybe, but I honestly don't miss 'switch' all that often - and when I
do, it's usually because I want a range. Consider this
semi-hypothetical Pike code:

switch (key)
{
    case 'A'..'Z': case 'a'..'z': return "Letter";
    case 0xFF00..0xFFFF: return "Special key";
    case '.': case ',': case '!': return "Punctuation";
    default: return "Other";
}

With a small number of cases, this wouldn't be too bad in if/elif
form; but as it gets bigger, the value of a switch block becomes
evident. The normal advice in Python is "use a dispatch dictionary",
but that would be impractical here. So this would end up being a
massive if/elif tree, because there's just no better way to do this.

> * Python 3 has decided to have a single 'long' type for integers, instead of separate int (32 or 64-bit) and long types. This gives a noticeable slow-down in programs executing lots of integer code, compared with Python 2.
>

Actually no, this isn't a language problem - the unified type isn't
the problem here. (How can I say this for certain? Because Pike has a
single 'int' type which is a machine word if it can be, and a GMP
arbitrary-precision integer otherwise.) If someone wanted to put in
the work, CPython could have its integer type have, *in itself*, this
optimization; it'd have no externally-visible effect other than
performance.

That said, though: I suspect the reason nobody's gone and done this is
not that it wouldn't be any benefit, but that applications doing a lot
of integer code are usually going to see more benefit from Numpy,
Cython, or PyPy, and it'll be worth making that switch. So the "real
benefit" to general CPython execution wouldn't be as much as you might
think, and nobody's picked up one of those circular tuits.

> * Attribute names are completely arbitrary. Any attribute of any name can be added to a class or class instance at any time, and probably removed too (I don't know much about Python classes). This requires a name lookup. (Other languages may pre-define a fixed set of field and method names, which simplifies the resolving required at runtime.
>

See above regarding modules. This is the same issue again.

> * CPython seems to use a reference-counting scheme for all kinds of objects, including simple value types (int and float for example, although Python 3 appears to have eliminated int as I said above).

Py3 hasn't eliminated int; and yes, all "simple value types" in Python
are still boxed values. This means that there is literally NO value
that you can pass to this function that it won't be able to handle:

def display(obj):
    print("Type:", obj.__class__.__name__)
    print("Attributes available:", len(dir(obj)))
    descr = repr(obj)
    if len(descr) > 80:
        descr = descr[:60] + " ... " + descr[-15:]
    print("Description:")
    print(descr)

Being able to depend on "everything is an object" is a huge benefit in
Python. (BTW, "reference counting" is a red herring here. Not all
Python implementations use refcounting, and it has nothing to do with
this issue; the significance is that even ints and floats are
objects.)

> * The use of conditional imports, conditional def and class statements, using eval() and exec(), even writing source code out at runtime before loading that via an import, and conditional rebinding of names to other things, mean that static analysis of source code, to get around some of the issues, is much harder.
>

Again, new attributes can be created dynamically, even of modules,
which makes new globals. FAT Python deals with this by not caring
about the specifics you're talking about, but only asking one
question: "Has this been rebound?".

> * I may be mistaken, but numeric constants such as 'A' (ie. 65 in ASCII) cannot be written in Python. 'A' is a string. Using ord('A') will work, but that's a runtime operation (as 'ord' might have been reassigned as something else. It can't generate LOAD_CONST (65)).
>

You're not mistaken. There are no "character constants" in Python.
(Note that the definition would be Unicode codepoints, rather than
ASCII values.) I don't often miss them, though.

> * I think that being unable to do proper in-place modifying of strings (in the form of s+="a") also  has an effect on optimising. In any case, it has a big effect on the benchmark shown below.
>

Hmm, I would agree, but the other way around: the immutability of
strings is a significant _benefit_ to optimization. Without it,
strings would have to be copied at all sorts of places, lest it be
changed out from under you.

> * Even what appear to be names that are a fixed part of the language, such as 'range', 'int', and the 'ord' mentioned above, are actually identifiers that the user can change. So int(s) might convert a string to a number, or it could be any function call at all.
>

Yet another case of the same problem that FAT Python is trying to
solve - that anything can be rebound, but usually nothing will be.

> * math.sqrt has been mentioned; such basic operations would benefit from being an inherent part of the language that cannot be changed.:

Not sure how fundamental square-rooting really is, but whatever. Yes,
stuff can get rebound, so you can't assume. FAT Python takes the
option of check-then-assume.

> The String Append Benchmark
>
> This is a microbenchmark, but makes use of a technique I use extensively (creating a file for example by growing a string a character at a time).
>
> def test():
>     s=""
>     for i in range(10000000):
>         s+="*"
>     print (len(s))
>
> test()
>
> Both Python 2 and 3 take around 14 seconds to complete this.

Not sure what your computer is, but mine takes half a second. I added
another zero to it, and also "try: range = xrange; except NameError:
pass" so Python 2 wouldn't be penalized by having to construct a
massive list (which doubled the Py2 time, after the length increase).

This is a benchmark that will generally perform TERRIBLY on any
language with an immutable string type. But try, instead, this
version:

try: range = xrange
except NameError: pass
def test():
    s=[]
    for i in range(100000000):
        s.append("*")
    s = "".join(s)
    print (len(s))

test()

For anything other than individual characters, this is likely to
perform better. However, I was unable to see this, because *CPython
does optimize string appends*. It's not a language flaw if an
invisible optimization can eliminate it.

ChrisA