The Cost of Dynamism (was Re: Python 2.x or 3.x, which is faster?)

Steven D'Aprano steve at pearwood.info
Sun Mar 13 05:39:23 EDT 2016


On Sun, 13 Mar 2016 04:54 am, BartC wrote:

> On 12/03/2016 16:56, Steven D'Aprano wrote:
>> On Sun, 13 Mar 2016 12:42 am, BartC wrote:
>>
>>> Ad-hoc attributes I don't have as much of a problem with, as they can be
>>> handy. But predefined ones also have their points. (For one thing, I
>>> know how to implement those efficiently.)
>>>
>>> However, when you have a function call like this: M.F(), where M is an
>>> imported module, then it is very unlikely that the functions in M are
>>> going to be created, modified, deleted or replaced while the program
>>> runs. [I mean, after the usual process of executing each 'def'
>>> statement.]
>>
>> What do you consider "very unlikely"? And how do you know what people
>> will choose to do?
> 
> Common sense tells you it is unlikely.

Perhaps your common sense is different from other people's common sense. To
me, and many other Python programmers, it's common sense that being able to
replace functions or methods on the fly is a useful feature worth having.
More on this below.

Perhaps this is an example of the "Blub Paradox":

http://www.paulgraham.com/avg.html

Wherever we sit in the continuum of language power, we look *down* at
languages with less power as "crippled" because they lack features we use
all the time, and *up* at languages with more power as piling on a lot of
hairy and weird features that nobody would ever need.


>>> Why then should it have to suffer the same overheads as looking up
>>> arbitrary attributes? And on every single call?
>>
>> Because they *are* arbitrary attributes of the module. There's only one
>> sort of attribute in Python. Python doesn't invent multiple lookup rules
>> for attributes-that-are-functions, attributes-that-are-classes,
>> attributes-that-are-ints, attributes-that-are-strings, and so on. They
>> are all the same.
>>
>> You gain a simpler implementation,
> 
> (Have you tried looking at the CPython sources? I tried last year and
> couldn't make head or tail of them. What was the layout of the PyObject
> struct? I couldn't figure it out, the source being such a mess of
> conditional code and macros within macros.)

Right. Now add *on top of that complexity* the extra complexity needed to
manage not one, but *two* namespaces for every scope: one for "variables"
and one for "functions and methods".
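
To make "one sort of attribute" concrete, here is a quick demonstration
(any module will do, I'll use math): functions, floats, whatever, they are
all fetched by exactly the same lookup.


import math

# Every module attribute lives in the same namespace dict; a
# function gets no special treatment over any other value.
assert math.sqrt is getattr(math, "sqrt")
assert math.pi is getattr(math, "pi")
assert math.sqrt is vars(math)["sqrt"]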

I'm not defending the CPython source. I can't even judge the CPython source.
My ability to read C code is a bit better than "See Spot run. Run, Spot,
Run!" but not that much better. But CPython is a 20+ year language, and the
implementation has no doubt built up some cruft over the years, especially
since many of the implementation details are part of the public C
interface.


[...]
>> In languages where functions are different from other values, you have to
>> recognise ahead of time "some day, I may need to dynamically replace this
>> function with another" and write your code specially to take that into
>> account, probably using some sort of "Design Pattern".
> 
> No it's very easy. In Python terms:
> 
> def f(): return "One"
> def g(): return "Two"
> 
> h=f
> 
> h() returns "One". Later you do h=g, and h() returns "Two". No need for
> f and g themselves to be dynamic. h just needs to be a variable.

You're still not getting the big picture. I didn't say that it was
necessarily difficult to create such a "dynamic function". I said that you
had to realise *ahead of time* that you needed to do so.

Let me see if I can draw you a more complete picture. Suppose I have a
function that relies on (let's say) something random or unpredictable:


import time

def get_data():
    return time.strftime("%S:%H")

Obviously this is a toy function, consider it as a stand-in for something
more substantial. I don't know, maybe it connects to a database, or gathers
data from a distant web site, or interacts with the user.

Now I use `get_data` in another function:

def spam():
    value = get_data().replace(":", "")
    num = int(value)
    return "Spam to the %d" % num


How do I test the `spam` function? I cannot easily predict what the
`get_data` function will return.

In Python, I can easily monkey-patch this for the purposes of testing, or
debugging, by introducing a test double (think of "stunt double") to
replace the real `get_data` function:


import mymodule
mymodule.get_data = lambda: "1:1"
assert mymodule.spam() == "Spam to the 11"


This "test double" is sometimes called a stub, or a mock, or a fake.
Whatever you call it, it is *trivially* easy in Python.
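
In fact, the standard library ships a helper for exactly this job. A
minimal sketch using unittest.mock (Python 3.3 or later), assuming the
`mymodule` layout above:


from unittest import mock

import mymodule

# patch() swaps in the double, then restores the real get_data
# automatically when the block exits.
with mock.patch("mymodule.get_data", lambda: "1:1"):
    assert mymodule.spam() == "Spam to the 11"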

How would you do it, when functions are constant? You would have to re-write
the module to allow it:


import time

def real_get_data():
    return time.strftime("%S:%H")

replaceable_get_data = real_get_data

def spam():
    value = replaceable_get_data().replace(":", "")
    num = int(value)
    return "Spam to the %d" % num


Now you have two classes of functions: those that can be replaced by test
doubles and those that can't. Those that can be replaced exist in two
almost identical versions: the real, crippled, function, and the special,
mockable, function. Your users have to learn which to use, and why.

If you want to patch or mock something which the module author didn't think
of in advance, you are all out of luck.

Now multiply that by your entire library: potentially hundreds or thousands
of functions.

I once monkey-patched the `len` built-in so I could monitor the progress of
a long-running piece of code that wasn't written to give any feedback. I
wouldn't do that in production, of course, but I was debugging and that was
a nice, simple, clean way to get the result I needed: each time through the
loop, my function called `len` and the patched version printed status so I
could see where it was up to. When I was done, I deleted the monkey-patch,
and never once needed to edit the main function being called inside the
loop.
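
Reconstructed from memory (this isn't the original code), the patch looked
something like this. In Python 3 the built-ins live in the builtins
module:


import builtins

_real_len = builtins.len
calls = 0

def noisy_len(obj):
    # Count the calls and print a progress report now and then,
    # then defer to the real built-in.
    global calls
    calls += 1
    if calls % 1000 == 0:
        print("len() called %d times so far..." % calls)
    return _real_len(obj)

builtins.len = noisy_len
# ... run the long, silent computation here ...
builtins.len = _real_len  # remove the monkey-patch when finished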



>>> When you dabble with lots of little things, then they can add up. To the
>>> point where an insignificant optimisation can become significant.
>>
>> Of course. Reduced runtime efficiency is the cost you pay for the
>> flexibility gained by significant dynamism. It's a trade-off between
>> efficiency, convenience, simplicity, etc. It's quite legitimate for
>> language designers to choose to put that trade-off in different places,
>> or indeed for the trade-off to change over time.
> 
> Maybe the designer(s) of Python didn't know how popular it would get.
> 
> Do you think some of the design decisions would be different now?

I think that there are certainly people in the Python community who would
have done things differently if they had been around back in the beginning.
But I don't think Guido is one of them. He's not averse to people speeding
Python up, but I don't think he would be willing to give up the advantages
for testing and debugging when he could just wait another year for CPU
speeds to increase.



-- 
Steven



