[Python-ideas] several different needs [Explicit variable capture list]

Andrew Barnert abarnert at yahoo.com
Tue Jan 26 23:39:03 EST 2016


On Jan 26, 2016, at 17:23, Jim J. Jewett <jimjjewett at gmail.com> wrote:
> 
>> On Tue, Jan 26, 2016 at 3:59 PM, Andrew Barnert <abarnert at yahoo.com> wrote:
>> On Jan 26, 2016, at 11:40, Jim J. Jewett <jimjjewett at gmail.com> wrote:
> 
>>> (1)  Auxiliary variables
> 
>>>   def f(x, _len=len): ...
> 
>>> This is often a micro-optimization;
> 
>> When _isn't_ it a micro-optimization?
> 
> It can improve readability, usually by providing a useful rename.

OK, but then how could FAT, or any optimizer, help with that?

>> I think if it isn't, it's a very different case, e.g.:
>> 
>>    def len(iterable, _len=len):
>>        if something(iterable): special_case()
>>        else: return _len(iterable)
> 
> I would (perhaps wrongly) still have assumed that was at least
> intended for optimization.

This is how you hook a global or builtin function with special behavior for a special case, when you can't use the normal protocol (e.g., because the special case is a C extension type so you can't monkeypatch it), or when you want to hook it at a smaller scope than builtin. That usually has nothing to do with optimization; it's about adding functionality. Either way, it's not something an optimizer can help with.
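
Concretely, a minimal runnable sketch of that hooking pattern (Blob and special_length are made-up stand-ins for the C extension case):

    class Blob:
        # Stand-in for a C extension type whose __len__ you can't touch.
        def special_length(self):
            return 42

    def len(obj, _len=len):        # shadows builtins.len in this module only
        if isinstance(obj, Blob):
            return obj.special_length()
        return _len(obj)           # everything else goes to the real builtin

    print(len([1, 2, 3]))          # 3, via the captured builtin
    print(len(Blob()))             # 42, via the hook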

>> Obviously non-optimization use cases can't be solved
>> by an optimizer. I think this is really more a special case
>> of your #4 ...
> 
> [#4 was current-value capture]
> 
> I almost never set out to capture a snapshot of the current
> environment's values.  I get around to that solution after being
> annoyed that something else didn't work, but it isn't the original
> intent.  (That might be one reason I sometimes have to stop and think
> about closures created in a loop.)
> 
> The "this shouldn't be in the signature" and "why is something being
> assigned to itself" problems won't go away even if current-value
> capture is resolved.  I suspect current-value capture would even
> become an attractive nuisance that led to obscure bugs when the value
> was captured too soon.

You may be right here. The fact that current-value capture is currently ugly means you only use it when you need to explicitly signal something unusual, or when you have no other choice. Making it nicer could make it an attractive nuisance.
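
(For the record, the loop case being alluded to, with the ugly-but-explicit default-value spelling of current-value capture:

    # Late binding: every closure sees the loop variable's final value.
    late = [lambda: i for i in range(3)]
    print([f() for f in late])     # [2, 2, 2]

    # Current-value capture via the default-value trick.
    early = [lambda i=i: i for i in range(3)]
    print([f() for f in early])    # [0, 1, 2]

The ugliness at least flags that something unusual is going on.)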

>> But, like most micro-optimizations, you should use this
>> only when you really need it. Which means you probably
>> can't count on a general-purpose optimizer that may do it
>> for you, on some people's installations.
> 
> That still argues for not making any changes to the language; I think
> the equivalent of (faster access to unchanged globals or builtins) is
> a better portability bet than new language features.

Sure. I already said that, aside from maybe (and probably not) the loop-capture problem, I don't think anything here actually needs to be solved, so you don't have to convince me. :) When you really need the micro-optimization, which is very rare, you will continue to spell it with the default-value trick. The rest of the time, you don't need any way to spell it at all (and maybe FAT will sometimes optimize things for you, but that's just gravy).

> Alternatively, it might be like const contagion, that ends
> up being applied too often and just adding visual noise.

Const contagion is a C++-specific problem. (Actually, two problems--mixing up lvalue-const and rvalue-const incorrectly, and having half the stdlib and half the popular third-party libraries out there not being const-correct because they're actually C libs--but they're both unique to C++.) Play with D or Swift for a while to see how it can work.

>>> So again, I think something like Victor's FAT optimizer (plus comments
>>> when immutability really is important) is a better long-term solution,
>>> but I'm not as sure as I was for case 1.
> 
>> How could an optimizer enforce immutability, much less signal it?
> 
> Victor's guards can "enforce" immutability by recognizing when it
> fails in practice.

But that doesn't do _anything_ semantically--the code runs exactly the same way as if FAT hadn't done anything, except maybe a bit slower. If that's wrong, it's still just as wrong, and you still have no way of noticing that it's wrong, much less fixing it. So FAT is completely irrelevant here.

>  It can't signal, but comments can ... and
> immutability being semantically important (as opposed to merely useful
> for optimization) is rare enough that I think a comment is more likely
> to be accurate than a type declaration.

Here I disagree completely. Why do we have tuple, or frozenset? Why do dicts only take immutable keys? Why does the language make it easier to build mapped/filtered copies than to mutate in place? Why can immutable objects be shared between threads or processes trivially, while mutable objects need locks for threads and heavy "manager" objects for processes? Mutability is a very big deal.
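
A quick illustration of why that distinction is load-bearing rather than cosmetic:

    d = {(1, 2): 'ok'}             # tuples are immutable, hence hashable
    try:
        d[[1, 2]] = 'nope'         # a list could mutate after insertion,
    except TypeError as e:         # silently corrupting the hash table
        print(e)                   # unhashable type: 'list'

    print({frozenset({1, 2}): 'ok'})   # frozenset works where set can't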

>>> (3)  Persistent storage
> 
>>>   def f(x, _cached_results={}): ...
> 
>>> I still think it might be nice to just have a way of easily opening a
>>> new scope ...
> 
>> You mean to open a new scope _outside_ the function
>> definition, so it can capture the cache in a closure, without
>> leaving it accessible from outside the scope? But then f won't
>> be accessible either, unless you have some way to "return"
>> the value to the parent scope. And a scope that returns
>> something--that's just a function, isn't it?
> 
> It is a function plus a function call, rather than just a function.
> Getting that name (possibly several names) bound properly in the outer
> scope is also beyond the abilities of a call.  

It isn't at all beyond the abilities of defining and calling a function. Here's how you solve this kind of problem in JavaScript:

    var spam = function() {
        var cache = {};
        return function(n) {
            if (cache[n] === undefined) {
                cache[n] = slow_computation(n);
            }
            return cache[n];
        };
    }();

And the exact same thing works in Python:

    def _():
        cache = {}
        def spam(n):
            if n not in cache:
                cache[n] = slow_computation(n)
            return cache[n]
        return spam
    spam = _()

You just rarely do it in Python because we have better ways of doing everything this can do.
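
One of those better ways, for this particular example, is just functools.lru_cache (in the stdlib since 3.2; slow_computation is the same stand-in as above):

    import functools

    def slow_computation(n):       # stand-in for the expensive call above
        return n * n

    @functools.lru_cache(maxsize=None)
    def spam(n):
        return slow_computation(n)

    print(spam(4), spam(4))        # the second call is served from the cache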

>> Meanwhile, a C-style function-static variable isn't really
>> the same thing. Statics are just globals with names nobody
>> else can see. So, for a nested function (or a method) that
>> had a "static cache", any copies of the function would all
>> share the same cache, while one with a closure over a
>> cache defined in a new scope  (or a default parameter value,
>> or a class instance) would get a new cache for each copy.
>> So, if you give people an easier way to write statics, they'd
>> still have to use something else when they want the other.
> 
> And explaining when they want one instead of the other will still be
> so difficult that whichever is easier to write will become an
> attractive nuisance, that would only cause problems under load.

Yes, yet another strike against C-style static variables. But, again, I don't think this was a problem that needed solving in the first place.
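
Just to make the shared-vs-per-copy distinction concrete, a rough sketch (the squaring stands in for real work, and the names are made up):

    # Per-copy cache: each call to the factory re-executes the def,
    # so each returned closure gets its own fresh default dict.
    def make_spam_percopy():
        def spam(n, _cache={}):
            if n not in _cache:
                _cache[n] = n * n
            return _cache[n]
        return spam

    # C-style "static": one dict at module level, shared by every copy.
    _shared_cache = {}
    def make_spam_shared():
        def spam(n):
            if n not in _shared_cache:
                _shared_cache[n] = n * n
            return _shared_cache[n]
        return spam

    a, b = make_spam_percopy(), make_spam_percopy()
    a(3)
    print(3 in b.__defaults__[0])  # False: separate caches
    c, d = make_spam_shared(), make_spam_shared()
    c(3)
    print(3 in _shared_cache)      # True: one cache for all copies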


