[Python-ideas] RFC: PEP: Add dict.__version__

Terry Reedy tjreedy at udel.edu
Mon Jan 11 01:04:02 EST 2016


On 1/10/2016 12:23 AM, Chris Angelico wrote:

(in response to Steven's response to my post)

> There's more to it than that. Yes, a dict maps values to values; but
> the keys MUST be immutable

Keys just have to be hashable; only hashes need to be immutable.  By 
default, hashes depend on ids, which are immutable for a particular 
object within a run.

> (otherwise hashing has problems),

only if the hash depends on values that mutate.  Some do.
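
To make the distinction concrete, here is a minimal sketch.  The class 
names Box and ValueHashedBox are made up for illustration.  The first 
keeps the default identity-based hash, so mutating it does not disturb 
the dict; the second hashes on a value that mutates, and on CPython the 
lookup then misses.

```python
class Box:
    """Mutable object that keeps the default identity-based __hash__/__eq__."""

    def __init__(self, value):
        self.value = value


class ValueHashedBox:
    """Mutable object whose hash depends on a value that can change."""

    def __init__(self, value):
        self.value = value

    def __eq__(self, other):
        return isinstance(other, ValueHashedBox) and self.value == other.value

    def __hash__(self):
        return hash(self.value)


# Default hash: mutating the object does not change its hash.
box = Box(1)
by_identity = {box: "found"}
box.value = 2
still_found = by_identity[box] == "found"

# Value-based hash: mutating the key after insertion changes its hash,
# so the dict looks in the wrong slot and no longer finds the entry.
vbox = ValueHashedBox(1)
by_value = {vbox: "found"}
vbox.value = 2
lost = vbox not in by_value
```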

> and this optimization doesn't actually care about the immutability
> of the value.

astoptimizer has multiple optimizations. One is not repeating name 
lookups. This is safe as long as the relevant dicts have not changed. I 
am guessing that you were pointing to this one.
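
A rough sketch of that kind of guard follows.  Plain dicts expose no 
version, so this assumes a hypothetical VersionedDict subclass with a 
made-up 'version' counter; CachedLookup is likewise invented for 
illustration and is not astoptimizer's code.

```python
class VersionedDict(dict):
    """dict subclass with a hypothetical counter bumped on mutation.

    Only __setitem__ and __delitem__ are covered here; update(), pop(),
    clear() and friends would need the same treatment in a real version.
    """

    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self.version = 0

    def __setitem__(self, key, value):
        super().__setitem__(key, value)
        self.version += 1

    def __delitem__(self, key):
        super().__delitem__(key)
        self.version += 1


class CachedLookup:
    """Repeat the real lookup only when the namespace version has changed."""

    def __init__(self, namespace, name):
        self.namespace = namespace
        self.name = name
        self.cached_version = None
        self.cached_value = None
        self.real_lookups = 0

    def get(self):
        if self.cached_version != self.namespace.version:
            self.cached_value = self.namespace[self.name]
            self.cached_version = self.namespace.version
            self.real_lookups += 1
        return self.cached_value


ns = VersionedDict(x=1)
lookup = CachedLookup(ns, "x")
a = lookup.get()   # real lookup
b = lookup.get()   # served from the cache, dict unchanged
ns["x"] = 2        # bumps the version, invalidating the cache
c = lookup.get()   # real lookup again
```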

Another is not repeating the call of a function with a particular value. 
This optimization, in general, is not safe even if dicts have not 
changed.  It *does* care about the nature of dict values -- in 
particular the nature of functions that are dict values.  It is the one 
*I* discussed, and the reason I claimed that using __version__ is tricky.
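
A small illustration of why an unchanged dict is not enough: the 
function below (next_id, a made-up name) is impure, so two calls with 
the same argument return different results even though the namespace 
holding it never changes.

```python
import itertools

_counter = itertools.count(1)


def next_id(_label):
    """Impure: the result depends on hidden state, not just the argument."""
    return next(_counter)


namespace = {"next_id": next_id}
before = dict(namespace)

first = namespace["next_id"]("abc")
second = namespace["next_id"]("abc")

# The dict is unchanged, yet caching the first result would be wrong.
dict_unchanged = namespace == before
results_differ = first != second
```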

Viktor's toy example is conditionally replacing 'len("abc")' (at 
runtime) with '3', where '3' is computed *when the code is compiled*. 
For this, it is crucial that builtin len is pure and immutable.

Viktor is being super careful to not break code.  In response to my 
question, Viktor said astoptimizer uses a whitelist of pure builtins to 
supplement the information supplied by .__version__.  Dict history, as 
summarized by __version__, is not always enough to answer 'is this 
optimization safe?'  The nature of values is sometimes crucially important.
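
As a sketch of how the two kinds of information combine (this is my 
illustration, not astoptimizer's code; PURE_BUILTINS and 
guarded_len_abc are made-up names), the guard below checks the name 
bindings directly by identity instead of through a version counter, 
and relies on a whitelist to say that the bound function is pure.

```python
import builtins

# Hypothetical whitelist: builtins assumed to be pure, captured up front.
PURE_BUILTINS = {"len": builtins.len}


def guarded_len_abc(global_ns, builtin_ns):
    """Return len('abc'), using the precomputed 3 only when that is safe."""
    precomputed = 3  # computed when the code was "compiled"
    not_shadowed = "len" not in global_ns
    not_replaced = builtin_ns.get("len") is PURE_BUILTINS["len"]
    if not_shadowed and not_replaced:
        return precomputed
    # Guard failed: fall back to the ordinary lookup and call.
    func = global_ns["len"] if "len" in global_ns else builtin_ns["len"]
    return func("abc")


fast = guarded_len_abc({}, vars(builtins))
shadowed = guarded_len_abc({"len": lambda s: 99}, vars(builtins))
replaced = guarded_len_abc({}, {"len": lambda s: -1})
```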

However, others might use __version__ *without* thinking through what 
other information is needed.  This is why I think its exposure is a bit 
dangerous.  19 years of experience suggests to me that misuse *will* 
happen.  Viktor just reported that CPython's type already has a 
*private* version count.  The issue of exposing a new internal feature 
is somewhat separate and comes after the decision to add it.

As you know, and even alluded to later in your post, CPython already 
replaces '1 + 1' with '2' at compile time.  Method int.__add__ is pure 
and immutable.  Since it (unlike len) also cannot be replaced or 
shadowed, the replacement can be complete, with '2' put in the code 
object (and .pyc if written), as if the programmer had actually written '2'.

 >>> from dis import dis
 >>> dis('1 + 1')
   1           0 LOAD_CONST               1 (2)
               3 RETURN_VALUE
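
The same contrast can be checked without reading bytecode, assuming 
CPython: the folded result shows up among the code object's constants, 
while the call to len is left as a runtime name lookup.

```python
# Compile both expressions without running them.
folded = compile("1 + 1", "<example>", "eval")
not_folded = compile("len('abc')", "<example>", "eval")

# CPython stores the folded result 2 as a constant of the code object.
two_is_constant = 2 in folded.co_consts

# len is looked up by name at runtime, and 3 is not precomputed.
len_is_name = "len" in not_folded.co_names
three_is_constant = 3 in not_folded.co_consts
```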

JIT compilers depend on the same properties of int, float, and str 
operations, for instance, as well as the fact that unbox(Py object) and 
box(machine value) are inverses, so that unbox(box(temp_machine_value)) 
can be replaced by temp_machine_value.

-- 
Terry Jan Reedy


