Tail recursion to while iteration in 2 easy steps

Steven D'Aprano steve+comp.lang.python at pearwood.info
Wed Oct 2 22:34:15 EDT 2013


On Thu, 03 Oct 2013 02:46:53 +0100, MRAB wrote:

> On 03/10/2013 02:39, Dave Angel wrote:
>> On 2/10/2013 21:24, Steven D'Aprano wrote:
>>
>>> On Wed, 02 Oct 2013 18:17:06 -0400, Terry Reedy wrote:
>>>
>>>> CPython core developers have be very conservative about what
>>>> tranformations they put into the compiler. (1,2,3) can always be
>>>> compiled as a constant, and so it is. [1,2,3] might or might not be a
>>>> constant, depending on the context, and no attempt is made to analyze
>>>> that.
>>>
>>> The first sentence of this is correct. The next two don't quite make
>>> sense to me, since I don't understand what you mean by "constant" in
>>> this context. I *think* you might be referring to the LOAD_CONST
>>> byte-code, which in Python 3.3 understands tuples like (1, 2, 3), but
>>> not lists. So a literal (1, 2, 3) gets created at compile-time with a
>>> single LOAD_CONST call:
>>>
>>> py> from dis import dis
>>> py> dis(compile("x = (1, 2, 3)", '', 'exec'))
>>>   1           0 LOAD_CONST               4 ((1, 2, 3))
>>>               3 STORE_NAME               0 (x)
>>>               6 LOAD_CONST               3 (None)
>>>               9 RETURN_VALUE
>>>
>>>
>>> while a literal [1, 2, 3] does not:
>>>
>>>
>>>
>> The difference is that a tuple can be reused, so it makes sense for the
>> comiler to produce it as a const.  (Much like the interning of small
>> integers)  The list, however, would always have to be copied from the
>> compile-time object.  So that object itself would be a phantom, used
>> only as the template with which the list is to be made.
>>
> The key point here is that the tuple is immutable, including its items.

You are both assuming that LOAD_CONST will re-use the same tuple 
(1, 2, 3) in multiple places. But that's not the case, as a simple test 
will show you:


# Python 3.3

py> def f():
...     a = (1, 2, 3)
...     b = (1, 2, 3)
...     return a is b
...
py> f()  # Are the tuples the same object?
False
py> from dis import dis
py> dis(f)
  2           0 LOAD_CONST               4 ((1, 2, 3))
              3 STORE_FAST               0 (a)

  3           6 LOAD_CONST               5 ((1, 2, 3))
              9 STORE_FAST               1 (b)

  4          12 LOAD_FAST                0 (a)
             15 LOAD_FAST                1 (b)
             18 COMPARE_OP               8 (is)
             21 RETURN_VALUE



So even though both a and b are created by the same LOAD_CONST byte-code, 
the object is not re-used (although it could be!) and two distinct tuples 
are created.

Re-use of objects (caching or interning) is an implementation 
optimization, not a language feature. An implementation might choose to 
cache ints, tuples, strings, floats, or none of them at all. That in no 
way affects whether the LOAD_CONST byte-code is used.

In fact, LOAD_CONST may behave differently inside functions than outside: 
in CPython, functions will cache some literals in the function attribute 
__code__.co_consts, and LOAD_CONST may use that, while outside of a 
function, only small ints and identifier-like strings are cached but very 
little else. Other implementations may do differently -- I'm not sure 
whether __code__ is a public language feature or an implementation 
feature, but what goes into co_consts certainly isn't fixed. IronPython 
2.6 doesn't appear to put anything in co_consts except None.

And of course, *all of this is subject to change*, since it is not 
language semantics but implementation details. If HyperPython8.5 
aggressively caches every tuple it can, while SimplePython1.1 uses 
BUILD_TUPLE rather than LOAD_CONST, both are still perfectly compliant 
Python implementations.

(HyperPython probably will require a huge amount of memory, and 
SimplePython will probably be slow, but those are quality of 
implementation issues.)


Aside: a sufficiently smart optimizing compiler could optimize function f 
above to either 

    LOAD_CONST    (True)
    RETURN_VALUE

or

    LOAD_CONST    (False)
    RETURN_VALUE


and still be a correct Python implementation. Since Python the language 
doesn't specify when, if ever, objects should be cached, it could even 
randomly choose between those two options at compile time, or even at 
runtime, and still be correct.


-- 
Steven



More information about the Python-list mailing list