Optimising literals away

Lie Ryan lie.1296 at gmail.com
Wed Sep 1 09:25:13 EDT 2010


On 09/01/10 17:06, Stefan Behnel wrote:
> MRAB, 31.08.2010 23:53:
>> On 31/08/2010 21:18, Terry Reedy wrote:
>>> On 8/31/2010 12:33 PM, Aleksey wrote:
>>>> On Aug 30, 10:38 pm, Tobias Weber wrote:
>>>>> Hi,
>>>>> whenever I type an "object literal" I'm unsure what optimisation
>>>>> will do to it.
>>>
>>> Optimizations are generally implementation dependent. CPython currently
>>> creates numbers, strings, and tuple literals just once. Mutable literals
>>> must be created each time as they may be bound and saved.
>>>
>>>>> def m(arg):
>>>>>     if arg & set([1,2,3]):
>>>
>>> set() is a function call, not a literal. When m is called, who knows
>>> what 'set' will be bound to? In Py3, at least, you could write {1,2,3},
>>> which is much faster as it avoids creating and deleting a list. On my
machine, 0.35 versus 0.88 usec. Even then, it must be calculated each time
>>> because sets are mutable and could be returned to the calling code.
>>>
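
As an aside, Terry's point about immutable literals is easy to check in a
CPython session (implementation-specific behaviour, not a language guarantee):

    def f(): return (1, 2, 3)   # tuple of constants, folded into co_consts
    def g(): return [1, 2, 3]   # list literal, rebuilt on every call

    f() is f()   # -> True: every call returns the same cached tuple object
    g() is g()   # -> False: a fresh list each time
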
>> There's still the possibility of some optimisation. If the resulting
>> set is never stored anywhere (bound to a name, for example) then it
>> could be created once. When the expression is evaluated there could be
>> a check to see whether 'set' is bound to the built-in class, and, if it
>> is, then just use the pre-created set.

What if the set is mutated by the function? That would modify the globally
cached set. One way to prevent mutation is to use a frozenset, but from the
back of my mind I recall a discussion that rejected having set literals
produce a frozenset instead of a regular set.
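
To spell out the hazard: if the compiler naively handed every call the same
cached set object, a mutation by one caller would be visible to every later
call. Purely illustrative, not something CPython actually does:

    _cached = {1, 2, 3}        # pretend this is the compiler's cached "literal"

    def m(arg):
        s = _cached            # handed out instead of building a fresh set
        s.discard(arg)         # caller mutates what it thinks is its own set
        return s

    m(1)
    print(_cached)             # {2, 3} -- the "literal" has changed for good

A frozenset would raise AttributeError on the discard() and so could be shared
safely, which is exactly why the cached-constant trick only works cleanly for
immutable values.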

> Cython applies this kind of optimistic optimisation in a couple of other
> cases and I can affirm that it often makes sense to do that. However, there
> is a drawback here: the set takes up space while not being used (not a huge
> problem if literals are expected to be small), and the global lookup of
> "set" still has to be done to determine if it *is* the builtin set type.
> After that, however, the savings should be considerable.
>
> Another possibility: always cache the set and create a copy on access.
> Copying a set avoids the entire eval loop overhead and runs in a C loop
> instead, using cached item instances with (most likely) cached hash
> values. So even that will most likely be much faster than the
> spelled-out code above.
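
For what it's worth, here is roughly what that guard-plus-copy idea would look
like if you wrote it out at the Python level (Python 3 spelling; in reality it
would live in the compiler/eval loop, and all the names below are made up):

    import builtins

    _CACHED = frozenset([1, 2, 3])       # built once, with the code object

    def m(arg):
        if set is builtins.set:          # guard: "set" still means the builtin
            candidates = set(_CACHED)    # C-level copy of pre-hashed items
        else:
            candidates = set([1, 2, 3])  # name was rebound: evaluate as spelled
        return bool(arg & candidates)

The copy is still linear in the size of the literal, but as Stefan notes it
skips the per-element bytecode and most likely reuses the cached hash values,
so it should comfortably beat rebuilding the set from scratch.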

I think that these kinds of optimizations would probably be out of character
for CPython, which values implementation simplicity over small increases in
speed. Sets are not used all that often, and optimizing this particular case
seems prone to creating many subtle issues (especially with multithreading).

In other words, these optimizations make sense for Python implementations
that are heavily geared towards speed (e.g. Unladen Swallow, Stackless Python,
PyPy[1], Cython), but they probably have only a minuscule chance of being
implemented in CPython.

[1] yes, their goal was to be faster than CPython (and faster than the speed
of a photon in a vacuum), though AFAICT they have yet to succeed.


