Optimising literals away

Stefan Behnel stefan_ml at behnel.de
Wed Sep 1 03:06:00 EDT 2010


MRAB, 31.08.2010 23:53:
> On 31/08/2010 21:18, Terry Reedy wrote:
>> On 8/31/2010 12:33 PM, Aleksey wrote:
>>> On Aug 30, 10:38 pm, Tobias Weber wrote:
>>>> Hi,
>>>> whenever I type an "object literal" I'm unsure what optimisation
>>>> will do
>>>> to it.
>>
>> Optimizations are generally implentation dependent. CPython currently
>> creates numbers, strings, and tuple literals just once. Mutable literals
>> must be created each time as they may be bound and saved.
>>
>>>> def m(arg):
>>>> if arg& set([1,2,3]):
>>
>> set() is a function call, not a literal. When m is called, who knows
>> what 'set' will be bound to? In Py3, at least, you could write {1,2,3},
>> which is much faster as it avoids creating and deleting a list. On my
>> machine, .35 versus .88 usec. Even then, it must be calculated each time
>> because sets are mutable and could be returned to the calling code.
>>
> There's still the possibility of some optimisation. If the resulting
> set is never stored anywhere (bound to a name, for example) then it
> could be created once. When the expression is evaluated there could be
> a check so see whether 'set' is bound to the built-in class, and, if it
> is, then just use the pre-created set.

Cython applies this kind of optimistic optimisation in a couple of other 
cases and I can affirm that it often makes sense to do that. However, 
drawback here: the set takes up space while not being used (not a huge 
problem if literals are expected to be small), and the global lookup of 
"set" still has to be done to determine if it *is* the builtin set type. 
After that, however, the savings should be considerable.

Another possibility: always cache the set and create a copy on access. 
Copying a set avoids the entire eval loop overhead and runs in a C loop 
instead, using cached item instances with (most likely) cached hash values. 
So even that will most likely be much faster than the spelled-out code above.

Stefan




More information about the Python-list mailing list