[PATCH] Re: frozenset() without arguments should return a singleton

Stefan Behnel stefan.behnel-n05pAM at web.de
Sat Feb 12 17:21:58 EST 2005



Raymond Hettinger wrote:
> Stefan Behnel wrote:
>> I stumbled over the fact that 'frozenset()' doesn't return a constant but
>> creates a new object every time. Since it is immutable, I wrote to c.l.py
>> that this behaviour is different from what tuple() & Co do.
> 
> It is not quite correct to say that this is what all immutables do:
> 
> 
> >>> x = 500
> >>> y = 600 - 100
> >>> x is y
> False

I know. The same is true for concatenated strings, etc. But whenever an 
immutable object is created directly ('by hand'), it holds. It also holds, 
btw, for tuple() - as opposed to ().
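
A quick way to check this on a given interpreter is sketched below. Object identity for empty immutables is an implementation detail and can differ between Python versions, so the script only reports what it finds. It is written for a current Python 3, and `same_object` is a helper name made up for this illustration.

```python
# Identity of "empty" immutables is an implementation detail, so this
# only reports what the running interpreter does; it asserts nothing.
def same_object(factory):
    """Return True if two argument-less calls yield the very same object."""
    return factory() is factory()

for factory in (tuple, str, frozenset):
    print(factory.__name__, same_object(factory))
```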


> For tuples, it is an optimization of a frequent use case (internally,
> empty tuples are passed around for empty argument lists).

I definitely see the use. When I tried to figure out how to implement the 
patch, I looked at the source of tuple objects and saw that there is quite a 
bunch of cached constants. I'm pretty sure such optimizations are not 
necessary for sets.


> Do you have real use cases for frequent creation of frozenset([])?  I
> would be interested in seeing how the application.  As designed, the
> principal use case for frozensets was in implementing sets of sets.  The
> secondary case was for using sets as dictionary keys.  Neither of those
> use cases entails storing more than one instance of frozenset([]).

I actually use frozensets whenever I know that my set is going to be immutable 
(I thought that was what they were meant for).

And similar to the usage of tuples as replacements for empty lists, I 
definitely pass a frozenset whenever I need a dummy set-like object.

If I know that frozenset() is not constant, I may end up keeping a dummy 
reference around somewhere that is passed instead of a 'new' frozenset(). But 
that is definitely uglier than making frozenset() constant internally.
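
The workaround I mean would look roughly like this. It is only a sketch; `EMPTY_SET`, `find_tags` and the sample data are hypothetical names chosen for the example.

```python
# Workaround sketch: keep one shared empty frozenset and hand it out
# wherever a read-only, set-like dummy is needed.
# EMPTY_SET and find_tags are hypothetical names used for illustration.
EMPTY_SET = frozenset()

def find_tags(index, key):
    """Return the tags stored for key, or the shared empty set."""
    return index.get(key, EMPTY_SET)

index = {"a": frozenset(["x", "y"])}
print(find_tags(index, "a"))
print(find_tags(index, "missing") is EMPTY_SET)
```

It works, but every module that wants the shared object has to import or redefine the constant.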

I actually see the difference between
constant frozenset()
and
constant frozenset([])
and the like. Just imagine things like
frozenset(i for i in [False]*1000 if i)
I wouldn't want a guarantee that that one returns a singleton. I guess that 
check would really make it inefficient: create the object, start adding 
nothing, finish adding nothing, check if anything was added, discard object, 
return singleton. Brrrrr...

But frozenset() should still be the most common case of creating empty sets.

I think the main use case is passing them instead of sets whenever a method 
interface demands a set-like object for read-only access or wants to return an 
empty set. Tuples cannot always replace this.

I started using sets rather frequently in my source and frozenset() tends to 
turn up relatively often.


> I'll ponder your idea for a bit.  As it stands, the patch needs more
> work (to remove the singleton just before Python exits -- see similar
> operations for freelists).

I kinda figured that. Would have been too easy, then... :)


> I don't mind finishing the patch but am not
> sure it is the right thing to do.  It is essentially an optimization of
> one case at the expense of added code and of slightly de-optimizing
> other cases.

Very slightly, I'd say. The only case that eats a couple of instructions at 
creation time is 'new' being called from a subclass. I wouldn't expect many 
people to subclass frozenset - but then, that's just me...
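
In Python terms, the extra work amounts to one type check in 'new'. The following is an emulation under my own assumptions, not the C patch itself; `CachedFrozenSet` and `Sub` are hypothetical classes, and the code targets a current Python 3.

```python
# Emulation sketch of the check; the real patch lives in C.
# CachedFrozenSet and Sub are hypothetical classes for illustration.
class CachedFrozenSet(frozenset):
    _empty = None  # lazily created shared empty instance

    def __new__(cls, *args):
        # Only the exact type called without arguments gets the singleton.
        if cls is CachedFrozenSet and not args:
            if CachedFrozenSet._empty is None:
                CachedFrozenSet._empty = super().__new__(cls)
            return CachedFrozenSet._empty
        # Subclasses and calls with arguments pay for the failed check
        # and then build a fresh object as usual.
        return super().__new__(cls, *args)


class Sub(CachedFrozenSet):
    pass


print(CachedFrozenSet() is CachedFrozenSet())
print(Sub() is Sub())
```

Subclasses must not share the cached object, since instances of a subclass may carry their own attributes.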

Also, I don't know what else needs to be changed and what a difference that 
makes. But if the added code is reasonably limited, I'd vote for it.

Stefan


