[Python-ideas] One obvious way to do interning [Was: Retrieve an arbitrary element from a set without removing it]

Alexander Belopolsky alexander.belopolsky at gmail.com
Mon Oct 26 17:54:37 CET 2009


Changing the subject to reflect branched discussion and forwarding to
python-ideas where it probably belongs.

On Mon, Oct 26, 2009 at 12:02 PM, Terry Reedy <tjreedy at udel.edu> wrote:
> Alexander Belopolsky wrote:
>
>> Here is an alternative idea on how storing interned objects in a set
>> can be supported.  Currently set.add method returns None and has no
>> effect when set already has an object equal to the one being added.  I
>> propose to consider changing that behavior to make set.add return the
>> added object or the set member that is equal to the object being
>> added.  It is unlikely that many programs rely on the return value
>> being None (with doctests being a probable exception), so adding this
>> feature is unlikely to cause much grief.
>
> I had exactly the same idea, but did not post because it violates the
> general rule that mutators return None.

Is there such a rule?  What about set/dict pop?
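
For reference, a minimal sketch of what a few built-in mutators return
today. The variable names are illustrative only; the point is that pop
mutates its container and still returns a useful value, while add and
sort return None:

```python
s = {1}
d = {"k": "v"}
items = [3, 1, 2]

popped_member = s.pop()      # mutates s and returns the removed element
popped_value = d.pop("k")    # mutates d and returns the removed key's value
add_result = s.add(2)        # mutates s and returns None
sort_result = items.sort()   # sorts in place and returns None
```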

> On the other hand, the returned
> value here would not be the mutated collection, so no chaining is possible.

I assume you mean chaining as in s.add(1).add(2), which would be
enabled if s.add(..) returned s.  My proposal would enable a different
type of "chaining" that I would find useful, though I am ready to hear
objections:

v = compute_value()
s.add(v)
# use v

can, with my proposal, be rewritten as v = s.add(compute_value()), with
the added benefit that the v used afterwards is the "canonical" value.
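
The proposed behavior can be approximated in pure Python today.  The
sketch below is only an illustration, not the proposed change to set
itself: InterningSet is a hypothetical class that keeps its members in
a dict mapping each member to itself, so that add can hand back the
object already stored.

```python
class InterningSet:
    """Hypothetical set-like container whose add() returns the canonical member."""

    def __init__(self):
        # Each member is stored as both key and value so that the stored
        # object itself can be retrieved by an equal object.
        self._members = {}

    def add(self, value):
        # setdefault inserts value only if no equal key is present, and in
        # either case returns the object stored for that key.
        return self._members.setdefault(value, value)

    def __contains__(self, value):
        return value in self._members

    def __len__(self):
        return len(self._members)


s = InterningSet()
a = s.add((1, 2))
b = s.add(tuple([1, 2]))  # equal to a, but a separately built object
```

Here b is the very object first stored as a, so later code that uses
the return value of add shares a single canonical instance.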

> And 'add' is clearly intended to change something.
>
Is this an argument for or against the proposal?

> On the other hand, frozensets do not have an add method.

However, PySet_Add "works with frozenset instances (like
PyTuple_SetItem() it can be used to fill-in the values of brand new
frozensets before they are exposed to other code)."
<http://docs.python.org/3.1/c-api/set.html#PySet_Add>

I will experiment with changing PySet_Add to see how much it would
break and whether it would help in implementing set-based
interning of Python strings.


