Py2.3: Feedback on Sets

Andrew Dalke adalke at mindspring.com
Wed Aug 13 15:30:41 EDT 2003


Raymond Hettinger:
> I've gotten lots of feedback on the itertools module
> but have not heard a peep about the new sets module.

I've only just started to use them.  Plus, I didn't realize
there was a need for feedback.

> * Are you overjoyed/outraged by the choice of | and &
>    as set operators (instead of + and *)?

I read some mention of using "|" instead of "+", so I knew
to use it.  I would have liked +, but not *.  I see the logic
behind *, but & doesn't carry the other connotations that
* has (like [1] * 2, "a"*9).

> * Is the support for sets of sets necessary for your work
>    and, if so, then is the implementation sufficiently
>    powerful?

Necessary is such a strong word.  After all, I've been using
Python for a long time without a set.

It's been fun to use.  I still haven't got the balance right, so
I'll start with a set and then have to back it out because I need
an ordered list instead, or because I need a value and so switch
to a dict.

I really, really like using ImmutableSet as a dictionary key.
The other solution is
  unique_d = {}   # or: dict.fromkeys(data, 1)
  for x in data:
    unique_d[x] = 1
  keys = unique_d.keys()
  keys.sort()     # requires that the elements can be ordered
  dict_key = tuple(keys)

which makes the assumption that my objects can be sorted.
(__lt__ is a stronger requirement than __eq__)
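
As a minimal sketch of the immutable-set-as-key idea: the example
below assumes a modern Python, where the built-in frozenset stands in
for sets.ImmutableSet, and the names `data` and `counts` are made up
for illustration.  Unlike the sorted-tuple key above, it needs only
hashing and equality, so it also works for elements that cannot be
ordered, such as complex numbers.

```python
# frozenset stands in for sets.ImmutableSet (assumption: modern Python).
# Complex numbers are hashable but cannot be sorted, so a
# sorted-tuple key would fail on them.
data = [3+4j, 1+2j, 3+4j]

key = frozenset(data)          # duplicates collapse, order is irrelevant
counts = {key: "seen"}

# Any frozenset with the same elements finds the same entry.
same_key = frozenset([1+2j, 3+4j])
found = counts[same_key]
```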


> * Is there a compelling need for additional set methods like
>    Set.powerset() and Set.isdisjoint(s) or are the current
>    offerings sufficient?

I haven't had enough experience to say.  I haven't needed the
first, and think I just did "if not x&y" for the second.  I could
see choosing an "isdisjoint" for its more explicit description of
intent and its better performance.
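
For comparison, a sketch of the two spellings, assuming a modern
Python in which the built-in set type does provide isdisjoint(); the
sample sets are made up.  Both tests agree, but "not x & y" builds the
whole intersection first, whereas isdisjoint() can stop at the first
shared element.

```python
x = set([1, 2, 3])
y = set([4, 5])
z = set([3, 4])

# Builds the intersection, then tests whether it is empty.
disjoint_via_and = not x & y

# States the intent directly; no intermediate set is created.
disjoint_via_method = x.isdisjoint(y)

overlapping = x.isdisjoint(z)   # x and z share 3
```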

What I did need yesterday was a way to get the elements only
in the first set, only in the intersection, and only in the second

def threeway(st1, st2):
  return st1-st2, st1&st2, st2-st1

It's easy to write - I just didn't like the 3-fold comparisons of
the elements in the sets.
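
One way to cut down the repeated work is sketched below.  This is an
illustration and not part of the sets module: the function name
threeway_onepass is hypothetical, it uses the built-in set of modern
Python, and it assumes both arguments are sets.  It copies st2 once,
then does a single membership test per element of st1.

```python
def threeway_onepass(st1, st2):
    """Return (only in st1, in both, only in st2) as three sets."""
    only1, both = set(), set()
    only2 = set(st2)            # start with a copy of st2
    for x in st1:
        if x in only2:          # shared element: move it to 'both'
            only2.remove(x)
            both.add(x)
        else:
            only1.add(x)
    return only1, both, only2

first, common, second = threeway_onepass(set([1, 2, 3]), set([3, 4]))
```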

> * Does the performance meet your expectations?

Haven't gotten to the stage where I'm worried about performance.

> * Do you care that sets can only contain hashable elements?

Not a bit.  I was using dicts before.

> * How about the design constraint that the argument to most
>    set methods must be another Set (as opposed to any iterable)?

I did think about

  st2 = st1 + ["some", "other", "values"]

but decided I didn't like it, so didn't try it out to see if it worked.
I'm not sure what I don't like about it.  I think it's because sets
lose order, so they aren't *quite* iterator enough to justify working
transparently with iterators.  But IANG ("I am not Guido") and
know that my language intuition is only grade B.
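
For reference, a sketch of how the built-in set type of modern Python
handles this (an assumption: this is not the Python 2.3 sets module).
There the operators insist on a set operand, while the named methods
accept any iterable.

```python
st1 = set(["a", "b"])

# The method form accepts any iterable.
st2 = st1.union(["some", "other", "values"])

# The operator form rejects a list with TypeError.
try:
    st1 | ["some", "other", "values"]
    operator_accepts_list = True
except TypeError:
    operator_accepts_list = False
```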

> * Are the docs clear?  Can you suggest improvements?

I haven't read the docs closely, mostly just AMK's notes, changelog,
and discussions here.  I would like it if help(sets) gave more
information, since I mostly learn interactively.  OTOH, what help
does give is too much.  The signatures for all three classes are
present, which means screen upon screen of __special__
__methods__.  If the top had a summary of operators

  x&y  x.intersection(y)  -- intersection of two sets
  x|y  x.union(y)         -- union of two sets
  ...

or the example code from the docs then it would be more helpful.

> * Are sets helpful in your daily work or does the need arise
>    only rarely?

I'm still experimenting.  E.g., will I use

  unique = dict.fromkeys(data).keys()

or

  import sets
  ...
  unique = list(sets.Set(data))

?  The former fits in well with pre-set Pythonic thought but the
second looks cleaner.
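
One practical difference between the two spellings is ordering.  A
sketch, assuming Python 3.7 or later, where dicts preserve insertion
order and the built-in set replaces sets.Set; the sample data is made
up.

```python
data = ["pear", "apple", "pear", "fig", "apple"]

# dict.fromkeys keeps the first occurrence of each item, in input order.
unique_ordered = list(dict.fromkeys(data))

# A set gives the same elements but promises nothing about their order.
unique_unordered = list(set(data))
```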

> User feedback is essential to determining the future direction
> of sets (whether it will be implemented in C, change API,
> and/or be given supporting language syntax).
>

It took a while to get used to "st.remove(x)" - a couple times
I did "del st[x]", which is what I would do for a dict.  It's
probably appropriate that it not be allowed, as "st[x]" doesn't
and shouldn't work, but again, IANG so I'll just point out
that the del felt natural to me.
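
A sketch of the behaviour described, assuming the built-in set type
of modern Python rather than the 2.3 sets module: "del st[x]" is
rejected with TypeError, remove() raises KeyError for a missing
element, and discard() silently ignores one.

```python
st = set(["a", "b"])

st.remove("a")            # like del d[key]: KeyError if missing
st.discard("zzz")         # no error when the element is absent

try:
    del st["b"]           # sets do not support item deletion
    del_worked = True
except TypeError:
    del_worked = False
```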

                    Andrew
                    dalke at dalkescientific.com





