[Python-Dev] Set options

Ka-Ping Yee ping@lfw.org
Tue, 21 Mar 2000 07:07:51 -0600 (CST)


Jeremy Hylton wrote:
> The problem with a set module is that there are a number of different
> ways to implement them -- in C using kjbuckets is one example.  Each
> approach is appropriate for some applications, but not for every one.

For me, anyway, this is not about trying to engineer a universally
perfect solution into Python -- it's about providing some simple, basic,
easy-to-understand functionality that takes care of the common case.
For example, dictionaries are simple, their workings are easy enough
to understand, and they aren't written to efficiently support things
like inversion and composition because most of the time no one needs
to do these things.

The same holds true for sets.  All i would want is something i can
put things into, and take things out of, and ask about what's inside.

Barry Warsaw wrote:
> It would seem to me that distutils is a better way to go for
> kjbuckets.  The core already has basic sets (via dictionaries).  We're
> pretty much just quibbling about efficiency, API, and syntax, aren't we?

Efficiency: Hashtables have proven quite adequate for dicts, so
i think they're quite adequate for sets.

API and syntax: I believe the goal is obvious, because Python already
has very nice notation ("in", "not in") -- it just doesn't work quite
the way one would want.  It works semantically right on lists, but
they're a little slow.  It doesn't work on dicts, but we can make it so.

Here is where my "explanation metric" comes into play.  How much
additional explaining do you have to do in each case to answer the
question "what do i do when i need a set"?


1.  Use lists.

    Explain that "include()" means "append if not already present",
    and "exclude()" means "remove if present".  You are done.


2.  Use dicts.
    
    Explain that "for x in dict" iterates over the keys, and
    "if x in dict" looks for a key.  Explain what happens when
    you write "{1, 2, 3}", and the special non-printing value
    constant.  Explain how to add elements to a set and remove
    elements from a set.


3.  Create a new type.

    Explain that there exists another type "set" with methods
    "insert" and "remove".  Explain how to construct sets.
    Explain how "in" and "not in" work, where this type fits
    in with the other types, and when to choose this type
    over other types.


4.  Do nothing.

    Explain that dictionaries can be used as sets if you assign
    keys a dummy value, use "del" to remove keys, iterate over
    "dict.keys()", and use "dict.has_key()" to test membership.


This is what motivated my proposal for using lists: it requires
by far the least explanation.  This is no surprise because a lot
of things about lists have been explained already.

My preference in terms of elegance is about equal for 1, 2, 3,
with 4 distinctly behind; but my subjective ranking of "explanation
complexity" (as in "how to get there from here") is 1 < 4 < 3 < 2.


-- ?!ng