Interface of the set classes

Alex Martelli aleaxit at yahoo.com
Fri Oct 29 11:29:06 EDT 2004


Pierre Barbier de Reuille <pierre.barbier at cirad.fr> wrote:
   ...
> >>I don't understand why the sets has an interface of mapping object 
> >>without associated value and not an interface of sequence ?
> > 
> > Sequences have order, sets don't.
> 
> Yes, and ? I don't understand why that's a reason ? If you take the C++
> STL, the sets implements the sequence concept but without the impossible
> things to do

C++'s std::set<...> is ordered (and so is STL's, both at the time in
which the C++ standard took inspiration from it, and in its long-since
diverged state).

> ... so if you don't need the order in your list you can 
> switch freely from vector to list to sets depending on the properties or
> efficiency you want.

Nope -- to use std::set<T>, you need to be able to define a < comparator
that obeys the required semantic rules (either as an operator< on T, or
separately in various possible ways); to use std::vector<T>, you don't.
So, you cannot "switch freely" -- there are constraints.

That's because a std::set<...> is based on ordering (a Python's set is
based on hashing, so it imposes different constraints -- namely, that
the items be hashable, rather than having them be comparable).

Moreover, the set of valid operations on a std::vector<T>, call it V, is
extremely different from the set of valid operations on an std::set<T>,
call it S.  You can write V[3], but you can't write S[3], for example.
Here, too, your assertion that "you can switch freely" is utterly
baffling.

> I don't know why this is different in Python ! 

It's not: neither in Python nor in C++ can you switch freely.

> Using "add" instead of "append" and "update" instead of "extend" does
> not makes sense to me.

Try, in C++'s example above mentioned, calling S.push_back(...), then...
does the inability to use *THAT* make sense to you?  It sure does to
_me_: V has an order based on insertion, so "pushing a new item at the
back" makes sense (though Python's 'append' terminology is clearer).  S
doesn't (no order in Python, order based on item values in C++), so it
makes no sense to even think of "pushing onto the back" (or equivalently
"appending").  Basically, the commonality between S and V is: you can
loop on either (net of order &c issues).  And that holds in C++ as well
as in Python.

> It's even worse because Python rely more on 
> interface than in inheritence ... so making classes similar helps a lot
> reusability.

Not when the semantics are too different, as in the case of S and V.

> >>Then, why can't you at all access sets element by position ? As Python
> >>iterator are so poor (mainly, you cannot copy them ...) you have things
> > 
> > That's what itertools.tee is for (in Python 2.4).
> 
> Oky, thanks ^_^ So I should not need the __getitem__ anymore in sets.

Well, you surely didn't get operator[] on C++'s std::set<...>s, that's
for sure...

> > It would be false (in 2.4) -- set is a builtin type, side by side with
> > list and dict, not relying on either of them.
> 
> Yes, I know it will become a built-ins type ! But that does not explain

It is one now -- just download 2.4.

> this interface incompatible with lists. I would expect maximum 
> compatibility to be able to exchange the types. Even compatibility 
> between dictionnary and list is usefull ... because at some point you
> can consider a dictionnary as a list of tuples. And then, too bad you
> cannot because of simple design decisions.

You make claims, above, about the container templates in C++'s standard
library, that leave me in doubt.  The very fact that you make strong
assertions without even hedging them in any way appears to indicate you
are experienced with those templates and familiar with them -- yet what
you say is false, appearing to indicate you aren't.  So, I don't know if
I can take for granted all the rich conceptual framework of generic
programming, and just point to the fact that the interface differences
between Python's sequences, mappings and sets reflect exactly their
conceptual semantic differences -- but then, that should be prima facie
obvious!; or, do I need to try to forget the fact that you ever
mentioned C++, and try to teach and explain both the conceptual
underpinnings and the practical implications of generic programming (or
more generally signature-based polymorphism) in purely Pythonic terms.

You can probably clarify your background, and therefore which approach
would help you most, more effectively, than I can just "stab in the
dark" at it.  Do you know generic programming, and the offerings in
C++'s standard library, solidly and in-depth?  If so, do you feel that
C++'s standard library offers you the right amount of polymorphism among
vectors, sets and maps, while you perceive a lesser amount among
Python's lists, sets and dicts, and in what aspects precisely?  Whether
this is the case, or not, can you present a simplified problem where you
feel there isn't enough polymorphism for your purposes?  I may perhaps
be able to offer help in any case, both practical and theoretical, but
there is such a huge gulf among the various possibilities here that I
think it's probably better for you to explain exactly where you come
from and what you're after, first -- a theoretical background to help
you understand the rationale behind the choices, a practical approach to
use the best solutions for specific problems, etc, etc...!


> No, what I want is not a new container type. I just want to be able to
> change the container I use without changing all my codes ! It's too 

It depends on what operations you're doing on that container, of course.

> expensive. At the end, I finish using sets and hashsets from the STL 
> exported to Python as sequences. The efficiency is the same and it just
> works great :) I just feel bad not using the "standard" sets class ...

But -- there IS no operator[] in the C++ standard library's sets!  And
hashset (spelled with an underscore, I believe) is indeed a SGI
extension (so, in STL stricto sensu), but it's not in the C++ standard,
and AFAIK doesn't support indexing either...


Alex



More information about the Python-list mailing list