Interface of the set classes

Pierre Barbier de Reuille pierre.barbier at cirad.fr
Fri Oct 29 12:19:58 EDT 2004


Ok, I'll try to answer and better explain what I want to understand.

Alex Martelli a écrit :
> Pierre Barbier de Reuille <pierre.barbier at cirad.fr> wrote:
>    ...
> 
> Nope -- to use std::set<T>, you need to be able to define a < comparator
> that obeys the required semantic rules (either as an operator< on T, or
> separately in various possible ways); to use std::vector<T>, you don't.
> So, you cannot "switch freely" -- there are constraints.

Right ... but take hash_set and you have exactly the same kind of "set". 
In my case, I already had the '<' operator ... so it wasn't a problem. 
Then, you'll argue that you need to be hashable. True, but as I said (I 
don't mean here I was clear about that :o) ), the choice of the 
container usually depend on the data set.

> 
> That's because a std::set<...> is based on ordering (a Python's set is
> based on hashing, so it imposes different constraints -- namely, that
> the items be hashable, rather than having them be comparable).
> 
> Moreover, the set of valid operations on a std::vector<T>, call it V, is
> extremely different from the set of valid operations on an std::set<T>,
> call it S.  You can write V[3], but you can't write S[3], for example.
> Here, too, your assertion that "you can switch freely" is utterly
> baffling.

Yes, that's why I needed __getitem__ only because some things cannot be 
done (in Python 2.3) without.

> 
> 
>>I don't know why this is different in Python ! 
> 
> 
> It's not: neither in Python nor in C++ can you switch freely.

True, but in part ^_^ In C++, you can easyly design algorithm working 
with whatever sequence you have. But for that you'll have to use 
templates, iterators, probably tags, and on. But it's possible ... I 
don't see a way to do it in Python ... and that's my whole point.

> 
> 
>>Using "add" instead of "append" and "update" instead of "extend" does
>>not makes sense to me.
> 
> 
> Try, in C++'s example above mentioned, calling S.push_back(...), then...
> does the inability to use *THAT* make sense to you?  It sure does to
> _me_: V has an order based on insertion, so "pushing a new item at the
> back" makes sense (though Python's 'append' terminology is clearer).  S
> doesn't (no order in Python, order based on item values in C++), so it
> makes no sense to even think of "pushing onto the back" (or equivalently
> "appending").  Basically, the commonality between S and V is: you can
> loop on either (net of order &c issues).  And that holds in C++ as well
> as in Python.

Yes, but use insert_iterator and you can insert with both vector and set 
with the same syntax ... even if, obviously, the position you gave for 
set will be ignored. So :

   inserter(result, result.end()) = 3;

will insert 3 into result, at the end if result is "ordered" (as the 
list in python), or where it has to be for "sorted" or "unordered" 
containers. Genericity in C++ comes from template functions most of the 
time.

> 
> 
>>It's even worse because Python rely more on 
>>interface than in inheritence ... so making classes similar helps a lot
>>reusability.
> 
> 
> Not when the semantics are too different, as in the case of S and V.

Ok, that's something I don't agree ... for me there is a common semantic 
between S and V which is "storing values in a linear container". After 
that, you can put the properties you need in that container, but if I 
need to write an algorithm whose only assumption is "a linear container" 
I want to be able to use whatever linear container I have ... and I 
don't want to reimplement it for each.

> 
> 
>>>>Then, why can't you at all access sets element by position ? As Python
>>>>iterator are so poor (mainly, you cannot copy them ...) you have things
>>>
>>>That's what itertools.tee is for (in Python 2.4).
>>
>>Oky, thanks ^_^ So I should not need the __getitem__ anymore in sets.
> 
> 
> Well, you surely didn't get operator[] on C++'s std::set<...>s, that's
> for sure...

Well, no ... but it's not needed ^_^ And if I really need it you have 
the "advance" template function that do the __getitem__ stuff in the 
most efficient way, whatever your _linear_ container is.

> 
> 
>>>It would be false (in 2.4) -- set is a builtin type, side by side with
>>>list and dict, not relying on either of them.
>>
>>Yes, I know it will become a built-ins type ! But that does not explain
> 
> 
> It is one now -- just download 2.4.
> 
> 
>>this interface incompatible with lists. I would expect maximum 
>>compatibility to be able to exchange the types. Even compatibility 
>>between dictionnary and list is usefull ... because at some point you
>>can consider a dictionnary as a list of tuples. And then, too bad you
>>cannot because of simple design decisions.
> 
> 
> You make claims, above, about the container templates in C++'s standard
> library, that leave me in doubt.  The very fact that you make strong
> assertions without even hedging them in any way appears to indicate you
> are experienced with those templates and familiar with them -- yet what
> you say is false, appearing to indicate you aren't.  So, I don't know if
> I can take for granted all the rich conceptual framework of generic
> programming, and just point to the fact that the interface differences
> between Python's sequences, mappings and sets reflect exactly their
> conceptual semantic differences -- but then, that should be prima facie
> obvious!; or, do I need to try to forget the fact that you ever
> mentioned C++, and try to teach and explain both the conceptual
> underpinnings and the practical implications of generic programming (or
> more generally signature-based polymorphism) in purely Pythonic terms.

So I hope I made my point clearer this time ^_^ I used Python for a bit 
less than 2 years now (so that's not much) and C++ for 5 years with big 
projects in C++ (so that's more). So I'm no reusability specialist, but 
I used a lot the Boost library, where I learned a lot about reusability 
in C++. What I read about Python reusability was based on 
signature-based polymorphism as you call it. But then, it fails with 
lists, sets and dictionnary. And I don't really see any other way to do 
that (but to add methods to the sets ... and I don't want to do so).

> 
> You can probably clarify your background, and therefore which approach
> would help you most, more effectively, than I can just "stab in the
> dark" at it.  Do you know generic programming, and the offerings in
> C++'s standard library, solidly and in-depth?  If so, do you feel that
> C++'s standard library offers you the right amount of polymorphism among
> vectors, sets and maps, while you perceive a lesser amount among
> Python's lists, sets and dicts, and in what aspects precisely?  Whether
> this is the case, or not, can you present a simplified problem where you
> feel there isn't enough polymorphism for your purposes?  I may perhaps
> be able to offer help in any case, both practical and theoretical, but
> there is such a huge gulf among the various possibilities here that I
> think it's probably better for you to explain exactly where you come
> from and what you're after, first -- a theoretical background to help
> you understand the rationale behind the choices, a practical approach to
> use the best solutions for specific problems, etc, etc...!

Ok, so how do you write an algorithm on a container without specifying 
which, knowing that you cannot use the iterators, neither the signature 
of the class, nor the derivation ?

That was all the methods I imagined for Python's genericity. But none of 
them work when you see the base containers.

> 
> 
> 
>>No, what I want is not a new container type. I just want to be able to
>>change the container I use without changing all my codes ! It's too 
> 
> 
> It depends on what operations you're doing on that container, of course.
> 
> 
>>expensive. At the end, I finish using sets and hashsets from the STL 
>>exported to Python as sequences. The efficiency is the same and it just
>>works great :) I just feel bad not using the "standard" sets class ...
> 
> 
> But -- there IS no operator[] in the C++ standard library's sets!  And
> hashset (spelled with an underscore, I believe) is indeed a SGI
> extension (so, in STL stricto sensu), but it's not in the C++ standard,
> and AFAIK doesn't support indexing either...

As I said higher in this mail, that's true, set and hashsets does not 
support operator[], but you can use the "advance" function to emulate 
it. It inefficient with those, but when I have no other choices using 
Python technics ... At least those parts are efficients with lists and 
as inefficient as enything else with sets.

> 
> 
> Alex

Pierre, who hopes to be clearer that time :)



More information about the Python-list mailing list