set, dict and other structures

Raymond Hettinger vze4rx4y at verizon.net
Mon Jan 31 19:43:21 EST 2005


<bearophileHUGS at lycos.com>
> I'm frequently using Py2.4 sets, I find them quite useful, and I like
> them, even if they seem a little slower than dicts.

Glad to hear that you find them to be a useful abstraction.

Since sets are internally implemented as dicts, they should have almost
identical performance (net of a small difference for the time to forward the
calls).


> Sets also need the
> same memory of dicts (can they be made to use less memory, not storing
> values? Maybe this requires too much code rewriting).

Yes, an alternate implementation could be written that would take less memory
(by about 1/3) and be a bit faster.  The approach would be to copy relevant
sections of the dictionary implementation and then remove anything pertaining to
dict values.  If the need arises and I find the time, this will probably happen
someday.

In the meantime, the current implementation is rock solid and gives great
performance.  Have you encountered situations where a 1/3 memory savings and a
slight speed improvement would have made a critical difference in a real
application?



> I presume such sets are like this because they are kind of dicts. If
> this is true, then converting a dict to a set (that means converting
> just the keys; this is often useful for a successive processing with
> set operators like intersection, etc) can be a very quick operation;
> but this little code shows me that this isn't true, set-ify a dict is
> even slower than doing it on a list. Can this operation made faster (by
> people that writes Python interpreter)?

Dicts take longer than lists because dict iteration takes longer than for lists.

If set-ifying a dictionary turns out to be an essential and time-critical
operation, it is possible to add some special case code for this.  The question
is whether it is worth it.   If you find the need is pressing, file a feature
request explaining why it is essential.  Then, I'll implement it for you.  On
the other hand, if this is a theoretical nice-to-have, then why clutter up the
code (with pre-sizing and forcing all values to True).



> I've studied the nice sets python module from Py2.3, and I am...
> well... writing something similar, a (python) graph module for sparse
> graphs. I've found 3 modules like that around on the Net, but I think
> they can be improved (even if I'm not an expert of Python, and I know,
> my code is probably quite rough/naive compared to "sets" module one...)
> Graphs are complex data structures, so probably one person needs from a
> graph data structure are different from other people needs, so it's
> probably not easy to define a "standard" graph module fit for everyone.
> Still, I think it can be useful to have a standard module to manage
> them.
> Do you think it be useful to add a (well made) standard module like
> that?

Start by posting a module outside the standard library; gathering user feedback;
and, if popular, propose it for inclusion in the standard library.

My guess is that there will be two issues.  One is that no one implementation of
graphs seems to satisfy all users.  The second is that only a small fraction of
Python users need for graph support (there is probably a much greater demand for
combinatorics, numeric/numarray, or financial modules).  If the appeal is not
broad, it has little chance of making it into the standard library.


Raymond Hettinger





More information about the Python-list mailing list