set, dict and other structures

bearophileHUGS at lycos.com bearophileHUGS at lycos.com
Mon Jan 31 18:39:22 EST 2005


I'm frequently using Py2.4 sets, I find them quite useful, and I like
them, even if they seem a little slower than dicts. Sets also need the
same memory of dicts (can they be made to use less memory, not storing
values? Maybe this requires too much code rewriting).
I presume such sets are like this because they are kind of dicts. If
this is true, then converting a dict to a set (that means converting
just the keys; this is often useful for a successive processing with
set operators like intersection, etc) can be a very quick operation;
but this little code shows me that this isn't true, set-ify a dict is
even slower than doing it on a list. Can this operation made faster (by
people that writes Python interpreter)?

.from time import clock
.for p in xrange(16, 20):
.    n = 2**p
.    l = range(n)
.    t = clock()
.    set(l)
.    print n, round(clock()-t,3),
.    d = {}.fromkeys(xrange(n))
.    t = clock()
.    set(d)
.    print round(clock()-t,3)

==>
65536 0.082 0.1
131072 0.205 0.218
262144 0.45 0.562
524288 1.014 1.029

----------------------

Dicts and sets are two different data structures, but they share some
similarities. So instead of always converting dicts to sets, maybe some
fast set-like operations/methods can be added to dicts, like
intersection, mass removal, etc. (The result of *some* of such
operations between two dicts can be a regular set, and not a dict. This
is maybe not very "clear" from an formal/theoretical point of view).

-------------

I've studied the nice sets python module from Py2.3, and I am...
well... writing something similar, a (python) graph module for sparse
graphs. I've found 3 modules like that around on the Net, but I think
they can be improved (even if I'm not an expert of Python, and I know,
my code is probably quite rough/naive compared to "sets" module one...)
Graphs are complex data structures, so probably one person needs from a
graph data structure are different from other people needs, so it's
probably not easy to define a "standard" graph module fit for everyone.
Still, I think it can be useful to have a standard module to manage
them.
Do you think it be useful to add a (well made) standard module like
that?

Bear hugs,
Bearophile




More information about the Python-list mailing list