set, dict and other structures

bearophileHUGS at lycos.com bearophileHUGS at lycos.com
Wed Feb 2 07:54:32 EST 2005


Leif K-Brooks:

>They look exactly the same speed-wise to me:

There's a little overhead, you can see it with a bigger test:

.from time import clock
.n = 1*10**6
.
.t1 = clock()
.d = set()
.for i in xrange(n):
.  d.add(i)
.t2 = clock()
.for i in xrange(n):
.  d.remove(i)
.t3 = clock()
.d = set(xrange(n))
.t4 = clock()
.print round(t2-t1,2), round(t3-t2,2), round(t4-t3,2), round(t4-t1,2)
.
.t1 = clock()
.d = {}
.for i in xrange(n):
.  d[i] = None
.t2 = clock()
.for i in xrange(n):
.  del d[i]
.t3 = clock()
.d = {}.fromkeys(xrange(n))
.t4 = clock()
.print round(t2-t1,2), round(t3-t2,2), round(t4-t3,2), round(t4-t1,2)

----------------------

Jeremy Bowers:

>I don't see a success pattern for it; I'd say they foundered on trying
to shoot for the moon in advance, instead of starting to write code and
incrementally improving it.<

I agree, their ideas seem too much abstract and big (and the resulting
code looks sloow); I am aiming to something quite simpler, like the
"sets" python module of Py2.3 (but the after the python version
development & testing, a C version can be useful, just like the set
objects in Py2.4).
My (simple) graph module is already nearly done ^_^

>Trying to create a graph library that meets any significant percentage
of
the needs of the people who would use it is quite non-trivial.<

This is right. But:
1) I've found some implementations of graphs:
- One inside Gato: http://www.zpr.uni-koeln.de/~gato/About/
- http://pygraphlib.sourceforge.net/dist/
- http://pygraphlib.sourceforge.net/doc/public/pygraphlib-module.html
- http://www.ics.uci.edu/~eppstein/PADS/
Some modules from Julien Burdy can be found, etc. I think I can find 5
different graph python modules, this number tells me that there's some
people that need/want them.
2) A graph data structure written in C probably cannot be the right one
for every person, because there are so many kinds of graphs... but if
it's fast and general enough and it's already there (and simple
enough!), then most people will start using it, adjusting their needs
to it...

----------------------

Giovanni Bajo:

>In fact, for a moment I wondered if dict.keys() was already of type
set in 2.4, because it could have made sense.<

Well, I presume sometimes you need a list, so maybe something like
dict.keysset() can be a better idea.


>Of course, there are some nitpicks.<

Yes.


>For instance, in the last case the dictionary at the intersection
should probably hold a tuple of two values, one coming from each
dictionary (since the intersection finds the items whose key is present
in both dictionaries, but the value can obviously be different).<

There are other possibilities.


>Another solution would be to say that set-like operations, like (d1 &
d2),
return an iterator to a sequence of keys *only*, in this case the
sequence of
keys available in both dictionaries.<

Right ^_^

Then let's see something:

a = {}.fromkeys(range(10),0)
b = {}.fromkeys(range(5,13),1)

1) a - b ==> dict
Its meaning can be something like:
r = {}
for k in set(a).difference(b): r[k] = a[k]

For functional programming people, it can even mean something like:
r = {}
scan(curry(r.__setitem__(Blank, a[k])), set(a).difference(b))
(Unlike map, scan does not build up a new expression to return.)


2) a + b ==> dict
Its meaning:
r = a.copy()
r.update(b)


3) The intersection is a bit tricky. I don't like the idea of resulting
a dict with tuple values, with both the values of the shared keys. The
other idea is nicer:
a.intersection(b)
Its possible meaning:
3a) set(a).intersection(b)
Alternative meaning:
3b) list(set(a).intersection(b))
(Instead of adding this intersection to dicts, it can just be made a
quite faster dict=>set conversion to make set(a).intersection(b) fast.)


4) Another possibility is less common, but easy to implement, so this
is optional:
a.xor(b)
Its meaning:
r = {}
for x in a: if x not in b: r[x] = r[a]
for x in b: if x not in a: r[x] = r[b]

(Probably there are better ways to implement them, here I've just used
python as a kind of pseudocode).

Bear hugs,
Bearophile




More information about the Python-list mailing list