An efficient, pythonic way to calculate result sets

mensanator at aol.com mensanator at aol.com
Thu Oct 25 14:44:53 EDT 2007


On Oct 25, 10:31 am, happyhon... at gmail.com wrote:
> Hello everyone,
>
> I've got a little issue, both programming and performance-wise. I have
> a set, containing objects that refer to other sets. For example, in a
> simple notation: (<a, b, c>, <d, e>) (or in a more object-like
> display: set(obj1.choices=set(a, b, c) ). There may be obj1..objN
> objects in the outer set, and the amount of items in the choices sets
> could also reach a high number. The objects in the choices sets might
> overlap.
>
> Now I want to have all possible combinations, like this: (a, d), (b,
> d), (c, d), (a, e), (b, e), (c, e).
>
> However, I've got a few catches that make an already (icky) recursive
> function worse to use.
>
> First of all, I want to be able to only choose things so that the
> outer 'result sets' have the same length. For example, if you'd have
> (<a, b>, <a, c>), you might pick (a, a) with a simple algorythm, the
> basic behaviour of sets reducing it to (a) and thus having an improper
> length. I could add yet another loop after calculating everything to
> throw out any result sets with the improper length, but that would be
> highly inefficient.

Does this do what you want?

result_set = set([])
seta = set(['a','b','c','d','e'])
setb = set(['a','c','f','g','h'])
for i in seta:
  temp1 = setb.difference(set([i]))
  for j in temp1:
    result_set.add(tuple(set([i,j])))
for i in result_set:
  print i

I figure there should be 4+5+3+5+5 results.
No ('a'), no ('c'). Has ('a','c') but not ('c','a').

##  ('c', 'g')
##  ('a', 'd')
##  ('h', 'e')
##  ('a', 'b')
##  ('c', 'f')
##  ('e', 'g')
##  ('c', 'b')
##  ('d', 'f')
##  ('a', 'g')
##  ('a', 'h')
##  ('c', 'e')
##  ('e', 'f')
##  ('d', 'g')
##  ('h', 'b')
##  ('a', 'f')
##  ('b', 'f')
##  ('c', 'd')
##  ('h', 'c')
##  ('a', 'c')
##  ('b', 'g')
##  ('a', 'e')
##  ('h', 'd')

>
> Second, I'd hope to be able to say that objX should be assumed to have
> made the choice z. In the first example I mentioned, if I said that
> 'obj1 == a', the only result sets that would come out would be (a, d)
> and (a, e).

Like this?

result_set = set([])
seta = set(['a','b','c','d','e'])
setb = set(['a','c','f','g','h'])
target = 'a'
for i in seta:
  temp1 = setb.difference(set([i]))
  for j in temp1:
    temp2 = set([i,j])
    if target in temp2:
      result_set.add(tuple(temp2))
for i in result_set:
  print i

##  ('a', 'f')
##  ('a', 'g')
##  ('a', 'd')
##  ('a', 'e')
##  ('a', 'h')
##  ('a', 'b')
##  ('a', 'c')


>
> I've been toying with this problem for a while, but I've found out it
> quickly gets slow, so I hope some people here could find a way to
> write code like this that is efficient (and hopefully not rely on
> recursion and 'fix up' loops like I've got mockups with right now).
>
> Thank you for any suggestions you can offer.





More information about the Python-list mailing list