Algorithm for Creating Supersets of Smaller Sets Based on Common Elements

Steven D'Aprano steve+comp.lang.python at pearwood.info
Sat Feb 21 23:02:29 EST 2015


TommyVee wrote:

> Start off with sets of elements as follows:
> 
> 1. A,B,E,F
> 2. G,H,L,P,Q
> 3. C,D,E,F
> 4. E,X,Z
> 5. L,M,R
> 6. O,M,Y
> 
> Note that sets 1, 3 and 4 all have the element 'E' in common, therefore
> they are "related" and form the following superset:
> 
> A,B,C,D,E,F,X,Z

Sounds to me like you already have an algorithm in mind. It might not be
correct or complete, or it might not be efficient, but you performed some
steps in your own mind to generate that superset. Write down those steps
and you have an algorithm. Once you have the algorithm, you can then turn
it into Python code and start revising it from there.

I went back to first principles and drew a set diagram with multiple
disjoint sets of overlapping sets, and then thought about the process of
*adding* each new set to the set-of-supersets. It's hard to describe, but
if you start drawing an empty Venn diagram, then add one of your sets to
the Venn diagram at a time, merging that set to whatever superset it
belongs to, the process should appear quite obvious.

Let me walk through the process with your six sets above, plus one extra.
First, I create a list of supersets, initially empty:

supersets = []

I look at Set #1, and compare to all the supersets. There aren't any, so I
make a copy of Set #1 and add it to the supersets:

[{A,B,E,F}]

I look at Set #2, and compare it to all the supersets. It doesn't intersect
any of them, so I add it as a new superset:

[{A,B,E,F}, {G,H,L,P,Q}]

I look at Set #3. It intersects the first superset, but not the second, so I
merge them. Likewise for Set #4:

[{A,B,C,D,E,F}, {G,H,L,P,Q}]
[{A,B,C,D,E,F,X,Z}, {G,H,L,P,Q}]

Set #5 intersects the second superset, but not the first. Same for Set #6:

[{A,B,C,D,E,F,X,Z}, {G,H,L,M,P,Q,R}]
[{A,B,C,D,E,F,X,Z}, {G,H,L,M,O,P,Q,R,Y}]

Now let me add an extra Set #7, {A, R}, which intersects *both* supersets.
First I merge it with both:

[{A,B,C,D,E,F,R,X,Z}, {A,G,H,L,M,O,P,Q,R,Y}]

but I don't stop there. Then I recursively run through the supersets with
the same merging process to get:

[{A,B,C,D,E,F,G,H,L,M,O,P,Q,R,X,Y,Z}]

which is the single minimal, disjoint superset of all seven sets.



def merge(sets):
    """Merge a collection of sets into a list of minimal, disjoint
    supersets.
    """
    supersets = []
    for a in sets:
        count = 0
        for s in supersets:
            if a&s:
                s.update(a)
                count += 1
        if count == 0:
            supersets.append(a.copy())
        elif count > 1:
            supersets = merge(supersets)
    return supersets


Testing this with your six sets gives:

py> sets = [set(s) for s in ("ABEF", "GHLPQ", "CDEF", "EXZ", "LMR", "OMY")]
py> for s in merge(sets):
...     print(','.join(sorted(s)))
...
A,B,C,D,E,F,X,Z
G,H,L,M,O,P,Q,R,Y



Adding set #7 gives:

py> for s in merge(sets + [{'A', 'R'}]):
...     print(','.join(sorted(s)))
...
A,B,C,D,E,F,G,H,L,M,O,P,Q,R,X,Y,Z





-- 
Steven




More information about the Python-list mailing list