complaints about no replies last week

Arnaud Delobelle arnodel at googlemail.com
Mon Mar 30 16:38:03 EDT 2009


pruebauno at latinmail.com writes:
[...]
> I myself asked about how to write a library to efficiently do union
> and intersection of sets containing time intervals some time ago on
> this list and got little to no answers. It is a tricky problem. Since
> I was getting paid I got an O(n*n) solution working. People on this
> list on the other hand do not get paid and answer whatever strikes
> their fancy. Sometimes the question is hard or confusing and nobody is
> motivated enough to answer.

I wasn't around when you posted this I guess. Do you mean intervals sets
on the (real) number line such as:
      
      1-3, 6-10 meaning all numbers between 1 and 3 and all numbers
      between 6 and 10.

In this case I think you can achieve union and intersection in O(nlogn)
where n is the total number of intervals in the interval sets to unify
or intersect. There is an implementation below. I have chosen a very
simple data structure for interval sets: an interval set is the list of
its endpoints. E.g.

    1-3, 6-10 is the list [1, 3, 6, 10]

This means that I can't specify whether an interval is closed or open.
So in the implementation below all intervals are assumed to be open.
The method could be made to work for any kind of intervals with the same
complexity, there would just be a few more LOC.  I'm focusing on the
principle - here it is:


--------------------------------------------------
# Implementation of union and intersection of interval sets.

from itertools import *

def combine(threshold, intsets):
    endpoints = sorted(chain(*imap(izip, intsets, repeat(cycle([1,-1])))))
    height = 0
    compound = []
    for x, step in endpoints:
        old_height = height
        height += step
        if max(height, old_height) == threshold:
            compound.append(x)
    return compound

def union(*intsets):
    return combine(1, intsets)

def intersection(*intsets):
    return combine(len(intsets), intsets)

# tests

def pretty(a):
    a = iter(a)
    return ', '.join("%s-%s" % (a, b) for a, b in izip(a, a))

tests = [
    ([1, 5, 10, 15], [3, 11, 13, 20]),
    ([2, 4, 6, 8], [4, 7, 10, 11]),
    ([0, 11], [5, 10, 15, 25], [7, 12, 13, 15]),
    ]

for intsets in tests:
    print "sets: ", "; ".join(imap(pretty, intsets))
    print "union: ", pretty(union(*intsets))
    print "intersection: ", pretty(intersection(*intsets))
    print "-"*20
--------------------------------------------------

Is this what you were looking for?

-- 
Arnaud



More information about the Python-list mailing list