Merging overlapping spans/ranges

Steven Bethard steven.bethard at gmail.com
Tue May 10 17:05:46 EDT 2005


elbertlev at hotmail.com wrote:
> The linear method:
> 
> You create an array - one bool per minute. For one day 24 * 60 entries
> is enough. Spans (Start, End) are in minutes from midnight. Set array
> slots in range(Start, End) to True for each input span.
> 
> Scan the array and find metaspans - contiguous sequences of False.

Yeah, this would basically have been my suggestion.  One implementation:

py> def merge(spans, nbins):
...     bins = [0]*nbins
...     for start, end in spans:
...         for bin in xrange(start, end):
...             bins[bin] += 1
...     def key((i, count)):
...         return count != 0
...     index_groups = itertools.groupby(enumerate(bins), key=key)
...     for isnonzero, indices in index_groups:
...         if isnonzero:
...             indices = [i for (i, count) in indices]
...             yield (indices[0], indices[-1])
...
py> spans = [(0,3), (4,7), (2,5), (9,14), (12,17)]
py> list(merge(spans, 20))
[(0, 6), (9, 16)]

Obviously, this needs some modification to handle datetime objects.

STeVe



More information about the Python-list mailing list