Merging overlapping spans/ranges

Jordan Rastrick jrastrick at student.usyd.edu.au
Tue May 10 13:31:40 EDT 2005


Max M wrote:
> I am writing a "find-free-time" function for a calendar. There are a
lot
> of time spans with start end times, some overlapping, some not.
>
> To find the free time spans, I first need to convert the events into
a
> list of non overlapping time spans "meta-spans".
>
> This nice ascii graph should show what I mean.
>
> 1) ---
> 2)     ---
> 3)   ---
> 4)          -----
> 5)             -----
>
>  >> -------  -------- # meta spans
>
> I can then iterate through the meta-spans and find non-busy times.
>
> I have written the class below, but it is rather O^2, so I wondered
if
> anybody has an idea for a better approach?
>
>
> ######################################
> # -*- coding: latin-1 -*-
>
> """
> 1) ---
> 2)     ---
> 3)   ---
> 4)          -----
> 5)             -----
>
>  >> -------  --------
> """
>
> class MetaSpans:
>
>      """
>      Populate with a list of span tuples [(start,end)], and it will
make
> "meta"
>      spans, with overlapping spans folded into one span.
>      """
>
>      def __init__(self):
>          self.spans = []
>
>      def add(self, span):
>          start, end = span
>          overlapping = [span]
>          non_overlapping = []
>          for spn in self.spans:
>              spn_start, spn_end = spn
>              # span rules for iterated spans
>              starts_before = spn_start <= start
>              ends_after = spn_end >= end
>              is_outside = starts_before and ends_after
>              starts_inside = start <= spn_start <= end
>              ends_inside =  start <= spn_end <= end
>              overlaps = starts_inside or ends_inside or is_outside
>              if overlaps:
>                  overlapping.append(spn)
>              else:
>                  non_overlapping.append(spn)
>          # overlapping spans are changed to one span
>          starts = []
>          ends = []
>          for start, end in overlapping:
>              starts.append(start)
>              ends.append(end)
>          min_start = min(starts)
>          max_end = max(ends)
>          non_overlapping.append( (min_start, max_end) )
>          self.spans = non_overlapping
>
>
>      def findFreeTime(self, duration):
>          self.spans.sort()
>
>
>
>
> if __name__ == '__main__':
>
>      ms = MetaSpans()
>      ms.add((0,3))
>      ms.add((4,7))
>      ms.add((2,5))
>      ms.add((9,14))
>      ms.add((12,17))
>      print ms.spans
>
>
>      from datetime import datetime
>      ms = MetaSpans()
>      ms.add((datetime(2005, 1, 1, 0, 0, 0), datetime(2005, 1, 1, 3,
0, 0)))
>      ms.add((datetime(2005, 1, 1, 4, 0, 0), datetime(2005, 1, 1, 7,
0, 0)))
>      ms.add((datetime(2005, 1, 1, 2, 0, 0), datetime(2005, 1, 1, 5,
0, 0)))
>      ms.add((datetime(2005, 1, 1, 9, 0, 0), datetime(2005, 1, 1, 14,
0, 0)))
>      ms.add((datetime(2005, 1, 1, 12, 0, 0), datetime(2005, 1, 1, 17,
0,
> 0)))
>      print ms.spans
>
>
>

I think the following code does what you want. It should be O(n log n)
- at least I hope thats what Python takes to sort the list of spans :)
Of course I've assumed you have the spans available to you all at once
as a list, and dont need to add them one at a time as you did in your
original code.

def get_metaspans(spans):
    """Given a list of span tuples [(start,end)], will generate all
    meta spans, with overlapping spans folded into one span.
    """
    spans.sort()
    spans = iter(spans)
    metaspan = spans.next()
    for span in spans:
        start, end = span
        m_start, m_end = metaspan
        if start > m_end:
            yield metaspan
            metaspan = span
        elif end > m_end:
            metaspan = (m_start, end)
    # Need to yield the final metaspan once the span list is exhausted
    yield metaspan

def get_breaks(metaspans):
    """Gets all the breaks in between a sequence of metaspans"""
    metaspans = iter(metaspans)
    _, prev_end = metaspans.next()
    for metaspan in metaspans:
        start, end = metaspan
        yield (prev_end, start)
        prev_end = end

I must admit I'm a bit of a generatoraholic, I'll tend to throw yields
at anything given half a chance :) Having to yield once more at the end
of the get_metaspans loop seems a little inelegant, maybe it could be
done a bit better. But its nice the way empty lists are handled
gracefully - the StopIteration thrown by the .next() calls are just
absorbed by the, uh, generatorness.

A little bit of testing:

>>> spans = [(12, 13), (0,3), (2,5), (1,4), (4,6), (1,2), (8,9), (9,
10)]
>>> print list(get_metaspans(spans))
[(0, 6), (8, 10), (12, 13)]
>>> print list(get_breaks(get_metaspans(spans)))
[(6, 8), (10, 12)]

Is that more or less what was required?




More information about the Python-list mailing list