Merging overlapping spans/ranges

Bengt Richter bokr at oz.net
Wed May 11 14:50:13 EDT 2005


On Tue, 10 May 2005 23:53:43 GMT, Jim Sizelove <sizelji at insightbb.com> wrote:
>Bengt Richter wrote:
[...]
>> Maybe (not tested beyond what you see ;-)
I had that feeling ... somehow the word "max" was trying to get my attention,
but I posted without figuring out why ;-/
>
>It is with some trepidation that I write this message; after hanging 
>around in c.l.p for the better part of a year, I have come to expect 
>Bengt's messages to be right-on and error-free.
Would that 'twere so ;-)

>
>However this time...  :)
>>  >>> def mergespans(spans):
>>  ...      start = end = None
>>  ...      for s,e in sorted(spans):
>>  ...          if start is None: start, end = s,e; continue
>>  ...          if s <= end: end = e; continue
>>  ...          yield start, end
>>  ...          start,end = s,e
>>  ...      if start is not None: yield start, end
>>  ...
>>  >>> spans = [(0,3), (4,7), (2,5), (9,14), (12,17)]
>>  >>> list(mergespans(spans))
>>  [(0, 7), (9, 17)]
>
>There can be a problem in mergespans if one span fits inside another:
>
Good call. I should have thought of it.

>  >>> spans = [(0,5), (4,7), (2,3), (9,14), (12,17)]
>  >>> list(mergespans(spans))
>  [(0, 3), (4, 7), (9, 17)]
>
>Here is a revised version (not tested beyond what you see ;-)
>  >>> def mergespans(spans):
>  ...     start = end = None
>  ...     for s,e in sorted(spans):
>  ...         if start is None:
>  ...             start, end = s,e
>  ...             continue
>  ...         if s <= end:
>  ...             if end < e:
>  ...                 end = e
>  ...             continue
>  ...         yield start, end
>  ...         start,end = s,e
>  ...     if start is not None:
>  ...         yield start, end
>  ...
>  >>> list(mergespans([(0,5), (4,7), (2,3), (9,14), (12,17)]))
>  [(0, 7), (9, 17)]
>  >>> list(mergespans([(0,3), (4,7), (2,5), (9,14), (12,17)]))
>  [(0, 7), (9, 17)]
>
>Can anyone find any other errors in this? ;-)
>
Not off hand. OTOH, I really didn't like the aesthetics of all that crufty logic in my original.
Maybe (again, barely tested ;-)

 >>> def mergespans(spans):
 ...     it = iter(sorted(spans))
 ...     start, end = it.next()
 ...     for s,e in it:
 ...         if s <= end:
 ...             end = max(end, e) # was what "max" was trying to tell me, I think
 ...         else:
 ...             yield start, end
 ...             start,end = s,e
 ...     yield start, end
 ...
 >>> list(mergespans([(0,5), (4,7), (2,3), (9,14), (12,17)]))
 [(0, 7), (9, 17)]
 >>> list(mergespans([(0,3), (4,7), (2,5), (9,14), (12,17)]))
 [(0, 7), (9, 17)]


>With humble regards,
Me too. UIAM we're all still human here (even my favorite 'bots, I think  ;-)

Regards,
Bengt Richter



More information about the Python-list mailing list