Collaps arrays/ list of intergers

Peter Otten __peter__ at web.de
Tue Aug 19 13:52:11 EDT 2014


Jurgens de Bruin wrote:

> Hi All,
> 
> I do hope somebody can help me with the following:
> I have the followings lists which represent the upper and lower value of a
> range/array.
> 
> a = [1,50]
> b = [75,150]
> c = [25,42]
> d = [120,149]
> e = [35,55]
> 
> What I would like to happen is that overlapping range will "collapse" to a
> single range meaning the above list would become:
> 
> as list a,c and e overlap they can be represented by
> f = [1,55]
> as list b and d overlap they can be represented by
> g = [75,150]
> 
> I have sort of got working solution using networkx and numpy and they work
> perfect for the above example the problem I have no is my true data set
> looks more like the following:
> 
> x = [135098,19854736]
> y = [135098,41639553]
> z = [11818998,12587339]
> 
> To give only three examples using networkx and numpy does not work as this
> results in memory error due to the ranges being so large.
> 
> I would appreciate any help with this.

The naive approach would be to sort the intervals, take one and extend it 
while the lower bound is inside the current interval. Here is my attempt (no 
warranties):

$ cat merge_intervals.py         
def merge_intervals(intervals):
    intervals = iter(sorted(intervals))

    current_lo, current_hi = next(intervals)
    for lo, hi in intervals:
        if lo <= current_hi:
            if hi > current_hi:
                current_hi = hi
        else:
            yield [current_lo, current_hi]
            current_lo = lo
            current_hi = hi
    yield [current_lo, current_hi]


def demo(intervals):
    print("before")
    for ivl in intervals:
        print(ivl)
    print("")

    intervals = list(merge_intervals(intervals))
    print("after")
    for ivl in intervals:
        print(ivl)
    print("\n")

demo(
    [
        [1,50],
        [75,150],
        [25,42],
        [120,149],
        [35,55],
    ]
)

demo(
    [
        [135098,19854736],
        [135098,41639553],
        [11818998,12587339],
        [10**17, 10**20],
        [10**15, 10**18],
        [10**20+1, 10**30],
    ]
)
$ python3 merge_intervals.py 
before
[1, 50]
[75, 150]
[25, 42]
[120, 149]
[35, 55]

after
[1, 55]
[75, 150]


before
[135098, 19854736]
[135098, 41639553]
[11818998, 12587339]
[100000000000000000, 100000000000000000000]
[1000000000000000, 1000000000000000000]
[100000000000000000001, 1000000000000000000000000000000]

after
[135098, 41639553]
[1000000000000000, 100000000000000000000]
[100000000000000000001, 1000000000000000000000000000000]







More information about the Python-list mailing list