How to identify which numbers in a list are within each others' range

Karthik Gurusamy kar1107 at gmail.com
Fri Feb 1 15:17:39 EST 2008


On Jan 31, 8:12 am, erikcw <erikwickst... at gmail.com> wrote:
> Hi,
>
> I have a list of numbers each with a +/- margin of error.  I need to
> identify which ones overlab each other.
>
> For example:
> 55 +/- 3
> 20 +/- 2
> 17 +/- 4
> 60 +/- 3
>
> #base, max, min
> list = [
> (55, 58, 52),
> (20, 22, 18),
> (17, 21, 13),
> (60, 63, 57),
> ]
>
> In this example the range of list[0] overlaps the range of list[3] AND
> list[1] overlaps list[2]
>
> What is the best way to in python to identify the list items that
> overlap and the items that don't overlap with any other.

Note you just need the left-end and right-end of each interval. Mean
is redundant information. Once you sort the interval, you can just go
from left to right, retaining only necessary information.

Below method is O(n log n) + O (nk) where k is the average overlaps
per interval.
On most average case, first term dominates and hence its O(n log n);
worst case is ofcourse O(n^2) (a simple case is all n intervals
overlapping with each other)


def strip_sort (a, b):
    if a[0] < b[0]:
        return -1
    if a[0] > b[0]:
        return 1
    # To allow overlaps on a point, return L first then R
    # If overlap on a point must not be allowed, return 1 below
    if a[0] == 'L': return -1
    return 0

def overlaps (strips_given):
    # split each interval into two items. basically decorate with 'L'
for left-end of the interval, 'R' for right end of the interval
    s2 = [((s[0], 'L', i) , (s[1], 'R', i)) for i,s in
enumerate(strips_given)]
    strips = []
    for s in s2:  # flatten out the tuples
        strips.append(s[0])
        strips.append(s[1])

    clique = set() # set of nodes where each overlap with everyone
else in the set
    edges = []  # we are constructing a graph on N nodes where edge
i,j implies i and j overlap
    # below is O(N log N) where is N is number of intervals
    strips.sort(cmp=strip_sort)
    for s in strips:
        node = s[2]
        if s[1] == 'L':
            clique.add(node)
        if s[1] == 'R':
            # below is O(k) where k is clique size (overlaps per
interval)
            new_edges = [(node, i) for i in clique if i != node]
            edges += new_edges
            clique.remove(node)
    return edges

def main():
    lst = [(52, 58), (18, 22), (13, 21), (57, 63)]
    print overlaps(lst)

Output:
[(2, 1), (0, 3)]

Karthik

>
> Thanks!
> Erik




More information about the Python-list mailing list