mapping subintervals
Matteo
mahall at ncsa.uiuc.edu
Wed Jun 13 17:08:12 EDT 2007
On Jun 13, 2:32 pm, Lee Sander <lesa... at gmail.com> wrote:
> hi,
> I have the following problem which is turning out to be non-trivial. I
> realize that this is not
> exactly a python problem but more of an algorithm problem -- but I
> post it here because
> I want to implement this in python.
>
> I want to write a code that given an interval (integer tuple:
> start,stop) will find which other
> interval it matches to. Here is an example,
> suppose I have the following NON-OVERLAPPING intervals
>
> 10 - 21 ==> 'a1'
> 34 - 51 ==> 'a2'
> 77 - 101 ==> 'a3'
> etc
>
> So, suppose I'm give an interval such as (42,50), I should go through
> the list and map it to "a2" etc
> if the region is a subset of an exiting interval. If the query
> interval does not exist in the table or
> maps to two or more intervals (eg, (45, 80)) the program return a
> None.
...snip...
> Many thanks
> Lee
OK - I'm going to assume your intervals are inclusive (i.e. 34-51
contains both 34 and 51).
If your intervals are all really all non-overlapping, one thing you
can try is to put all the endpoints in a single list, and sort it.
Then, you can use the bisect module to search for intervals, which
will give you a logarithmic time algorithm.
Here, I'm going to assume you just need the index of the containing
interval. If you really need a name (i.e. 'a1' or 'a2'), you can use a
list of names, and index into that.
I hope those assumptions are valid! if so, the following should work:
from bisect import bisect
# assume initial intervals are sorted
intervals=[(10,21),(34,51), (77,101)]
# get a sorted list of endpoints
endpts=sorted([a for a,b in intervals]+[b for a,b in intervals])
def test_interval(ivl,endpts):
# Find where the left endpoint of ivl would lie in the list
# i.e. the index of the first element greater than ivl[0]
l_idx=bisect(endpts,ivl[0])
# If l_idx is even, then it lies between two intervals, and thus
# is not contained in any interval. Otherwise it returns the index
if l_idx % 2 == 0: return None
# if l_idx is out of bounds (i.e. ==len(endpts)), then ivl is
# not contained in any interval (too large)
if l_idx==len(endpts): return None
# Now, all we need to do is check that the right endpoint of ivl is
# less than or equal to the corresponding endpoint in the list.
# Happily, that endpoint is at index l_idx
if ivl[1]<=endpts[l_idx]:
# So return the index of the interval:
return (l_idx-1)/2
else:
return None
Then, from a shell:
>>> print tst.test_interval((0,13),endpts)
None
>>> print tst.test_interval((15,21),endpts)
0
>>> print tst.test_interval((35,40),endpts)
1
>>> print tst.test_interval((40,80),endpts)
None
>>> print tst.test_interval((109,200),endpts)
None
More information about the Python-list
mailing list