mapping subintervals

Lee Sander lesande at gmail.com
Wed Jun 13 15:32:13 EDT 2007


hi,
I have the following problem which is turning out to be non-trivial. I
realize that this is not
exactly a python problem but more of an algorithm problem -- but I
post it here because
I want to implement this in python.

I want to write a code that given an interval (integer tuple:
start,stop) will find which other
interval it matches to. Here is an example,
suppose I have the following NON-OVERLAPPING intervals

10 - 21   ==> 'a1'
34 - 51   ==> 'a2'
77 - 101 ==> 'a3'
etc

So, suppose I'm give an interval such as (42,50), I should go through
the list and map it to "a2" etc
if the region is a subset of an exiting interval.  If the query
interval does not exist in the table or
maps to two or more intervals (eg, (45, 80)) the program return a
None.

One naive way to solve this problem is to create an array such as
follows:
[None, None, None, ...., None, a1, a1, a1, ..., a1, None, None ...,
None, a2, ... etc] at indicies
  1        2        3              9       10  11  12       21
22      23          33     34, ...

now with this in place I can easily solve the problem. However, this
is not a feasable solution
because the initial list has intervals whose range can go to billions!
So I need a smarter
idea. So what I can do is sort the initial list and then go through
each element from the start
and test if the  a > X[i][0] and b < X[i][1]
where (a,b) is my query start and stop
and X is a n x 2 array with the known intervals.
I don't like this solution because it can be really really slow as for
each query I'm doing a
linear search and on average I'll be searching almost half the list
before I find the right
interval.
Is there a smarter way to do this?
I've my problem is not clear please let me know and I'll try to
explain the unclear parts again.
Many thanks
Lee




More information about the Python-list mailing list