fast pythonic algorithm question

bryanjugglercryptographer at yahoo.com bryanjugglercryptographer at yahoo.com
Tue Aug 1 12:15:32 EDT 2006


Guyon Morée wrote:
> i have a big list of tuples like this:
>
> [ (host, port, protocol, startime, endtime), .. ] etc
>
> now i have another big(ger) list of tuples like this:
>
> [(src_host, src_port, dest_src, dest_port, protocol, time), ... ] etc
>
> now i need to find all the items in the second list where either
> src_host/src_port or dest_host/dest_port matches, protocol matches and
> time is between starttime and end time.
>
> After trynig some stuff out i actually found dictionary lookup pretty
> fast. Putting the first list in a dict like this:
>
> dict[(host,port,protocol)] = (starttime, endtime)

That only works if each (host,port,protocol) can appear with only
one (starttime, endtime) in your first big list. Do the variable
names mean what they look like? There's nothing unusual about
connecting to the same host and port with the same protocol, at
multiple times.

You might want your dict to associate (host,port,protocol) with a
list, or a set, of tuples of the form (starttime, endtime). If the
lists can be long, there are fancier methods for keeping the set
of intervals and searching them for contained times or overlapping
intervals. Google up "interval tree" for more.


-- 
--Bryan




More information about the Python-list mailing list