fast pythonic algorithm question

Guyon Morée guyon.moree at gmail.com
Tue Aug 1 12:53:00 EDT 2006


Brian you are right, but in my case (host, port, protocol) is unique.


bryanjugglercryptographer at yahoo.com schreef:

> Guyon Morée wrote:
> > i have a big list of tuples like this:
> >
> > [ (host, port, protocol, startime, endtime), .. ] etc
> >
> > now i have another big(ger) list of tuples like this:
> >
> > [(src_host, src_port, dest_src, dest_port, protocol, time), ... ] etc
> >
> > now i need to find all the items in the second list where either
> > src_host/src_port or dest_host/dest_port matches, protocol matches and
> > time is between starttime and end time.
> >
> > After trynig some stuff out i actually found dictionary lookup pretty
> > fast. Putting the first list in a dict like this:
> >
> > dict[(host,port,protocol)] = (starttime, endtime)
>
> That only works if each (host,port,protocol) can appear with only
> one (starttime, endtime) in your first big list. Do the variable
> names mean what they look like? There's nothing unusual about
> connecting to the same host and port with the same protocol, at
> multiple times.
>
> You might want your dict to associate (host,port,protocol) with a
> list, or a set, of tuples of the form (starttime, endtime). If the
> lists can be long, there are fancier methods for keeping the set
> of intervals and searching them for contained times or overlapping
> intervals. Google up "interval tree" for more.
> 
> 
> -- 
> --Bryan




More information about the Python-list mailing list