efficient intersection of lists with rounding

Adam DePrince adam at cognitcorp.com
Thu Dec 2 23:02:57 EST 2004


On Thu, 2004-12-02 at 22:16, Greg Ewing wrote:
> Gordon Williams wrote:
> >>>>a= [(123,1.3),(123,2.4),(123,7.8),(123,10.2)]
> >>>>b= [(123, 0.9), (123, 1.9), (123, 8.0)]
> >>>>[ (i,round(j)) for i,j in a for l,m in b if (i,round(j)) ==
> > (l,round(m))]
> 
> d = {}
> for (l, m) in b:
>    d[l, round(m)] = 1
> 
> result = []
> for (i, j) in a:
>    t = (i, round(j))
>    if t in d:
>      result.append(t)
> 
> > - in the future I will want to round the second number of closest 0.25
> > rather than whole number.
> 
> I would do that by multiplying by 4 and rounding to
> an integer to derive the dictionary key. That will
> avoid any float-representation problems you might have
> by trying to round to a fraction.
> 
> > Would the sets module be more efficient?
> 
> As another poster said, sets are implemented as dicts
> in 2.3, so it comes down to much the same thing. Using
> sets might be a bit faster than the above code in 2.4,
> but probably not greatly so. By far the biggest
> improvement will come from using an O(n) algorithm
> instead of an O(n**2) one.

Of course a low O-factor is important; you should avoid however
confusing the statement of what you want to do with the statement of how
you want to do it.  One of the benefits of a HLL like Python is you can
merely state *what* you want without worrying about how to compute it.  

In the original example above you are computing a set intersection -
python's set object has an intersection method.  Use it.  Not only is it
faster than your O**2 solution, but it is a good deal clearer.

>>> from sets import Set
>>> set_a = Set( [(i,round(j)) for i,j in a] )
>>> set_b = Set( [(i,round(j)) for i,j in b] )
>>> set_a.intersection( set_b )
Set([(123, 2.0), (123, 1.0), (123, 8.0)])

Or you could say ... 

>>> set_a, set_b = [[Set((i,round(j))) for i,j in s] for s in (a,b )]



Adam DePrince 





More information about the Python-list mailing list