get most common number in a list with tolerance

Fri Feb 20 05:15:42 EST 2009

Astan Chee wrote:
> Hi,
> I have a list that has a bunch of numbers in it and I want to get the
> most common number that appears in the list. This is trivial because I
> can do a max on each unique number. What I want to do is to have a
> tolerance, so that numbers are not treated as quite unique: if the
> difference from other numbers is small, then they can be counted together.
> This is what I have to get the common number (taken from the internet
> somewhere):
> 
> l = [10,30,20,20,11,12]
> d = {}
> tolerance = 5
> for elm in l:
>    d[elm] = d.get(elm, 0) + 1
> counts = [(j,i) for i,j in d.items()]
> 
> 
> 
> This of course returns a list where each of them is unique
> 
> [(1, 12), (1, 10), (1, 11), (2, 20), (1, 30)]
> 
> but I am expecting a list that looks like this:
> 
> [(3, 10), (2, 20), (1, 30)]
> 

Maybe group each number X by this value: tolerance * (X // tolerance)
(integer division, so X is rounded down to a multiple of tolerance),

or a variation if tolerance is non-integer?
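
For a non-integer tolerance, one possible variation is to floor the quotient explicitly. This is only a sketch: the helper name bucket_key is made up for illustration and is not part of the original suggestion.

```python
import math

def bucket_key(x, tolerance):
    # Round x down to the nearest multiple of tolerance.
    # math.floor makes the rounding explicit for int or float input.
    return tolerance * math.floor(x / tolerance)

print(bucket_key(12, 5))
print(bucket_key(1.2, 0.5))
```

With floats, a value sitting exactly on a bucket boundary can land one bucket low because of rounding in the division (0.3 / 0.1 is slightly less than 3 in binary floating point), so treat the boundaries as approximate.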

E.g. (rough):

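from itertools import groupby

def tmax(seq, alpha):
    # Find the largest group of values sharing a bucket of width alpha.
    longest = []
    # Floor division (//) rounds each value down to a multiple of alpha;
    # plain / only truncated for ints under Python 2.
    for k, g in groupby(sorted(seq), lambda x: alpha * (x // alpha)):
        g = list(g)
        if len(g) > len(longest):
            longest = g
    # The input was sorted, so this is the smallest member of that bucket.
    return longest[0]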

a = [10, 30, 20, 20, 11, 12]

assert tmax(a, 5) == 10
assert tmax(a, 1) == 20
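
To get the full list of (count, number) pairs that the original question asks for, the same bucketing idea can be applied with a dict. This is a rough sketch rather than a definitive answer: the name tolerant_counts is hypothetical, and representing each bucket by its smallest member is an assumption chosen to match the expected output in the question.

```python
def tolerant_counts(seq, tolerance):
    # Collect the members of each bucket, keyed by the value
    # rounded down to a multiple of tolerance.
    buckets = {}
    for x in seq:
        key = tolerance * (x // tolerance)
        buckets.setdefault(key, []).append(x)
    # Describe each bucket as (count, smallest member).
    counts = [(len(members), min(members)) for members in buckets.values()]
    # Most common first; ties broken by the smaller number.
    counts.sort(key=lambda pair: (-pair[0], pair[1]))
    return counts

print(tolerant_counts([10, 30, 20, 20, 11, 12], 5))
# prints [(3, 10), (2, 20), (1, 30)]
```

Note that the buckets are fixed intervals, so two numbers closer than the tolerance can still be counted separately if they straddle a boundary (14 and 15 with a tolerance of 5, for example).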