time consuming loops over lists

Peter Otten __peter__ at web.de
Wed Jun 8 05:18:18 EDT 2005


querypk at gmail.com wrote:

> Can someone help me improve this block of code? It just converts the
> list of data into tokens based on the range each value falls into, but it
> takes a long time. Can someone tell me what I can change to improve
> it?
> 
> def Tkz(tk,data):
>      no_of_bins = 10
>      tkns = []
>      dmax = max(data)+1
>      dmin = min(data)
>      rng = ceil(abs((dmax - dmin)/(no_of_bins*1.0)))
>      rngs = zeros(no_of_bins+1)
>      for i in xrange(no_of_bins+1):
>           rngs[i] = dmin + (rng*i)
>      for i in xrange(len(data)):
>           for j in xrange(len(rngs)-1):
>                if data[i] in xrange(rngs[j],rngs[j+1]):
>                     tkns.append( str(tk)+str(j) )
>      return tkns

On second thought, with bins of equal size you have an option that is even
faster than bisect:

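from math import ceil

def bins_calc(tk, data, no_of_bins=10):
    # Assumes integer data; tk is converted with str() as in the original Tkz.
    dmax = max(data) + 1  # exclusive upper bound
    dmin = min(data)
    delta = dmax - dmin
    # Float division so ceil() rounds up; int/int would truncate first.
    rng = int(ceil(delta / float(no_of_bins)))
    tokens = [str(tk) + str(i) for i in xrange(no_of_bins)]
    # Equal-width bins: compute each value's bin index directly.
    return [tokens[(v - dmin) // rng] for v in data]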
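
For comparison, here is a minimal sketch of the bisect-based lookup next to the direct index computation. It is written for a current Python (range instead of xrange) and assumes integer data. The helper names bin_edges, tokens_by_bisect and tokens_by_division are made up for this illustration and do not appear in the thread.

```python
from bisect import bisect_right
from math import ceil


def bin_edges(data, no_of_bins=10):
    """Return (dmin, width) for equal-width integer bins covering data."""
    dmin = min(data)
    dmax = max(data) + 1  # exclusive upper bound, as in the code above
    width = int(ceil((dmax - dmin) / float(no_of_bins)))
    return dmin, width


def tokens_by_bisect(tk, data, no_of_bins=10):
    """Binary search over the bin boundaries: O(log bins) per value."""
    dmin, width = bin_edges(data, no_of_bins)
    # Lower boundaries of bins 1..no_of_bins-1; values past the last
    # boundary land in the final bin.
    bounds = [dmin + width * i for i in range(1, no_of_bins)]
    return [str(tk) + str(bisect_right(bounds, v)) for v in data]


def tokens_by_division(tk, data, no_of_bins=10):
    """Integer division by the bin width: O(1) per value."""
    dmin, width = bin_edges(data, no_of_bins)
    return [str(tk) + str((v - dmin) // width) for v in data]


data = [3, 7, 12, 25, 40, 41, 99, 100]
# Both approaches should assign every value to the same bin.
assert tokens_by_bisect("t", data) == tokens_by_division("t", data)
```

The bisect version still works when the bins have unequal widths, since it only needs a sorted list of boundaries. The division version relies on all bins having the same width.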

Peter



