Whrandom

Tue May 15 11:42:20 EDT 2001

"Duncan Smith" <buzzard at urubu.freeserve.co.uk> wrote in
news:9dpvgb$lfd$1 at news8.svr.pol.co.uk: 

> Now I am a statistician and not a computer scientist.  I have developed
> a habit of using dictionaries (as above) rather than lists because I am
> under the impression it is more efficient than checking the current
> list for a match on each iteration.  Could any computer scientists out
> there confirm this (or point me in a more efficient direction). 
> Cheers. 
> 

It is easy enough to check:
----- timing.py -----
import random
import time

def randomdict(lo, hi, count, randint=random.randint):
    """Return a list of 'count' random numbers
    from lo to hi, using a dictionary to check uniqueness"""
    dict = {}
    while len(dict) < count:
        dict[randint(lo, hi)] = None
    return dict.keys()

def randomlist(lo, hi, count, randint=random.randint):
    """Return a list of 'count' random numbers
    from lo to hi, using a list to check uniqueness"""
    nums = []
    append = nums.append
    while len(nums) < count:
        r = randint(lo, hi)
        if r not in nums:
           append(r)
    return nums

def timetest(function, lo, hi, count, loop):
    start = time.time()
    for i in range(loop):
        res = function(lo, hi, count)
    end = time.time()
    print "%d calls to %s(%d,%d,%d) took %.2f seconds, %.2fmS/call" % (
        loop, function.__name__, lo, hi, count,
        (end-start), ((end-start)*1000)/loop)

if __name__=='__main__':
    timetest(randomdict, 1, 1000, 5, 1000)
    timetest(randomlist, 1, 1000, 5, 1000)
    timetest(randomdict, 1, 10000, 1000, 100)
    timetest(randomlist, 1, 10000, 1000, 100)
----- end of timing.py ---
1000 calls to randomdict(1,1000,5) took 0.26 seconds, 0.26mS/call
1000 calls to randomlist(1,1000,5) took 0.28 seconds, 0.28mS/call
100 calls to randomdict(1,10000,1000) took 5.21 seconds, 52.07mS/call
100 calls to randomlist(1,10000,1000) took 27.47 seconds, 274.70mS/call

So it looks as though even when the list is quite small, the dictionary 
version still wins (just), but as the list grows the difference becomes 
much more significant.

-- 
Duncan Booth                                             duncan at rcp.co.uk
int month(char *p){return(124864/((p[0]+p[1]-p[2]&0x1f)+1)%12)["\5\x8\3"
"\6\7\xb\1\x9\xa\2\0\4"];} // Who said my code was obscure?