[Pythonmac-SIG] Fast way to get a list of unique values?
Robb Brown
brownr@ucalgary.ca
Tue, 10 Sep 2002 11:29:49 -0600
Thanks for all the replies!
Much as I'm tempted to use your first method.... :)
The array is a volume mask for medical data. The problem is that
there may be several masks in the same file (several different
segmented tissues). If I have a list of the unique values, I know
how many tissues are present and can divide them into separate
objects.
Because of the size of these volumes I suspect it's time to break out
the C compiler.
>Quoting Robb Brown <brownr@ucalgary.ca>:
>>
>> I have a large array (Numeric) of integers. I'd like to get a list
>> of all the unique values in that array. Naturally I'd like to avoid
>> a for loop.
>
>Here's one way. It does avoid using a for loop.
>I'm guessing that it isn't quite exactly what you wanted.
>Fun exercise, though. See below for a more serious suggestion.
>(This is *Python*, after all...)
>
>#!/usr/bin/env python
>#
># Robb has a "large array (Numeric) of integers. I'd like to get
># a list of all the unique values in that array. Naturally I'd like
># to avoid a for loop. Is there an easy way to do this in Python?
>#
># No mention of speed ...
>#
>
>from operator import __mul__
>
># --- from the cookbook, w/ slight mods --
># Sieve of Eratosthenes
># David Eppstein, UC Irvine, 28 Feb 2002
># From the ASPN Python cookbook ...
># http://aspn.activestate.com/ASPN/Cookbook/Python/Recipe/117119
>#
>from __future__ import generators
>
>def eratosthenes ():
> '''Yields the sequence of prime numbers via the Sieve of Eratosthenes.'''
> D = {} # map composite integers to primes witnessing their compositeness
> q = 2 # first integer to test for primality
> while 1:
> if q not in D:
> yield q # not marked composite, must be prime
> D[q*q] = [q] # first multiple of q not already marked
> else:
> map( lambda p: D.setdefault(p+q,[]).append(p), D[q] )
> del D[q] # no longer need D[q], free memory
> q += 1
>
>primes_iter = eratosthenes()
>primes_list = [ ]
>prim_d = { }
>
>def primes ( n ):
> while len(primes_list) <= n:
> nx = primes_iter.next()
> primes_list.append( nx )
> prim_d[ nx ] = len(primes_list)-1
> return primes_list[n]
>
>def uniq ( arr ):
> # map the array into a goedel number of sorts
> gm = min( arr )
> gx = max( arr )
>
> # offset the list to n in [ 0 .. max-min+1 ]
> # and then find the nth prime
> bias = map( lambda x: primes(x-gm), arr )
>
> # blast all those primes together
> goedel = reduce( __mul__, bias )
>
> # find the ones that factor into goedel
> fax = filter( lambda n: goedel%n == 0, primes_list )
>
> # print len(arr), arr
> # print bias
> # print goedel
> # print len(fax), fax
> return map( lambda z: prim_d[z] + gm, fax )
>
>if __name__ == "__main__":
> def test_e ():
> iter = eratosthenes()
> lst = []
> while 1:
> ff = iter.next()
> lst.append( ff )
> print ff
> if ff > 1000:
> break
> print lst[ -12: ]
>
> def test_p ():
> print primes( 12 )
>
> def test ():
> arr = [ -9, -2, -4, 0, 19, 2, 42, -9, -2, -4, 0, 19, 2, 421, 32, 42 ]
> print uniq( arr )
>
> test_e()
> test_p()
>
> test()
>
>
>> Is there an easy way to do this in Python?
>
>You do not tell us where you acquired that "large array".
>My suggestion, given the lack of any other detail, would be to see if
>it's possible to create that array as a dict instead (with the numbers
>as keys, as Bob mentioned). I.e., do the unique-ifying while building
>the array, instead of treating it as a separate problem.
>
>
>
>
>-=-=-=-
>
>food. shelter. clothing. net. Got.net - The Internet Connection, Inc
--
______________________________
Robb Brown
Seaman Family MR Research Centre
Calgary, Alberta, Canada