Smarter way of doing this?

Max M maxm at mxm.dk
Thu Feb 5 05:54:10 EST 2004


Anton Vredegoor wrote:

> Max M <maxm at mxm.dk> wrote:
> 
> 
>>I solved it by using accumulated probabilities instead. So the final 
>>version is here, if anybody cares.
> 
> Yes it's fast, but I don't think it's smart :-) Unfortunately I
> haven't got a working solution as an alternative, but only some code
> that "explains" what I mean:
> 
> 
>     def choice_generator(population,probabilities):
>         PC = zip(probabilities,population)
>         while 1:
>             p,c = max([(p*random(),c) for p,c in PC])
>             yield c
> 
> This is short and fast. However, the distribution of the outcomes is
> wrong, because the list of probabilities should be "adjusted" so that
> in the end the *outcomes* are distributed according to the
> "probabilities". Or should that be proportions?


I don't understand what you mean. If I calculate the deviations from 
what is expected, and use a large result set, I get very small deviations.
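
For comparison, the quoted max(p*random()) approach can be checked empirically. This is only a minimal sketch in Python 3 syntax; the names max_scaled_choice and observed_share are made up for the illustration. With weights 2 and 1 the first element should win two times out of three, but for two independent uniform numbers U1 and U2, P(2*U1 > U2) is 3/4, so the scaled-maximum pick favours the heavier element too much.

```python
import random

def max_scaled_choice(population, weights, rng):
    """Pick the element whose weight * uniform random number is largest."""
    return max((w * rng.random(), c) for w, c in zip(weights, population))[1]

def observed_share(population, weights, target, n, seed=0):
    """Fraction of n draws that returned `target`."""
    rng = random.Random(seed)
    hits = sum(max_scaled_choice(population, weights, rng) == target
               for _ in range(n))
    return hits / n

# Weights 2:1 ask for a share of 2/3 for 'a'; the observed share is near 3/4.
share_a = observed_share(['a', 'b'], [2, 1], 'a', 200000)
print(share_a)
```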

I am interested in getting a result in proportion to the probability, 
or more correctly in this case, the frequency. If I want probabilities 
instead, I just have to normalise the sum to 1.0.
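
A minimal sketch of that normalisation step, in Python 3 syntax. The function name normalise is chosen for this example only, and the weights are the ones from the test further down.

```python
def normalise(frequencies):
    """Scale non-negative frequencies so that they sum to 1.0."""
    total = float(sum(frequencies))
    if total <= 0:
        raise ValueError("frequencies must have a positive sum")
    return [f / total for f in frequencies]

# Integer frequencies become probabilities; 16 maps to 16/31.
probs = normalise([16, 8, 4, 2, 1])
print(probs)
```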

This has the advantage that the frequencies can be expressed as integers 
too. This is nice in my Markov chain class that counts words in text, etc.
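
To illustrate the accumulated-frequency idea with integer counts, here is a minimal sketch in Python 3 syntax. It is not the Selector class used in the test below; the class name CumulativeSelector and its pick method are invented for this example, and it assumes the standard bisect and itertools.accumulate are available.

```python
import bisect
import itertools
import random

class CumulativeSelector:
    """Hypothetical sketch: weighted choice via accumulated integer frequencies."""

    def __init__(self, frequencies, elements, seed=None):
        if len(frequencies) != len(elements):
            raise ValueError("need one frequency per element")
        self.elements = list(elements)
        # Running totals, e.g. [16, 8, 4, 2, 1] -> [16, 24, 28, 30, 31]
        self.cumulative = list(itertools.accumulate(frequencies))
        self.total = self.cumulative[-1]
        self.rng = random.Random(seed)

    def pick(self):
        # Draw an integer in [0, total) and find the first running total above it.
        r = self.rng.randrange(self.total)
        return self.elements[bisect.bisect_right(self.cumulative, r)]

selector = CumulativeSelector([16, 8, 4, 2, 1], ['a', 'b', 'c', 'd', 'e'], seed=1)
draws = [selector.pick() for _ in range(100000)]
print(draws.count('a') / len(draws))  # should land near 16/31
```

Because the draw is an integer below the total, no float rounding is involved and the frequencies never need to be normalised first.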

In my example below each letter should occur half as often as the 
previous letter.

Perhaps you mean that it should behave differently?

regards Max M



###################

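# Selector is the accumulated-probability class from my earlier post.
probabilities = [16, 8, 4, 2, 1]   # relative frequencies, each half the previous
elements = ['a', 'b', 'c', 'd', 'e']
sample_size = 1000000

s = Selector(probabilities, elements)
r = s.get_range(sample_size)
r.sort()

# Compare each count with half of the previous count, as a percentage.
# For 'a' the baseline is sample_size itself.
previous = float(sample_size)
for element in elements:
    count = r.count(element)
    deviation = (previous/2.0-count) / count * 100
    previous = count
    print element, count, deviation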


 >> a 517046 -3.29680531326
 >> b 257439 0.421070622555
 >> c 129159 -0.340278261677
 >> d 64148 0.672663216312
 >> e 32208 -0.416045702931
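
Note that the first row is measured against sample_size/2 = 500000, whereas weights of 16, 8, 4, 2, 1 give 'a' an expected share of 16/31, or about 516129 draws, so most of the -3.3% there comes from the baseline rather than from the selector. As a cross-check, here is a minimal sketch (Python 3 syntax; the helper name percent_deviation is invented here) that compares the counts printed above with the counts expected from the weights:

```python
def percent_deviation(weights, counts):
    """Percent difference of each observed count from its expected count."""
    total_weight = float(sum(weights))
    sample_size = sum(counts)
    result = []
    for w, count in zip(weights, counts):
        expected = sample_size * w / total_weight
        result.append((count - expected) / expected * 100)
    return result

weights = [16, 8, 4, 2, 1]
counts = [517046, 257439, 129159, 64148, 32208]  # counts from the run above
deviations = percent_deviation(weights, counts)
for element, dev in zip('abcde', deviations):
    print(element, round(dev, 3))
```

Measured this way, every letter in the run above is within one percent of its expected count.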


