[Tutor] Efficiency and speed

James Reynolds eire1130 at gmail.com
Fri Mar 19 19:47:45 CET 2010


Well, I'm always out to impress!

This is a Monte Carlo simulation.

The simulation measures the expiration of something, and those somethings
fall into bins that are not evenly spaced. These bins are stored in the
nx list mentioned previously.

So let's say you have the bins a, b, c, d, e, f, and you have a value z
from the sample list where b < z <= a. In that case, the lookup should
return the index of position a.

These, in turn, are used to take slices of other lists and only apply those
slices from position zero to position (a), in the case above.
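
To make that lookup concrete (with made-up bin edges standing in for the real nx1 values), the rule "b < z <= a returns the index of a" can be written as a straightforward scan over descending edges:

```python
# Hypothetical descending bin edges standing in for nx1 (a, b, c, d, e, f)
bins = [100.0, 80.0, 60.0, 40.0, 20.0, 0.0]

def bin_index(bins, z):
    """Return i such that bins[i+1] < z <= bins[i], for descending bins."""
    for i, edge in enumerate(bins):
        if z > edge:
            return i - 1
    return len(bins) - 1  # z fell at or below the smallest edge

print(bin_index(bins, 90.0))  # 80 < 90 <= 100, so index 0 (the position of a)
```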

I previously tried something like: if z > b and z <= a, then append to a new
list, else pass. But that seemed like a slower way of going about it, because
Python needs to cycle through the entire list to find the right value. I
guess I could track a count, like len(appendedlist), and use a while loop so
it stops checking once the value has been appended and moves on to the next
iteration of the sample list.

But before you recommend a better way of writing it, I would greatly
appreciate knowing why the above is inefficient and why some other way is
more efficient. I'm trying to learn the concepts behind programming in
general and how to program in Python at the same time.
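
For comparison, a standard technique for this kind of bin lookup is a binary search over the sorted edges (Python's bisect module), which costs O(log n) per value instead of scanning all n edges. A minimal sketch, again with made-up descending edges, since bisect itself requires ascending order:

```python
import bisect

bins = [100.0, 80.0, 60.0, 40.0, 20.0, 0.0]  # hypothetical, descending like nx1
asc = bins[::-1]                              # bisect needs ascending order

def bin_index_fast(z):
    # Insertion point in the ascending copy, mapped back to the descending index
    j = bisect.bisect_left(asc, z)
    return len(bins) - 1 - j

print(bin_index_fast(90.0))  # prints 0: 80 < 90 <= 100
```

Because bins stays sorted, no per-sample append/sort/remove is needed at all.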

On Fri, Mar 19, 2010 at 1:56 PM, Stefan Behnel <stefan_ml at behnel.de> wrote:

> James Reynolds, 19.03.2010 17:41:
>
>> I've still been working towards learning the language, albeit slowly, and
>> I've been working on a project that is somewhat intense on the numerical
>> calculation end of things.
>>
>> Running 10,000 trials takes about 1.5 seconds and running 100,000 trials
>> takes 11 seconds. Running a million trials takes about 81 seconds or so. I
>> don't think 1M trials is needed for accuracy, but 100K probably is.
>>
>> I've made a few other optimizations today that I won't be able to test
>> until
>> I get home, but I was wondering if any of you could give some general
>> pointers on how to make python run a little more quickly.
>>
>> There are two major pieces of code that seem slow, so I'll share the first
>> with you. This section takes up about 1/3 of the time used when running
>> all
>> trials, where trials is 10K or larger.
>>
>> This section doesn't actually do any math, all it's doing is returning the
>> "bin" that the randomly generated number falls in.
>>
>> The second section, the one that is taking up most of the time, does the
>> math.
>>
>> The list nx1 is about 220 floating point numbers long.
>>
>>
>>
>> sample = random.sample(range(int(self.nx1[b])), trials) # a list of
>> sample values based on nx1
>>
>> countlist = []
>>
>> self.nx1.append(0) # puts a zero on the end of nx1. This is for the case
>> where one of the random values lies past the minimum number (it scales
>> from 10M down to almost 0, but not exactly 0). Anything past this date
>> yields a 0.
>>
>> for s in self.mcrange_gen(sample):
>>     countlist.append(s - 1) # This appends the bin number (the number
>> returned previously minus one) to make slices of the premium streams.
>>
>> and here is the generator section:
>>
>> def mcrange_gen(self, sample): #, lensample):
>>
>>     lensample = len(sample) # this section is just for speed. All of
>> these are renames from the globals to bring calc times down at the
>> expense of memory. I haven't tested these yet.
>>
>>     nx2 = self.nx1
>>     nx2_append = nx2.append
>>     nx2_sort = nx2.sort
>>     nx2_reverse = nx2.reverse
>>     nx2_index = nx2.index
>>     nx2_remove = nx2.remove
>>
>>     for s in range(lensample):
>>         q = sample[s] # takes the next randomly generated number from
>> the sample list
>>         nx2_append(q) # and appends it to the nx list
>>         nx2_sort() # and sorts it in place
>>         nx2_reverse() # reverses the list, because this was the
>> original order
>>         i = nx2_index(q) # get the index of that element
>>         nx2_remove(q) # and remove the element
>>         yield i # send the position of that element back to the main
>> program
>>
> This certainly falls into the bin of the most inefficient algorithms I've
> ever seen. Even walking through the samples one by one to find the target
> bin would be faster than the above.
>
> Could you try to describe in a couple of words what this algorithm is
> supposed to do? That will almost certainly make it clear how you should
> write it instead.
>
> Stefan
>
> _______________________________________________
> Tutor maillist  -  Tutor at python.org
> To unsubscribe or change subscription options:
> http://mail.python.org/mailman/listinfo/tutor
>