Help with an 8th grade science project

Thu Nov 20 15:13:07 EST 2014

dave em <daveandem2000 at gmail.com> Wrote in message:
> Hello,
> 
> I am the adult advisor (aka father) to an 8th grader who is doing a science project that will compare / benchmark CPU performance between an AMD 64 Phenom II running Ubuntu 14.04 and a Raspberry Pi 700MHz ARM.
> 
> Basic procedure:
> -  Run a Python script on both computers that stresses the CPU and measure
> --  Time to complete the calculation
> -- Max CPU during the calculation
> -- We have chosen to do factorials and compare performance by running calculations by order of magnitude.  Our hypothesis is that we will begin to see a wider performance gap between the two computers as the factorials increase in order of magnitude.
> 
> Status:
> -  We have a working program.  Pseudo code follows:
> 
> import linux_metrics
> from linux_metrics import cpu_stat
> import time
> 
> print 'Welcome to the stress test'
> number = raw_input("Enter the number to compute the factorial:")
> 
> ## function to calculate CPU usage percentage
> def CPU_Percent():
>     cpu_pcts = cpu_stat.cpu_percents(.25)
>     print 'cpu utilization: %.2f%%' % (100 - cpu_pcts['idle'])
>     write cpu utilization to a csv file with g.write     
> 
> ## function to compute factorial of a given number
> def factorial(n):
>     num = 1
>     while n >= 1:
>         num = num * n
>         CPU_Percent()  ****This is the function call irt Q 1 below ****
>         n = n - 1
>     return num
> 
> # Main program
> Record start time by using time.time()
> Call function to compute the factorial.
> Record finish time by using time.time()
> write time to compute to a file f.write(totalEndTime - totalStartTime)
> print ("Execution time = ", totalEndTime - totalStartTime)
> 
> 
> Questions:
> 1.  In the factorial() function we call the CPU_Percent() function and write the CPU utilization value to a file.
> -  Is this a correct value or will the CPU utilization be lower because the factorial() function made its calculation and is now just checking the CPU utilization?

I'm not familiar with that package; I just took a quick look at
 PyPI. So I'd have to guess. But since your timing is so huge, I'd
 guess that you're measuring utilization during a time period when
 your factorial calculation is paused. In other words, you're
 measuring CPU utilization for the other processes in your
 system.

Probably someone else will correct me, but I'd guess you need to
 measure utilization with a separate process.
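
A minimal sketch of that idea, assuming Linux and reading /proc/stat
directly rather than going through linux_metrics (whose internals I
have not checked). The helper names (parse_cpu_line, busy_percent,
sampler, run_with_sampler) are made up for illustration. The child
process takes the samples, so the factorial loop itself never has to
stop and wait.

```python
# Sketch only: sample whole-system CPU utilization from a child process
# while the real work runs in the parent. Linux-only (reads /proc/stat).
import multiprocessing
import time


def parse_cpu_line(line):
    """Return (busy, total) tick counts from the aggregate 'cpu' line."""
    fields = [int(x) for x in line.split()[1:]]
    # Fields: user nice system idle iowait irq softirq steal [guest ...]
    idle = fields[3] + (fields[4] if len(fields) > 4 else 0)
    total = sum(fields[:8])  # guest time is already counted in user/nice
    return total - idle, total


def busy_percent(before, after):
    """Percent of CPU time spent busy between two (busy, total) snapshots."""
    d_busy = after[0] - before[0]
    d_total = after[1] - before[1]
    return 100.0 * d_busy / d_total if d_total else 0.0


def read_snapshot():
    with open('/proc/stat') as f:
        return parse_cpu_line(f.readline())


def sampler(stop_event, samples, interval):
    """Runs in the child: append one utilization sample per interval."""
    prev = read_snapshot()
    while not stop_event.is_set():
        time.sleep(interval)
        cur = read_snapshot()
        samples.append(busy_percent(prev, cur))
        prev = cur


def run_with_sampler(work, interval=0.25):
    """Call work() while the child samples; return (result, samples)."""
    manager = multiprocessing.Manager()
    samples = manager.list()
    stop = multiprocessing.Event()
    child = multiprocessing.Process(target=sampler,
                                    args=(stop, samples, interval))
    child.start()
    try:
        result = work()
    finally:
        stop.set()
        child.join()
    return result, list(samples)
```

Something like run_with_sampler(lambda: factorial(50000)) would then
give you the result plus a list of samples, and max() of that list is
the peak. On a multi-core machine a single-threaded script can only
keep one core busy, so the whole-system figure will top out well
below 100%.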

> -  If we are not getting the true max CPU utilization, can someone offer a design change to accomplish this?
> 
> 2.  What unit does time.time() use?  For example, when calculating the factorial of 10 our program gives:
>   Execution time = ', 1.5703258514404297  I presume this is telling us it took 1.57 seconds to complete the calculation?

It does indeed give results in seconds, but that value is
 ridiculous. Calculating factorial of 10 takes about 70
 microseconds on this laptop. And doing it for 10,000 (which
 gives a very large result) takes about a tenth of a second.
 That includes printing it, which takes longer than calculating
 it.

Benchmarking can be extremely tricky,  and I assume you're not
 permitted to use the timeit module. But at the very
 least:

Measure an empty loop and compare it to the real loop. If the two
 take about the same time, then you're mostly measuring loop overhead.
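
Roughly like this. It is only a sketch; the function names and the
value of N are arbitrary choices of mine, and it is written to run on
either Python 2 or 3.

```python
import time


def time_empty_loop(n):
    """Time a loop that does nothing, to estimate pure loop overhead."""
    start = time.time()
    for i in range(1, n + 1):
        pass
    return time.time() - start


def time_factorial_loop(n):
    """Time the same loop with the multiplication included."""
    start = time.time()
    num = 1
    for i in range(1, n + 1):
        num = num * i
    return time.time() - start


N = 2000  # arbitrary; pick something large enough to measure
empty = time_empty_loop(N)
real = time_factorial_loop(N)
print("empty loop: %.6f s, factorial loop: %.6f s" % (empty, real))
```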

Watch out for doing i/o during the timed part of the test; you may
 be mostly measuring console time or file time, and not your
 algorithm. Do your i/o after the ending call to time.time.
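
For instance, something along these lines keeps the printing and file
writing out of the timed region. This is a sketch; timed_factorial is
a name I made up, and the loop is a plain rewrite of the one in your
post without the CPU_Percent() call.

```python
import time


def factorial(n):
    """Plain iterative factorial with no I/O inside the loop."""
    num = 1
    for i in range(2, n + 1):
        num = num * i
    return num


def timed_factorial(n):
    """Return (value, elapsed_seconds); the clock stops before any I/O."""
    start = time.time()
    value = factorial(n)
    elapsed = time.time() - start
    return value, elapsed


value, elapsed = timed_factorial(1000)
# Printing (or writing to a CSV file) happens only after timing is done.
print("Execution time = %.6f seconds" % elapsed)
```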

If you get times in the microsecond or millisecond range,  put the
 whole mess in a loop so you can do a sanity check with your wrist
 watch. 
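
A sketch of what I mean, with a made-up helper name (time_repeated)
and an arbitrary repeat count; raise REPEATS until the total is long
enough to check by hand.

```python
import math
import time


def time_repeated(func, arg, repeats):
    """Call func(arg) `repeats` times; return (total_seconds, per_call)."""
    start = time.time()
    for _ in range(repeats):
        func(arg)
    total = time.time() - start
    return total, total / repeats


REPEATS = 100000  # arbitrary; increase until the total is several seconds
total, per_call = time_repeated(math.factorial, 10, REPEATS)
print("total: %.3f s, per call: %.3g s" % (total, per_call))
```

The per-call figure still includes the overhead of the outer loop and
the function call, so treat it as an upper bound.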

Check each system to make sure time.time works well. Read the
 docs, but do your own tests. Some systems only give you integer
 seconds. 
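
One simple test of your own, as a sketch (smallest_tick is a name I
invented): spin until time.time() returns a different value and see
how big the step is. Run it on both machines.

```python
import time


def smallest_tick(samples=20):
    """Return the smallest positive step seen between time.time() readings."""
    best = None
    for _ in range(samples):
        t0 = time.time()
        t1 = time.time()
        while t1 <= t0:  # spin until the clock visibly advances
            t1 = time.time()
        step = t1 - t0
        if best is None or step < best:
            best = step
    return best


print("smallest observed step: %.9f seconds" % smallest_tick())
```

If that prints something near 1.0, the clock on that system is too
coarse for timing short calculations.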

If you're stuck with overhead that'll affect your results,  do
 some measurements to see how to minimize it. I'd guess that range
 (or xrange, since you're apparently using Python 2.x) will be
 faster than while with increment. 
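
That guess is easy to test. A sketch with both loop styles, using
names I made up; it uses range so it runs on Python 2 or 3, and on
2.x you could swap in xrange.

```python
import time


def factorial_while(n):
    """The while-loop form from the original post, minus the CPU call."""
    num = 1
    while n >= 1:
        num = num * n
        n = n - 1
    return num


def factorial_for(n):
    """The same product using a for loop over range."""
    num = 1
    for i in range(2, n + 1):
        num = num * i
    return num


def elapsed(func, n):
    """Seconds taken by a single call to func(n)."""
    start = time.time()
    func(n)
    return time.time() - start


N = 2000  # arbitrary test size
print("while: %.6f s" % elapsed(factorial_while, N))
print("for:   %.6f s" % elapsed(factorial_for, N))
```

Note that the two loops multiply in opposite order (n down to 1
versus 2 up to n), which can also affect timing with very large
numbers, so the comparison is not purely about loop overhead.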

If you're comparing two entirely different processors, make sure
 you're using exactly the same version of Python. 2.7.5 on one
 system probably should not be compared with 2.6.2, or even with
 2.7.4.
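
An easy way to keep track of that is to have the script record what
it is running on. A sketch (describe_interpreter is my own name for
it); the output could go in the same results file as the timings.

```python
import platform


def describe_interpreter():
    """Return a one-line description of the Python build and CPU type."""
    return "%s %s on %s" % (platform.python_implementation(),
                            platform.python_version(),
                            platform.machine())


print(describe_interpreter())
```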

Don't forget the effects of other processes, and of disk caching.
 You can probably minimize them by a fresh boot, and by flushing.

Watch out for memory usage.  You can calculate the factorial of
 one hundred thousand in a few seconds.  But it's some 450
 thousand digits long, and takes quite a bit of memory.
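
You can see the size for yourself with something like the following
sketch. The function name is made up, the digit count is an estimate
computed from math.lgamma rather than by converting the number to a
string, and sys.getsizeof reports what CPython uses for the integer
object, so other interpreters may differ.

```python
import math
import sys


def factorial_footprint(n):
    """Return (approx_decimal_digits, bytes_used) for n factorial."""
    value = math.factorial(n)
    # lgamma(n + 1) is ln(n!), so dividing by ln(10) gives log10(n!).
    digits = int(math.lgamma(n + 1) / math.log(10)) + 1
    return digits, sys.getsizeof(value)


digits, size = factorial_footprint(100000)
print("about %d digits, %d bytes" % (digits, size))
```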

The math module has a factorial function in it. You could use it
 to double check your results and your timings.
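
A sketch of such a check, with helper names of my own choosing; the
loop is a plain version of yours with no I/O in it.

```python
import math
import time


def factorial(n):
    """Hand-written factorial, to be checked against math.factorial."""
    num = 1
    for i in range(2, n + 1):
        num = num * i
    return num


def matches_math(values):
    """True if the hand-written version agrees with math.factorial."""
    return all(factorial(n) == math.factorial(n) for n in values)


def elapsed(func, n):
    """Seconds taken by a single call to func(n)."""
    start = time.time()
    func(n)
    return time.time() - start


print("results agree: %s" % matches_math([0, 1, 10, 500]))
print("ours: %.6f s" % elapsed(factorial, 2000))
print("math: %.6f s" % elapsed(math.factorial, 2000))
```

math.factorial is implemented in C, so expect it to be faster than
the Python loop; a big gap there is normal and not a sign that your
code is wrong.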

-- 
DaveA