Interesting timing issue I noticed

Daniel Fetchinson fetchinson at googlemail.com
Tue Apr 15 23:26:19 EDT 2008


> Can I simply ignore the timing data, then? I obviously do see better
> performance the smaller the box is, but I guess my issue is how seriously to
> take all this data, because I can't claim a "performance improvement" if there
> isn't really much of an improvement.
>
> On Tue, Apr 15, 2008 at 11:04 PM, Daniel Fetchinson <
> fetchinson at googlemail.com> wrote:
>
> > > I've written up a stripped-down version of the code. I apologize for
> > > the bad coding; I am in a bit of a hurry.
> > >
> > > import random
> > > import sys
> > > import time
> > >
> > > sizeX = 320
> > > sizeY = 240
> > > borderX = 20
> > > borderY = 20
> > >
> > > # generates a zero matrix
> > > def generate_zero():
> > >     matrix = [[0 for y in range(sizeY)] for x in range(sizeX)]
> > >     return matrix
> > > # fills zero matrix
> > > def fill_matrix(in_mat):
> > >     mat = in_mat
> > >     for x in range(sizeX):
> > >         for y in range(sizeY):
> > >             mat[x][y] = random.randint(1, 100)
> > >     return mat
> > > ######################################################################
> > > # COMPUTES ONLY A PART OF THE ARRAY
> > > def back_diff_one(back_array, fore_array, box):
> > >     diff_array = generate_zero()
> > >
> > >     start = time.time()
> > >     for x in range(sizeX):
> > >         for y in range(borderY):
> > >             diff_array[x][y] = back_array[x][y] - fore_array[x][y]
> > >         for y in range((sizeY - borderY), sizeY):
> > >             diff_array[x][y] = back_array[x][y] - fore_array[x][y]
> > >     for y in range(borderY, (sizeY - borderY)):
> > >         for x in range(borderX):
> > >             diff_array[x][y] = back_array[x][y] - fore_array[x][y]
> > >         for x in range((sizeX - borderX), sizeX):
> > >             diff_array[x][y] = back_array[x][y] - fore_array[x][y]
> > >
> > >     # tracks object
> > >     if (len(box) != 0):
> > >         for x in range(box[0], box[2]):
> > >             for y in range(box[1], box[3]):
> > >                 diff_array[x][y] = back_array[x][y] - fore_array[x][y]
> > >     print "time one inside = " + str(time.time() - start)
> > >     return diff_array
> > > ######################################################################
> > > # COMPUTES EVERY ELEMENT IN THE ARRAY
> > > def back_diff_two(back_array, fore_array):
> > >     diff_array = generate_zero()
> > >     start = time.time()
> > >     for y in range(sizeY):
> > >         for x in range(sizeX):
> > >             diff_array[x][y] = back_array[x][y] - fore_array[x][y]
> > >     end = time.time()
> > >     print "time two inside = " + str(end - start)
> > >     return diff_array
> > > ######################################################################
> > > # CODE TO TEST BOTH FUNCTIONS
> > > back = fill_matrix(generate_zero())
> > > fore = fill_matrix(generate_zero())
> > > box = [20, 20, 268, 240]
> > > start1 = time.time()
> > > diff1 = back_diff_one(back, fore, box)
> > > print "time one outside = " + str(time.time() - start1)
> > > start2 = time.time()
> > > diff2 = back_diff_two(back, fore)
> > > print "time two outside = " + str(time.time() - start2)
> > >
> > > Here are some results from several test runs:
> > >
> > > time one inside = 0.0780000686646
> > > time one outside = 0.125
> > > time two inside = 0.0780000686646
> > > time two outside = 0.141000032425
> > > >>> ================================ RESTART ================================
> > > >>>
> > > time one inside = 0.0629999637604
> > > time one outside = 0.125
> > > time two inside = 0.0789999961853
> > > time two outside = 0.125
> > > >>> ================================ RESTART ================================
> > > >>>
> > > time one inside = 0.0620000362396
> > > time one outside = 0.139999866486
> > > time two inside = 0.0780000686646
> > > time two outside = 0.125
> > > >>> ================================ RESTART ================================
> > > >>>
> > > time one inside = 0.0780000686646
> > > time one outside = 0.172000169754
> > > time two inside = 0.0789999961853
> > > time two outside = 0.125
> > > >>> ================================ RESTART ================================
> > > >>>
> > > time one inside = 0.0780000686646
> > > time one outside = 0.125
> > > time two inside = 0.0780000686646
> > > time two outside = 0.125
> > > >>> ================================ RESTART ================================
> > > >>>
> > > time one inside = 0.0620000362396
> > > time one outside = 0.155999898911
> > > time two inside = 0.0780000686646
> > > time two outside = 0.125
> > > >>> ================================ RESTART ================================
> > > >>>
> > > time one inside = 0.077999830246
> > > time one outside = 0.125
> > > time two inside = 0.077999830246
> > > time two outside = 0.125
> > > >>> ================================ RESTART ================================
> > > >>>
> > > time one inside = 0.0780000686646
> > > time one outside = 0.171000003815
> > > time two inside = 0.077999830246
> > > time two outside = 0.125
> > > >>> ================================ RESTART ================================
> > > >>>
> > > time one inside = 0.0629999637604
> > > time one outside = 0.18799996376
> > > time two inside = 0.0620000362396
> > > time two outside = 0.125
> > >
> > > Why is it that, a large percentage of the time, the execution time for the
> > > (ostensibly smaller) first loop is actually equal to or LARGER than that
> > > of the second?
> >
> >
> > First of all, your method of timing is not the best. Use the timeit
> > module instead: http://docs.python.org/lib/module-timeit.html
> >
> > Second of all, the number of subtractions is not that different between
> > the two variants of your function. back_diff_one does 75360
> > subtractions per call while back_diff_two does 76800; these two
> > numbers are almost the same. It's true that back_diff_one first
> > calculates only a part of the arrays, but after "# tracks object" you do a
> > bunch more subtractions that make up the rest of the total count.
> >
> > HTH,
> > Daniel
> >

Please keep the discussion on the list.

Yes, if I were you I would discard your original timing data and redo
it using the timeit module. Whatever that gives should be reliable and
you can start from there. In any case your two functions are doing
roughly the same number of operations.
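
For what it's worth, here is a minimal sketch of what I mean. It is not a
drop-in replacement for your script: the helper names (make_matrix,
count_partial, count_full, diff_partial, diff_full, best_time) and the
repeat/number settings are my own choices, and I use print() so that it also
runs on newer Pythons. Unlike your "inside" timings, each measurement here
includes allocating the result array.

```python
import random
import timeit

SIZE_X, SIZE_Y = 320, 240
BORDER_X, BORDER_Y = 20, 20
BOX = (20, 20, 268, 240)  # the box from the original post


def make_matrix():
    """Return a SIZE_X by SIZE_Y list of lists of random integers."""
    return [[random.randint(1, 100) for _ in range(SIZE_Y)]
            for _ in range(SIZE_X)]


def count_partial(box):
    """Subtractions done by the border-plus-box variant.

    Cells covered by both a border strip and the box are counted twice,
    because the loops really do subtract them twice.
    """
    top_bottom = SIZE_X * 2 * BORDER_Y
    left_right = (SIZE_Y - 2 * BORDER_Y) * 2 * BORDER_X
    tracked = (box[2] - box[0]) * (box[3] - box[1])
    return top_bottom + left_right + tracked


def count_full():
    """Subtractions done by the variant that visits every element."""
    return SIZE_X * SIZE_Y


def diff_partial(back, fore, box):
    """Same loops as back_diff_one, minus the print statements."""
    diff = [[0] * SIZE_Y for _ in range(SIZE_X)]
    for x in range(SIZE_X):
        for y in range(BORDER_Y):
            diff[x][y] = back[x][y] - fore[x][y]
        for y in range(SIZE_Y - BORDER_Y, SIZE_Y):
            diff[x][y] = back[x][y] - fore[x][y]
    for y in range(BORDER_Y, SIZE_Y - BORDER_Y):
        for x in range(BORDER_X):
            diff[x][y] = back[x][y] - fore[x][y]
        for x in range(SIZE_X - BORDER_X, SIZE_X):
            diff[x][y] = back[x][y] - fore[x][y]
    for x in range(box[0], box[2]):
        for y in range(box[1], box[3]):
            diff[x][y] = back[x][y] - fore[x][y]
    return diff


def diff_full(back, fore):
    """Same loops as back_diff_two, minus the print statements."""
    diff = [[0] * SIZE_Y for _ in range(SIZE_X)]
    for y in range(SIZE_Y):
        for x in range(SIZE_X):
            diff[x][y] = back[x][y] - fore[x][y]
    return diff


def best_time(func, *args):
    """Best per-call time in seconds over several timeit runs."""
    number = 5  # calls per measurement
    timer = timeit.Timer(lambda: func(*args))
    return min(timer.repeat(repeat=3, number=number)) / number


back, fore = make_matrix(), make_matrix()
print("partial: %d subtractions, %.4f s per call"
      % (count_partial(BOX), best_time(diff_partial, back, fore, BOX)))
print("full:    %d subtractions, %.4f s per call"
      % (count_full(), best_time(diff_full, back, fore)))
```

The two counts come out as 75360 and 76800, so I would not expect timeit to
show a large gap for this particular box. Taking the minimum over several
repeats is the usual way to reduce noise from whatever else the machine is
doing; try a smaller box to see whether the difference becomes measurable.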

HTH,
Daniel


