Interesting timing issue I noticed

Daniel Fetchinson fetchinson at googlemail.com
Tue Apr 15 23:31:52 EDT 2008


On 4/15/08, Daniel Fetchinson <fetchinson at googlemail.com> wrote:
> > Can I simply ignore the timing data, then? I do see better performance
> > the smaller the box is, but my issue is how seriously to take all this
> > data, because I can't claim a "performance improvement" if there isn't
> > really much of an improvement.
> >
> > On Tue, Apr 15, 2008 at 11:04 PM, Daniel Fetchinson <
> > fetchinson at googlemail.com> wrote:
> >
> > > > I've written up a stripped down version of the code. I apologize for
> > > > the bad coding; I am in a bit of a hurry.
> > > >
> > > > import random
> > > > import sys
> > > > import time
> > > >
> > > > sizeX = 320
> > > > sizeY = 240
> > > > borderX = 20
> > > > borderY = 20
> > > >
> > > > # generates a zero matrix
> > > > def generate_zero():
> > > >     matrix = [[0 for y in range(sizeY)] for x in range(sizeX)]
> > > >     return matrix
> > > > # fills zero matrix
> > > > def fill_matrix(in_mat):
> > > >     mat = in_mat
> > > >     for x in range(sizeX):
> > > >         for y in range(sizeY):
> > > >             mat[x][y] = random.randint(1, 100)
> > > >     return mat
> > > > ######################################################################
> > > > # COMPUTES ONLY A PART OF THE ARRAY
> > > > def back_diff_one(back_array, fore_array, box):
> > > >     diff_array = generate_zero()
> > > >
> > > >     start = time.time()
> > > >     for x in range(sizeX):
> > > >         for y in range(borderY):
> > > >             diff_array[x][y] = back_array[x][y] - fore_array[x][y]
> > > >         for y in range((sizeY - borderY), sizeY):
> > > >             diff_array[x][y] = back_array[x][y] - fore_array[x][y]
> > > >     for y in range(borderY, (sizeY - borderY)):
> > > >         for x in range(borderX):
> > > >             diff_array[x][y] = back_array[x][y] - fore_array[x][y]
> > > >         for x in range((sizeX - borderX), sizeX):
> > > >             diff_array[x][y] = back_array[x][y] - fore_array[x][y]
> > > >
> > > >     # tracks object
> > > >     if (len(box) != 0):
> > > >         for x in range(box[0], box[2]):
> > > >             for y in range(box[1], box[3]):
> > > >                 diff_array[x][y] = back_array[x][y] - fore_array[x][y]
> > > >     print "time one inside = " + str(time.time() - start)
> > > >     return diff_array
> > > > ######################################################################
> > > > # COMPUTES EVERY ELEMENT IN THE ARRAY
> > > > def back_diff_two(back_array, fore_array):
> > > >     diff_array = generate_zero()
> > > >     start = time.time()
> > > >     for y in range(sizeY):
> > > >         for x in range(sizeX):
> > > >             diff_array[x][y] = back_array[x][y] - fore_array[x][y]
> > > >     end = time.time()
> > > >     print "time two inside = " + str(end - start)
> > > >     return diff_array
> > > > ######################################################################
> > > > # CODE TO TEST BOTH FUNCTIONS
> > > > back = fill_matrix(generate_zero())
> > > > fore = fill_matrix(generate_zero())
> > > > box = [20, 20, 268, 240]
> > > > start1 = time.time()
> > > > diff1 = back_diff_one(back, fore, box)
> > > > print "time one outside = " + str(time.time() - start1)
> > > > start2 = time.time()
> > > > diff2 = back_diff_two(back, fore)
> > > > print "time two outside = " + str(time.time() - start2)
> > > >
> > > > Here are some results from several test runs:
> > > >
> > > > time one inside = 0.0780000686646
> > > > time one outside = 0.125
> > > > time two inside = 0.0780000686646
> > > > time two outside = 0.141000032425
> > > > >>> ================================ RESTART
> > > > ================================
> > > > >>>
> > > > time one inside = 0.0629999637604
> > > > time one outside = 0.125
> > > > time two inside = 0.0789999961853
> > > > time two outside = 0.125
> > > > >>> ================================ RESTART
> > > > ================================
> > > > >>>
> > > > time one inside = 0.0620000362396
> > > > time one outside = 0.139999866486
> > > > time two inside = 0.0780000686646
> > > > time two outside = 0.125
> > > > >>> ================================ RESTART
> > > > ================================
> > > > >>>
> > > > time one inside = 0.0780000686646
> > > > time one outside = 0.172000169754
> > > > time two inside = 0.0789999961853
> > > > time two outside = 0.125
> > > > >>> ================================ RESTART
> > > > ================================
> > > > >>>
> > > > time one inside = 0.0780000686646
> > > > time one outside = 0.125
> > > > time two inside = 0.0780000686646
> > > > time two outside = 0.125
> > > > >>> ================================ RESTART
> > > > ================================
> > > > >>>
> > > > time one inside = 0.0620000362396
> > > > time one outside = 0.155999898911
> > > > time two inside = 0.0780000686646
> > > > time two outside = 0.125
> > > > >>> ================================ RESTART
> > > > ================================
> > > > >>>
> > > > time one inside = 0.077999830246
> > > > time one outside = 0.125
> > > > time two inside = 0.077999830246
> > > > time two outside = 0.125
> > > > >>> ================================ RESTART
> > > > ================================
> > > > >>>
> > > > time one inside = 0.0780000686646
> > > > time one outside = 0.171000003815
> > > > time two inside = 0.077999830246
> > > > time two outside = 0.125
> > > > >>> ================================ RESTART
> > > > ================================
> > > > >>>
> > > > time one inside = 0.0629999637604
> > > > time one outside = 0.18799996376
> > > > time two inside = 0.0620000362396
> > > > time two outside = 0.125
> > > >
> > > > Why is it that, a large percentage of the time, the execution time
> > > > for the (ostensibly smaller) first loop is actually equal to or
> > > > LARGER than that of the second?
> > >
> > >
> > > First of all, your method of timing is not the best. Use the timeit
> > > module instead: http://docs.python.org/lib/module-timeit.html
> > >
> > > Second, the number of subtractions is not that different between
> > > the two variants of your function. back_diff_one does 75360
> > > subtractions per call while back_diff_two does 76800; these two
> > > numbers are almost the same. It's true that back_diff_one first
> > > calculates only a part of the arrays, but after "# tracks object"
> > > you do a bunch more subtractions that bring the total count up to
> > > nearly the same number.
> > >
> > > HTH,
> > > Daniel
> > >
>
> Please keep the discussion on the list.
>
> Yes, if I were you I would discard your original timing data and redo
> it using the timeit module. Whatever that gives should be reliable and
> you can start from there. In any case your two functions are doing
> roughly the same number of operations.
>
> HTH,
> Daniel
>


BTW, using the following



######################################################################
# CODE TO TEST BOTH FUNCTIONS
back = fill_matrix(generate_zero())
fore = fill_matrix(generate_zero())
box = [20, 20, 268, 240]

def test1( ):
    diff1 = back_diff_one(back, fore, box)

def test2( ):
    diff2 = back_diff_two(back, fore)

if __name__=='__main__':
    from timeit import Timer
    t = Timer("test1( )", "from __main__ import test1")
    print t.timeit( 50 )
    t = Timer("test2( )", "from __main__ import test2")
    print t.timeit( 50 )


and removing all your timing code from the two functions gives

1.63772082329
1.82889485359

which is consistent with your expectation that the version that
computes only a part of the image runs faster. But only marginally,
which is again consistent with the fact that the number of operations
is only marginally different.
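
The subtraction counts quoted earlier (75360 vs. 76800) can be checked
with a few lines. This is only a sketch, written in Python 3 syntax
rather than the Python 2 used in the thread; count_one and count_two
are made-up helper names that mirror the loop bounds of back_diff_one
and back_diff_two and count the assignments instead of performing them:

```python
# Illustrative helpers (not part of the original code): they mirror the
# loop bounds of back_diff_one / back_diff_two and only count iterations.
sizeX, sizeY = 320, 240
borderX, borderY = 20, 20
box = [20, 20, 268, 240]

def count_one(box):
    """Number of subtractions back_diff_one performs for a given box."""
    # top and bottom strips: every x, y in [0, borderY) and [sizeY - borderY, sizeY)
    n = sizeX * (len(range(borderY)) + len(range(sizeY - borderY, sizeY)))
    # left and right strips, for the rows between the top and bottom strips
    n += len(range(borderY, sizeY - borderY)) * (
        len(range(borderX)) + len(range(sizeX - borderX, sizeX)))
    # tracked box, if one is given
    if len(box) != 0:
        n += len(range(box[0], box[2])) * len(range(box[1], box[3]))
    return n

def count_two():
    """Number of subtractions back_diff_two performs (the whole array)."""
    return sizeX * sizeY

print(count_one(box), count_two())  # 75360 76800
print(count_one([]))                # 20800 (border only, no box)
```

With an empty box only the 20800 border elements are touched, so a
small box is where a real saving over the full 76800 would show up.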

HTH,
Daniel
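
For anyone repeating the measurement, here is a self-contained sketch
of the same idea using timeit.repeat and taking the minimum of several
runs, which is usually the least noisy figure. It assumes Python 3;
diff_region and best_time are hypothetical names, and diff_region is a
simplified stand-in for the functions in this thread (one rectangular
region, no border handling), not a copy of them:

```python
import random
import timeit

SIZE_X, SIZE_Y = 320, 240

def make_matrix():
    """Matrix of random integers, indexed as matrix[x][y]."""
    return [[random.randint(1, 100) for _ in range(SIZE_Y)]
            for _ in range(SIZE_X)]

def diff_region(back, fore, x0, y0, x1, y1):
    """Subtract fore from back inside [x0, x1) x [y0, y1); zeros elsewhere."""
    diff = [[0] * SIZE_Y for _ in range(SIZE_X)]
    for x in range(x0, x1):
        for y in range(y0, y1):
            diff[x][y] = back[x][y] - fore[x][y]
    return diff

def best_time(func, repeat=5, number=10):
    """Smallest per-call time over `repeat` runs of `number` calls each."""
    return min(timeit.repeat(func, repeat=repeat, number=number)) / number

if __name__ == "__main__":
    back, fore = make_matrix(), make_matrix()
    full = best_time(lambda: diff_region(back, fore, 0, 0, SIZE_X, SIZE_Y))
    small = best_time(lambda: diff_region(back, fore, 20, 20, 100, 100))
    print("full frame: %.4f s per call" % full)
    print("80x80 box:  %.4f s per call" % small)
```

Both calls allocate the same zero matrix, so the difference between
the two figures reflects only the subtraction loops.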


