Speed Race between old and new version 'working with files'

Steve D'Aprano steve+python at pearwood.info
Fri Oct 27 20:13:25 EDT 2017


On Sat, 28 Oct 2017 09:11 am, japy.april at gmail.com wrote:

> import time
> 
> avg = float(0)

That should be written as 

avg = 0.0

or better still not written at all, as it is pointless.


> # copy with WITH function and execute time
> for i in range(500):
>     start = time.clock()

time.clock() is an old, platform-dependent, low-resolution timer. It has been
deprecated since Python 3.3. It is better to use:

from timeit import default_timer as clock

and use that, as the timeit module has already chosen the best timer available
on your platform.

In fact, you probably should be using timeit rather than re-inventing the
wheel, unless you have a good reason.
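
If you do want a quick manual measurement, a minimal sketch using the timer
that timeit selects might look like the following. The work() function and the
time_once() helper are hypothetical stand-ins, not part of the original code.

```python
from timeit import default_timer as clock

def work():
    # Hypothetical stand-in for the code being timed.
    return sum(range(100000))

def time_once(func):
    """Return the elapsed seconds for a single call to func."""
    start = clock()
    func()
    return clock() - start

elapsed = time_once(work)
print('Elapsed: %.6f seconds' % elapsed)
```

A single measurement like this is noisy, which is one reason to prefer
timeit's repeated trials.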

>     with open('q://my_projects/cricket.mp3', 'rb') as old,
>     open('q://my_projects/new_cricket.mp3', 'wb') as new:
>         for j in old:
>             new.write(j)

Reading a *binary file* line-by-line seems rather dubious to me. I wouldn't do
it that way, but for now we'll just keep it.
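
For comparison, here is a rough sketch of copying in fixed-size blocks instead
of "lines". The copy_in_chunks() name, the 64 KiB block size, and the
throwaway temporary files are all assumptions made for illustration.

```python
import os
import shutil
import tempfile

def copy_in_chunks(src, dst, chunk_size=64 * 1024):
    """Copy src to dst by reading fixed-size blocks of bytes."""
    with open(src, 'rb') as old, open(dst, 'wb') as new:
        while True:
            chunk = old.read(chunk_size)
            if not chunk:  # an empty bytes object means end of file
                break
            new.write(chunk)

# Demonstrate on a temporary file filled with random bytes
# (hypothetical data, not the mp3 from the original post).
tmpdir = tempfile.mkdtemp()
src = os.path.join(tmpdir, 'source.bin')
dst = os.path.join(tmpdir, 'copy.bin')
with open(src, 'wb') as f:
    f.write(os.urandom(200000))

copy_in_chunks(src, dst)

with open(src, 'rb') as a, open(dst, 'rb') as b:
    same = a.read() == b.read()
print('Copy matches original:', same)

shutil.rmtree(tmpdir)
```

The standard library's shutil.copyfileobj() does much the same job, so in
real code that is probably the better choice.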

The simplest way to do this time comparison would be: 


from timeit import Timer

# Source and destination paths, taken from the original post.
oldfile = 'q://my_projects/cricket.mp3'
newfile = 'q://my_projects/new_cricket.mp3'

def test_with():
    with open(oldfile, 'rb') as old, \
            open(newfile, 'wb') as new:
        for line in old:
            new.write(line)

def test_without():
    old = open(oldfile, 'rb')
    new = open(newfile, 'wb')
    for line in old:
        new.write(line)
    old.close()
    new.close()

setup = "from __main__ import test_with, test_without"

t1 = Timer("test_with()", setup)
t2 = Timer("test_without()", setup)

print('Time using with statement:')
print(min(t1.repeat(number=100, repeat=5)))
print('Time not using with statement:')
print(min(t2.repeat(number=100, repeat=5)))


On my computer, that gives values of about 5.3 seconds and 5.2 seconds
respectively. Each figure should be interpreted as:

- the best value of five trials (the 'repeat=5' argument);

- each trial calls the test_with/test_without function 100 times
  (the 'number=100' argument)

so on my computer, each call to the test function takes around 5/100 seconds,
or 50ms. So there's no significant speed difference between the two: using
the with statement is a tiny bit slower (less than 2% on my computer).
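
The arithmetic behind that conclusion can be spelled out as a short sketch.
The two timings are the figures quoted above from my machine; the variable
names are just for illustration.

```python
number = 100          # calls per trial (the 'number=100' argument)
best_with = 5.3       # seconds, best of five trials using with
best_without = 5.2    # seconds, best of five trials without with

# Time for a single call to each test function.
per_call_with = best_with / number
per_call_without = best_without / number

# Relative overhead of the with version, as a percentage.
overhead = (best_with - best_without) / best_without * 100

print('Per call using with:     %.1f ms' % (per_call_with * 1000))
print('Per call not using with: %.1f ms' % (per_call_without * 1000))
print('Overhead of with:        %.2f%%' % overhead)
```

A difference that small is well within the run-to-run variation you should
expect from file I/O.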

[...]
> avg += (stop - start) / 500

Better to skip the pointless initialisation of avg and just write:

avg = (stop - start)/500

> avg += (stop - start) / 500
> print('Execute time with OLD version : ', avg)

That *adds* the time of the second test to the original timing, so your last
line should be:

print('Execute time with NEW version plus time with OLD version : ', avg)

to be accurate. But I don't think that's what you intended.
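
If the aim was to report the two timings separately, one possible sketch is to
give each test its own accumulator. The average_time() helper and the two
stand-in functions are hypothetical; substitute the real file-copying code.

```python
from timeit import default_timer as clock

def average_time(func, runs=500):
    """Return the mean elapsed seconds over `runs` calls to func."""
    total = 0.0  # fresh accumulator for every test, never shared
    for _ in range(runs):
        start = clock()
        func()
        total += clock() - start
    return total / runs

def copy_new_style():
    # Hypothetical stand-in for the version that uses with.
    return sum(range(1000))

def copy_old_style():
    # Hypothetical stand-in for the version using explicit close().
    return sum(range(1000))

avg_new = average_time(copy_new_style, runs=50)
avg_old = average_time(copy_old_style, runs=50)

print('Execute time with NEW version : ', avg_new)
print('Execute time with OLD version : ', avg_old)
```

Because each call to average_time() starts its total at zero, the second
result no longer includes the first.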




-- 
Steve
“Cheer up,” they said, “things could be worse.” So I cheered up, and sure
enough, things got worse.



