Performance of int/long in Python 3
Neil Hodgson
nhodgson at iinet.net.au
Tue Apr 2 23:31:03 EDT 2013
Ian Kelly:
> Micro-benchmarks like the ones you have been reporting are *useful*
> when it comes to determining what operations can be better optimized,
> but they are not *important* in and of themselves. What is important
> is that actual, real-world programs are not significantly slowed by
> these kinds of optimizations. Until you can demonstrate that real
> programs are adversely affected by PEP 393, there is not in my opinion
> any regression that is worth worrying over.
The problem with only responding to issues with real-world programs
is that real-world programs are complex and their performance issues
are often difficult to diagnose. See, for example, SCons, which is
written in Python and has not been able to overcome its performance
problems over several years.
(http://www.electric-cloud.com/blog/2010/07/21/a-second-look-at-scons-performance/)
Bottom-up performance work has advantages in that a narrow focus
area can be more easily analyzed and tested and can produce widely
applicable benefits.
The choice of comparison for the script wasn't arbitrary. Comparison
is one of the main building blocks of higher-level code. Sorting, for
example, depends strongly on comparison performance: sorting n items
takes on the order of n log n comparisons, so any decrease in
comparison speed is multiplied when applied to sorting.
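To give a rough feel for how many comparisons a sort performs, here is a minimal sketch that counts them. The helper name count_comparisons and the random test strings are made up for illustration; routing each comparison through a Python function makes the sort itself far slower, so this is only useful for counting, not for timing.

```python
import functools
import random

def count_comparisons(items):
    """Sort a copy of items and return how many comparisons the sort made."""
    calls = 0

    def compare(a, b):
        # Called once for every comparison the sort performs.
        nonlocal calls
        calls += 1
        return (a > b) - (a < b)

    sorted(items, key=functools.cmp_to_key(compare))
    return calls

random.seed(0)
for n in (1000, 10000):
    # Hypothetical data: random strings standing in for real input.
    data = [str(random.random()) for _ in range(n)]
    print(n, "items:", count_comparisons(data), "comparisons")
```

On shuffled input the count comes out well above the number of items, which is why a per-comparison slowdown shows up so clearly in sort times.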
It's unfortunate that stringbench.py does not contain any comparison
or sorting tests.
Sorting a list of a million strings (all the file paths on a
particular computer) went from 0.4 seconds with Python 3.2 to 0.78
seconds with 3.3, so we're out of the 'not noticeable by humans' range.
Perhaps this is still a 'micro-benchmark' - I'd just like to avoid
adding email access to get this over the threshold.
Here's some code. Replace the "if 1" with "if 0" on subsequent runs
to avoid the costly file system walk.
import os, time
from os.path import join, getsize
paths = []
if 1:
    for root, dirs, files in os.walk('c:\\'):
        for name in files:
            paths.append(join(root, name))
    with open("filelist.txt", "w") as f:
        f.write("\n".join(paths))
else:
    with open("filelist.txt", "r") as f:
        paths = f.read().split("\n")
print(len(paths))
timeStart = time.time()
paths.sort()
timeEnd = time.time()
print("Time taken=", timeEnd - timeStart)
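A single time.time() measurement can be noisy, so a variant that repeats the sort on fresh copies and keeps the best time may give steadier numbers. This is only a sketch: the helper name best_sort_time, the repeat count and the synthetic paths are assumptions for illustration, not part of the measurement above. It uses timeit.default_timer, which is available in both 3.2 and 3.3.

```python
import random
from timeit import default_timer

def best_sort_time(strings, repeats=5):
    """Return the fastest of several timed sorts, each on a fresh copy."""
    best = float("inf")
    for _ in range(repeats):
        work = list(strings)          # copy, so every run sorts unsorted data
        start = default_timer()
        work.sort()
        best = min(best, default_timer() - start)
    return best

random.seed(0)
# Hypothetical stand-in for a real file list; substitute paths read from
# filelist.txt to reproduce the measurement on real data.
fake_paths = ["c:\\dir%d\\file%d.txt" % (random.randrange(1000),
                                          random.randrange(100000))
              for _ in range(100000)]
print("Best of 5 =", best_sort_time(fake_paths))
```

Copying the list before each run matters because sorting an already sorted list is much cheaper and would hide the cost being measured.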
Neil