[Python-checkins] r46100 - sandbox/trunk/stringbench/README
andrew.dalke
python-checkins at python.org
Tue May 23 13:07:09 CEST 2006
Author: andrew.dalke
Date: Tue May 23 13:07:09 2006
New Revision: 46100
Added:
sandbox/trunk/stringbench/README
Log:
Info about this benchmark
Added: sandbox/trunk/stringbench/README
==============================================================================
--- (empty file)
+++ sandbox/trunk/stringbench/README Tue May 23 13:07:09 2006
@@ -0,0 +1,67 @@
+stringbench is a set of performance tests comparing byte string
+operations with unicode operations. The two string implementations
+are loosely based on each other and sometimes the algorithm for one is
+faster than the other.
+
+These test set was started at the Need For Speed sprint in Reykjavik
+to identify which string methods could be sped up quickly and to
+identify obvious places for improvement.
+
+Here is an example of a benchmark
+
+
+ at bench('"Andrew".startswith("A")', 'startswith single character', 1000)
+def startswith_single(STR):
+ s1 = STR("Andrew")
+ s2 = STR("A")
+ s1_startswith = s1.startswith
+ for x in _RANGE_1000:
+ s1_startswith(s2)
+
+The bench decorator takes three parameters. The first is a short
+description of how the code works. In most cases this is Python code
+snippet. It is not the code which is actually run because the real
+code is hand-optimized to focus on the method being tested.
+
+The second parameter is a group title. All benchmarks with the same
+group title are listed together. This lets you compare different
+implementations of the same algorithm, such as "t in s"
+vs. "s.find(t)".
+
+The last is a count. Each benchmark loops over the algorithm either
+100 or 1000 times, depending on the algorithm performance. The output
+time is the time per benchmark call so the reader needs a way to know
+how to scale the performance.
+
+These parameters become function attributes.
+
+
+Here is an example of the output
+
+
+========== count newlines
+38.54 41.60 92.7 ...text.with.2000.newlines.count("\n") (*100)
+========== early match, single character
+1.14 1.18 96.8 ("A"*1000).find("A") (*1000)
+0.44 0.41 105.6 "A" in "A"*1000 (*1000)
+1.15 1.17 98.1 ("A"*1000).index("A") (*1000)
+
+The first column is the run time in milliseconds for byte strings.
+The second is the run time for unicode strings. The third is a
+percentage; byte time / unicode time. It's the percentage by which
+unicode is faster than byte strings.
+
+The last column contains the code snippet and the repeat count for the
+internal benchmark loop.
+
+The times are computed with 'timeit.py' which repeats the test more
+and more times until the total time takes over 0.2 seconds, returning
+the best time for a single iteration.
+
+The final line of the output is the cumulative time for byte and
+unicode strings, and the overall performance of unicode relative to
+bytes. For example
+
+4079.83 5432.25 75.1 TOTAL
+
+However, this has no meaning as it evenly weights every test.
More information about the Python-checkins
mailing list