[Baypiggies] Discussion for newbies/beginner night talks - test results
Dennis Reinhardt
DennisR at dair.com
Thu Feb 15 21:22:43 CET 2007
At 11:21 PM 2/14/2007, Drew Perttula wrote:
>Nope, the square brackets are being used in a novel way that creates stan
>structures,
Summary
-------
Appending many string fragments is done most efficiently using a dict in
both time and memory under extreme conditions. Text append gives better
memory results than list append. Under realistic conditions, there was
little reason to choose text over list over dict.
Discussion
----------
This discussion and Stan got me thinking. Today, I want to tackle code
which generates a graphical directory display of icons. It is possible for
people to have directories of thousands (maybe even 10s of thousands) of
icons. Efficiency may matter.
The convention wisdom, IIRC, is to do list appends rather than string
appends. I wrote a program to test this, measuring execution time and
final memory footprint under windows. Here is the code:
import time
def holding():
#l = ""
#l = [""]
l = {}
for x in range(0,100000):
#l += "abcdef"
#l += ["abcdef"]
#l.append("abcdef")
l[x] = "abcdef"
time.sleep(1)
print "enter holding"
for a in range(0,10):
start = time.time()
for b in range(0,10):
holding()
print "%3.5f" % (time.time()-start)
time.sleep(1)
print "exit holding"
time.sleep(2000)
The code uses "range" because I want to run it under 2.3, although all the
results reported here are for 2.5. Here are the results:
TIME (10 calls ea) 100K_iter_1_str 500K_iter_1_str 100K_iter_5_str
1K_iter_5_str
text 0.422 15.656 14.937 0.016
list+= 0.438 2.2 0.421 0.016
list_append 0.219 1.188 0.219 0.016
dict 0.203 1.031 0.204 0
MEM 100K_iter_1_str 500K_iter_1_str 100K_iter_5_str
1K_iter_5_str
text 6.196 10.356 5.44 3.548
list+= 7.196 11.92 7.192 3.56
list_append 7.192 11.92 7.196 3.56
dict 5.66 9.548 5.656 3.588
The TIME results are in seconds for 10 calls to holding(), as should be
clear from the code. The best and worst case timings among the 10 runs did
not vary much (usually second decimal) and so for simplicity I simply
reported worst case timing.
The MEM results are in megabytes of memory at the end of run (at "exit
holding"). At start of run before entering holding(), memory used was 3.5
MB. Clearly, there is a systematic growth in memory. The final
time.sleep(2000) is to allow time for garbage collection to reduce memory
usage. In one run, I waited 10 minutes and saw no change in memory
used. The memory used fluctuated far more than time.
My expectation for the memory results is that all memory internal to
holding() would be released because all variables were local. This
expectation does not appear to be met.
Across the columns, "100K_iter" means the
for x in range(0,100000):
statement had an upper limit of 100000 (as shown) while 500K_iter says the
upper limit is 5 times greater and so on. "1_str" means the string used is
"abcdef" (as shown) while 5_str means "abcdefabcdefabcdefabcdefabcdef" was
used instead.
For the rows, text means that the
#l = ""
#l += "abcdef"
statements were uncommented while dict is uncommented in the code show.
Evaluation
------------
I am looking at HTML generation. A 100K HTML file is *huge*. Doing 100K
or 500K appends of a 6 or 30 character string are "stress tests" which I
simply do not expect to reach. I would like to know what the behavior is
but these levels are not the design point. The final column showing 1000
appends of a 30 character string represent a typical design point.
At 1K iterations, the time differences do not matter. I am using a Dual
core 2 GHz machine for timing. Even a machine 20X slower is still
acceptable for generating a local html page. If I were writing a multiuser
server, the timing would matter. In my app, it simply does not matter at
my design point.
What matters more to me is memory footprint. The list append tests show
larger memory footprint under all conditions tested. Under realistic test
conditions, there is little difference in memory footprint and I would
expect that running more iterations or changing the test conditions could
affect the rankings.
Under realistic conditions, there is not much difference in either time or
memory footprint.
---------------------------------
| Dennis | DennisR at dair.com |
| Reinhardt | http://www.dair.com |
---------------------------------
More information about the Baypiggies
mailing list