[Python-3000] characters data type

Josiah Carlson jcarlson at uci.edu
Thu May 4 21:19:02 CEST 2006


Josiah Carlson <jcarlson at uci.edu> wrote:
> Good point.  Making the input 1025 bytes, and performing block[:-1]
> resulted in a running time of 13.94 seconds.

I just thought of a better way of benchmarking list-like over-allocation
semantics.

For assumed smaller-sized writes:

Use an array, and extend it manually in chunks about as long as a list
overallocation step.  This under-counts the amount of time it would
take to construct a bytes object from generally small incremental
writes.

Then, use the smallest overallocation size as the presumed size of
writes in the list.append()/''.join() case.  This should reasonably
count the amount of time to generate the string in this case.

If the under-counted time for list-like overallocation is about the
same as or longer than the append/join time, then append/join is going
to be faster in practice for small writes.
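
To get a feel for how long a list overallocation step is, something
like the following can be used.  This is only a sketch: it assumes
CPython, where sys.getsizeof() reflects the allocated slots of a list,
and the helper name overallocation_steps is made up for illustration.

```python
import struct
import sys


def overallocation_steps(n_appends=64):
    """Return (length, capacity) pairs, one per observed list resize.

    Assumption: CPython, where sys.getsizeof(list) grows only when the
    underlying slot array is reallocated.
    """
    pointer_size = struct.calcsize("P")   # bytes per list slot
    empty_size = sys.getsizeof([])        # size of a list with no slots
    items = []
    last_size = empty_size
    steps = []
    for _ in range(n_appends):
        items.append(None)
        size = sys.getsizeof(items)
        if size != last_size:
            # The allocation grew; work out how many slots it now holds.
            capacity = (size - empty_size) // pointer_size
            steps.append((len(items), capacity))
            last_size = size
    return steps


if __name__ == "__main__":
    for length, capacity in overallocation_steps():
        print(f"resize at len={length}: room for {capacity} items")
```

The gaps between successive capacities are the extension sizes one
would feed to the array in the small-write test.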


For larger blocks, do the same thing, only increase the 'overallocations'
to be at least as large as the assumed block size.
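
A rough modern-Python sketch of the two strategies being compared, for
anyone who wants to try it.  This is not the script used for the
numbers in this message: the function names, the use of
array.frombytes(), and the one-extension-per-block simplification are
all assumptions made for illustration.

```python
import time
from array import array


def build_append_join(total, block_size):
    """Collect fixed-size blocks in a list, then join them once."""
    block = b"x" * block_size
    parts = []
    written = 0
    while written < total:
        parts.append(block)
        written += block_size
    return b"".join(parts)


def build_array_extend(total, block_size):
    """Grow one array in place, one extension per block.

    Assumption: a single extension of block_size bytes stands in for
    the 'overallocation' step described in the message.
    """
    block = b"x" * block_size
    buf = array("B")
    written = 0
    while written < total:
        buf.frombytes(block)   # extend the array with the block's bytes
        written += block_size
    return buf.tobytes()


def time_builder(builder, total, block_size):
    """Return wall-clock seconds for one call of builder."""
    start = time.perf_counter()
    builder(total, block_size)
    return time.perf_counter() - start


if __name__ == "__main__":
    total = 16 * 1024 * 1024          # 16 meg, as in the message
    for block_size in (64, 135, 1024):
        t_join = time_builder(build_append_join, total, block_size)
        t_array = time_builder(build_array_extend, total, block_size)
        print(f"{block_size:5d} byte blocks: "
              f"append/join {t_join:.3f}s, array extend {t_array:.3f}s")
```

Timings will depend on the interpreter version and the machine, so the
crossover point may not match the one reported here.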

               135 byte blocks    1k blocks
append/join:     28.05s             12.11s
list-like:       31.21s             31.50s

Seems to be a clear win for append/join on both 135 byte and 1k blocks
for constructing a 16 meg string.  Reducing the block size to 64 bytes
gives list-like overallocation the advantage (append/join jumps to 50+
seconds), which tells us that for very short blocks, list-like
overallocation wins, but for blocks of expected ~135 bytes or larger on
my machine, append/join wins.


 - Josiah


