Packing data

Thu May 2 18:54:19 EDT 2002

bvdpoel at uniserve.com wrote:

> Thanks ... yet another way to do it! Damn Python--too many choices :)
> 

Well, and yet another one ;)

If you can store your data in a tuple to begin with, this one is roughly 
1.7 times faster than the map() version:

#----------------------------------------------------------------
import string, time

def time_fun(f):
    start = time.time()
    f()
    elapsed = time.time() - start
    print f, round(elapsed, 2)
    return elapsed

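size = int(1e7)
L = (42,)*size   # test data: a tuple of character codes

def listcomp_test():
    # list comprehension + string.join (kept for comparison, not timed below)
    return string.join([chr(x) for x in L], '')

def map_test():
    # map chr over the data, then join the pieces
    return ''.join(map(chr, L))

def fmt_test():
    # one '%c' per item, filled in by a single % operation (needs a tuple)
    return '%c'*len(L) % L

to_test = fmt_test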

assert map_test() == to_test()
print 'size:',size
times = []
for f in map_test, to_test:
    times.append(time_fun(f))
print 'ratio: %.2f' % (times[0]/times[1])
#----------------------------------------------------------------

Here are the results, at several sizes to check that both versions scale the 
same way:

In [20]: run datapack.py
size: 100000
<function map_test at 0x828130c> 0.2
<function fmt_test at 0x827333c> 0.12
ratio: 1.69

In [21]: run datapack.py
size: 1000000
<function map_test at 0x829e574> 1.81
<function fmt_test at 0x828017c> 1.03
ratio: 1.76

In [22]: run datapack.py
size: 3000000
<function map_test at 0x829c1d4> 5.35
<function fmt_test at 0x8146c3c> 3.06
ratio: 1.75

In [23]: run datapack.py
size: 10000000
<function map_test at 0x826eb1c> 17.85
<function fmt_test at 0x80fe86c> 10.2
ratio: 1.75

If your data is in a list and has to stay one, you have to call tuple(L) 
before passing it to the % operator. The improvement is then a bit smaller, 
but still significant. And I think the code is clearer anyway.
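
A minimal sketch of the list case, for illustration only. It assumes Python 3 
syntax (the script above is Python 2), and the helper name pack_list is made 
up here, not part of the original script:

```python
def pack_list(values):
    """Pack a list of character codes into a string using the % operator."""
    # % only unpacks a tuple into separate arguments, so the list is
    # converted first; this conversion is the extra cost mentioned above.
    return '%c' * len(values) % tuple(values)

data = [72, 101, 108, 108, 111]
packed = pack_list(data)

# Same result as the map/join version.
assert packed == ''.join(map(chr, data))
print(packed)
```

Since * and % have the same precedence and group left to right, the format 
string is built first and then filled in.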

Which brings me to a question: why doesn't % accept a list on its rhs? Is 
there any strong reason to force it to be a tuple? Just curious, as I often 
find myself wrapping lists in tuple(my_list) calls just to use them with %.
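
For reference, here is a small sketch of what % does with a list on the 
right-hand side (Python 3 syntax assumed, variable names made up for 
illustration). A non-tuple is taken as one single value to format, which may 
be part of the answer, though this is not a statement of the design rationale:

```python
codes = [65, 66]

# A tuple on the right-hand side is unpacked into separate arguments.
assert '%c%c' % tuple(codes) == 'AB'

# A list is treated as one single value, so '%s' formats the whole list.
assert '%s' % codes == '[65, 66]'

# With two conversions, the list still counts as a single argument,
# so the formatting fails with a TypeError.
try:
    '%c%c' % codes
    list_accepted = True
except TypeError:
    list_accepted = False

print(list_accepted)
```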

Cheers,

f.