Python speed-up

Wed Sep 22 11:04:10 EDT 2004

Guyon Morée wrote:
> Hi all,
> 
> I am working on a Huffman encoding exercise, but it is kinda slow. This is
> not a big problem, I do this to educate myself :)
> 
> So I started profiling the code and the slowdown was actually taking place
> at places where I didn't expect it.
> 
> after I have created a lookup-table-dictionary with encodings like
> {'d':'0110', 'e':'01' etc} to encode the original text like this:
> 
> for c in original_text:
>     encoded_text += table[c]
 > I can appreciate the length of the text is big, but this isn't a problem at
 > character frequency counting for eaxample. Why is this slow?

Usual suspect: concatenating to strings reallocates a string each time, so you'd 
better do:

l = []
for c in original_text:
   l.append(table[c])
encoded_text = ''.join(l)

> the second place the slowdown occurs is when I ty to chop the encoded string
> of 0's and 1's in pieces of eigth like this:
> 
> chr_list = [] # resulting list
> while 1:
>     chr_list.append(encoded_text[:8]) # take 8 bits from string and put them
> in the list
>     encoded_text = encoded_text[8:] # truncate the string
>     if len(encoded_text) < 8: # end of string reached
>         chr_list.append(encoded_text)
>         break

Again, the statement encoded_text = encoded_text[8:] re-allocates the string 
each time.

Your whole code chunk may be replaced by a list comprehension:

chr_list = [encoded_text[i:i+8] for i in range(0, len(encoded_text), 8)]

which is likely to be faster, since it only allocates the space for each 8 
character chunk.

HTH
-- 
- Eric Brunel <eric (underscore) brunel (at) despammed (dot) com> -
PragmaDev : Real Time Software Development Tools - http://www.pragmadev.com