Bits and bytes?

Terry Reedy tjreedy at udel.edu
Thu Jun 13 10:41:06 EDT 2002


"Guyon Morée" <gumuz at looze.net> wrote in message
news:3d08839a$0$226$4d4ebb8e at news.nl.uu.net...
> I am trying to implement my own sort of Huffman encoding/compression.
> Converting ASCII characters to a compressed bitstream requires me to
> write it at bit level. Bill's solution might work in some way, but my
> sequence of bits isn't always divisible by eight. A word like 'ape'
> (which is 3x8=24 bits) might become 1001101, which is 7 bits... catch
> my drift? Strings are no option, as this would have no positive effect
> on the 'real size'; there's no compression.

For production use, bit twiddling should be done in assembler, C, or
another suitable language.  For algorithm development in Python, you
could do something like the following:

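  # Map each tuple of eight 0/1 bits to the character with that code.
  charmap = {}
  for i in range(256):
    charmap[tuple([(i >> (7 - k)) & 1 for k in range(8)])] = chr(i)
  work = []
  out = []
  for char in plaintext:
    work.extend(cipher(char)) # cipher -> list of 0s and 1s
    while len(work) >= 8: # one code may complete more than one byte
      out.append(charmap[tuple(work[0:8])])
      work[:] = work[8:]
  if work: # pad the last partial byte with 0s; keep the bit count to decode
    work.extend([0] * (8 - len(work)))
    out.append(charmap[tuple(work)])
  ciphertext = ''.join(out)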

Or make 'out' an array from the array module to avoid the final .join().
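
The array variant might look roughly like the sketch below. It assumes
a current Python 3. The CODES table and the names to_byte, pack and
unpack are made up for illustration; the codes are picked only so that
'ape' comes out as the 7 bits 1001101 from the question, and a real
Huffman table would be built from character frequencies. The last
partial byte is padded with zeros, so the bit count is returned
alongside the data so that a decoder can ignore the padding.

```python
from array import array

# Hypothetical prefix-free code table, chosen so that 'ape' -> 1001101.
CODES = {'a': [1, 0, 0], 'p': [1, 1], 'e': [0, 1]}

def to_byte(bits):
    """Fold up to eight 0/1 ints into one byte, zero-padding on the right."""
    byte = 0
    for bit in list(bits) + [0] * (8 - len(bits)):
        byte = (byte << 1) | bit
    return byte

def pack(plaintext, codes=CODES):
    """Return (array of packed bytes, number of meaningful bits)."""
    work = []            # bits waiting to fill a whole byte
    out = array('B')     # unsigned bytes, so no final join is needed
    nbits = 0
    for char in plaintext:
        work.extend(codes[char])
        nbits += len(codes[char])
        while len(work) >= 8:
            out.append(to_byte(work[:8]))
            del work[:8]
    if work:
        out.append(to_byte(work))   # final partial byte, zero-padded
    return out, nbits

def unpack(packed, nbits):
    """Recover the first nbits bits from the packed bytes."""
    bits = []
    for byte in packed:
        bits.extend((byte >> shift) & 1 for shift in range(7, -1, -1))
    return bits[:nbits]

packed, nbits = pack('ape')
print(packed.tolist(), nbits)   # [154] 7
print(unpack(packed, nbits))    # [1, 0, 0, 1, 1, 0, 1]
```

Here 'ape' takes one byte instead of three; packed.tobytes() gives the
raw bytes for writing to a file opened in binary mode.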

Terry J. Reedy





