Bits and bytes?
Terry Reedy
tjreedy at udel.edu
Thu Jun 13 10:41:06 EDT 2002
"Guyon Morée" <gumuz at looze.net> wrote in message
news:3d08839a$0$226$4d4ebb8e at news.nl.uu.net...
> I am trying to implement my own sort of Huffman encoding/compression.
> Converting ASCII characters to a compressed bitstream requires me to
> write it at bit level. Bill's solution might work in some way, but my
> sequences of bits aren't always divisible by eight. A word like 'ape'
> (which is 3x8=24 bits) might become 1001101, which is 7 bits... catch
> my drift? Strings are no option, as this would have no positive effect
> on the 'real size'; there's no compression.
For production use, bit twiddling should be done in assembler, C, or
another suitable language. For algorithm development in Python, you
could do something like the following:
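    # charmap: dict mapping each tuple of 8 bits (0s and 1s) to its char
    charmap = <map of tuples of 8 0s and 1s to corresponding char>
    work = []
    out = []
    for char in plaintext:
        work.extend(cipher(char))  # cipher -> list of 0s and 1s
        while len(work) >= 8:      # one code may complete several bytes
            out.append(charmap[tuple(work[0:8])])
            work[:] = work[8:]
    # any bits left in work (fewer than 8) still need padding and flushing
    ciphertext = ''.join(out)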
Or make 'out' an array from the array module to avoid the final .join().
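
The loop above can also be written in runnable form. What follows is
only a minimal sketch under a few assumptions: it targets current
Python, it collects output in a bytearray (a swap for the charmap
lookup and the array module suggested above), and the names pack_bits,
encode and CODES are hypothetical. The code table is made up for
illustration and is not a real Huffman table. The last byte is padded
with zero bits, and the pad count is returned so a decoder could
discard them.

```python
def pack_bits(bits):
    """Pack an iterable of 0/1 ints into bytes, most significant bit first.

    Returns (data, pad), where pad is the number of zero bits appended
    to fill out the last byte.
    """
    out = bytearray()
    acc = 0     # bits collected so far, held as an integer
    count = 0   # number of bits currently held in acc
    for bit in bits:
        acc = (acc << 1) | bit
        count += 1
        if count == 8:          # a full byte is ready
            out.append(acc)
            acc = 0
            count = 0
    pad = (8 - count) % 8
    if count:                   # flush the partial last byte, zero padded
        out.append(acc << pad)
    return bytes(out), pad


# Hypothetical prefix-free code table, for illustration only.
CODES = {'a': [0], 'p': [1, 0], 'e': [1, 1]}


def encode(text, codes=CODES):
    """Concatenate the bit codes for each character, then pack them."""
    bits = []
    for ch in text:
        bits.extend(codes[ch])
    return pack_bits(bits)


data, pad = encode('ape')   # 5 code bits plus 3 pad bits fit in one byte
```

With this table, 'ape' takes one byte instead of three. The pad count
has to be stored or sent along with the data, since otherwise the
trailing zero bits could be mistaken for codes.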
Terry J. Reedy