creating very small types

Dan Bishop danb_83 at yahoo.com
Wed Apr 27 20:46:15 EDT 2005


Michael Spencer wrote:
> andrea wrote:
> >>>I was thinking to code the huffman algorithm and trying to compress
> >>>something with it, but I've got a problem.
> >>>How can I represent for example a char with only 3 bits??
>
> >>>I had a look to the compression modules but I can't understand them much...
> ...
> > I understand I can't do it easily in python, but maybe I could define a
> > new type in C and use it to do those dirty works, what do you think?
>
> Why do you need to create 'very small types'?
>
> You only need actual bit-twiddling when you do the encoding/de-coding right?
> If you create an encoding map 'codes' as a dict of strings of '1' and '0',
> encoding might look like (untested):
>
>      def encode(stream):
>          outchar = count = 0
>          for char in stream:
>              for bit in codes[char]:
>                  outchar = (outchar << 1) | (bit == "1")
>                  count +=1
>                  if count ==8:
>                      yield chr(outchar)
>                      outchar = count = 0
>          if count:
>              yield chr(outchar)

I wrote some Huffman compression code a few years ago, with these two
helper classes for writing and reading individual bits:

BYTE_SIZE = 8   # bits per byte; presumably defined elsewhere in the original module

class BitWriter(object):
   # writes individual bits to an output stream
   def __init__(self, outputStream):
      self.__out = outputStream
      self.__bitCount = 0      # number of unwritten bits
      self.__currentByte = 0   # buffer for unwritten bits
   def write(self, bit):
      self.__currentByte = self.__currentByte << 1 | bit
      self.__bitCount += 1
      if self.__bitCount == BYTE_SIZE:
         self.__out.write(chr(self.__currentByte))
         self.__bitCount = 0
         self.__currentByte = 0
   def flush(self):
      while self.__bitCount > 0:
         self.write(0)

class BitReader(object):
   # reads individual bits from an input stream
   def __init__(self, inputStream):
      self.__in = inputStream
      self.__bits = []          # buffer to hold incoming bits
   def readBit(self):
      if len(self.__bits) == 0:
         # read the next byte
         b = ord(self.__in.read(1))
         # unpack the bits
         self.__bits = [(b & (1<<i)) != 0 for i in range(BYTE_SIZE-1, -1, -1)]
      return self.__bits.pop(0)
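
For anyone trying this on a current Python (3.x), here is a minimal
round-trip sketch of the same idea, written as two plain functions over
binary streams instead of the classes above. The names write_bits and
read_bits and the three-symbol code table are made up for illustration;
they are not part of the original code. Note that the final byte is
padded with zero bits, just as flush() does, so the reader has to be
told how many bits are real.

```python
import io

BYTE_SIZE = 8  # bits per byte

def write_bits(bits, out):
    """Pack an iterable of 0/1 values into bytes, most significant bit first."""
    current = count = 0
    for bit in bits:
        current = (current << 1) | bit
        count += 1
        if count == BYTE_SIZE:
            out.write(bytes([current]))
            current = count = 0
    if count:
        # pad the last byte with zero bits, like BitWriter.flush()
        out.write(bytes([current << (BYTE_SIZE - count)]))

def read_bits(inp, n):
    """Yield the first n bits of a binary stream, most significant bit first."""
    produced = 0
    while produced < n:
        chunk = inp.read(1)
        if not chunk:
            raise EOFError("stream ended before n bits were read")
        byte = chunk[0]
        for i in range(BYTE_SIZE - 1, -1, -1):
            if produced == n:
                return  # the remaining bits are padding
            yield (byte >> i) & 1
            produced += 1

# hypothetical prefix code for a three-symbol alphabet
codes = {"a": "0", "b": "10", "c": "11"}
message = "abacab"
bits = [int(b) for ch in message for b in codes[ch]]

buf = io.BytesIO()
write_bits(bits, buf)
packed = buf.getvalue()

decoded_bits = list(read_bits(io.BytesIO(packed), len(bits)))
```

Here the six characters take 9 bits, so they fit in two bytes. A real
compressed file would also need to store the bit count (or use an
end-of-stream symbol) so the decoder knows where the padding starts.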



