BitVector
Ross Ridge
rridge at caffeine.csclub.uwaterloo.ca
Mon Mar 10 14:52:00 EDT 2008
DanielJohnson <diffuser78 at gmail.com> wrote:
>I am trying to solve a genetic algorithm problem where I want to read
>a bitvector of very large size (say 10000) and manipulate bits based
>on certain algorithms.
>
>I am a newbie in Python. What data structure are good to read such
>huge data set. Are there any built in classes for bit fiddling.
A 10000 bit data set is by no means huge. Depending on exactly what
operations you want to perform converting your "bitvector" to a byte
array, one bit per byte may be fast way to go. This increases the size
of data set by 8 times, but it's probably the fastest way in Python to
test and set individual bits.
Here's a couple of functions I wrote for very quickly converting a
"bitvector" in the form of a string (eg. read from a file) in to a
byte array:
import string
import array
import binascii
_tr_16 = string.maketrans("0123456789abcdef",
"\x00\x01\x02\x03"
"\x10\x11\x12\x13"
"\x20\x21\x22\x23"
"\x30\x31\x32\x33")
_tr_4 = string.maketrans("0123",
"\x00\x01"
"\x10\x11")
_tr_2 = string.maketrans("01", "\x00\x01")
def string_to_bit_array(s):
"""Convert a string to an array containing a sequence of bits."""
s = binascii.hexlify(s).translate(_tr_16)
s = binascii.hexlify(s).translate(_tr_4)
s = binascii.hexlify(s).translate(_tr_2)
a = array.array('B', s)
return a
_tr_rev_2 = string.maketrans("\x00\x01", "01")
_tr_rev_4 = string.maketrans("\x00\x01"
"\x10\x11",
"0123")
_tr_rev_16 = string.maketrans("\x00\x01\x02\x03"
"\x10\x11\x12\x13"
"\x20\x21\x22\x23"
"\x30\x31\x32\x33",
"0123456789abcdef")
def bit_array_to_string(a):
"""Convert an array containing a sequence of bits to a string."""
remainder = len(a) % 8
if remainder != 0:
a.fromlist([0] * (8 - remainder))
s = a.tostring()
s = binascii.unhexlify(s.translate(_tr_rev_2))
s = binascii.unhexlify(s.translate(_tr_rev_4))
return binascii.unhexlify(s.translate(_tr_rev_16))
I've used these functions to implement a data compression algorithim
in Python. The algorithm still runs very slow in Python, but it runs
much faster than it would have if I had used Python's bitwise operators.
Ross Ridge
--
l/ // Ross Ridge -- The Great HTMU
[oo][oo] rridge at csclub.uwaterloo.ca
-()-/()/ http://www.csclub.uwaterloo.ca/~rridge/
db //
More information about the Python-list
mailing list