Most efficient way of storing 1024*1024 bits

Alex Martelli aleax at mail.comcast.net
Thu Nov 3 10:01:28 EST 2005


Alex Stapleton <alexs at advfn.com> wrote:
   ...
> >>> Six megabytes is pretty much nothing on a modern computer.
> >
> >> BTW, it'd be 6 megabits or 750kb ;)
> >
> > ...but Mike was proposing using one digit per bit, hence, 6 megabytes.
> > That makes it easy to search for bitpatterns with re or  
> > string.find; if
> > the bits were packed 8 to a byte, such searches would be very hard.
> 
> They would just require some out-of-the-box thinking using character
> arrays and stuff I think. It's definately still doable with regex's  
> if you really want to.

"Doable", yes, of course -- that's pretty obvious, and I'm not sure
what's "out of the box" about it -- but orders of magnitude harder.  For
example, to look for a fixed sequence of X bits, you'd have to search
for any of 8 sequences of slightly more than X/8 characters (where the
first and last are normally charsets) built by the possible shifts of
the bitsequence by 0, 1, ... 7 bits.  I also suspect that performance
would badly suffer (dealing with 750 KB of data, plus all the auxiliary
stuff for Python and re, isn't going to fit in cache anyway).  All in
all, doesn't feel worth pursuing, particularly when the OP mentioned
time mattering more than space right from the word go.


Alex



More information about the Python-list mailing list