Snippet: bitcount()

Tim Peters tim_one at email.msn.com
Sat Jul 17 03:58:19 EDT 1999


[Klaus Alexander posts a pretty bit-counter, taking time proportional to
 the number of 1 bits for ints, and for N-bit longs taking time
 proportional to the product of N and the number of 1 bits]

[Egene Leitl]
> Ouch, ouch. That was really _inefficient_.  At least you could nybble
> or byte-wise lookup,

This isn't C or assembler -- the speed difference for ints in Python isn't
something to get excited about.  Klaus's method has a long pedigree, and is
the method of choice "even in C" when dealing with sparsely populated large
bit vectors.

> and there should be lots of bigmask magic which does in in O(logN).
> Consider bitcounting L's.

There are general methods that work in log2(B) operations where B is the
number of bits in the long.  But each primitive operation (such as a simple
shift) on a B-bit long takes time proportional to B, so the overall time is
O(B * log2(B)).  The worst case of Klaus's method (all bits are set) is
O(B**2).  For B large enough, that's indeed a huge difference -- but unless
B is at least a few hundred bits, the overhead of controlling the advanced
methods cancels the reduction in operations.  (BTW, I'm lying a little here:
the fancy shift-&-mask methods continually cut the number of bits in half,
so that they're really O(B) "in theory" -- and in practice too, for B very
large.)

A curious thing in Python is that you can't march over the bits of a long in
linear time without resorting to hex() or marshal.dumps.  This really
complicates lookup methods, which *can* work in O(B) time straightforwardly
in C.

You can find more than you can bear to read <wink> about this by searching
for "popcount" in DejaNews.

[back to Klaus]
> ...
> -- unless I decide to count several bits at a time, but
> I wouldn't know how to construct such a thing.

Oh, sure you do <wink>:

count[0] = 0
count[1] = 1
count[2] = 1
count[3] = 2
...
count[255] = 8

Now you can get the popcount of any one-byte int with one table-lookup.  And
the popcount of any N-byte int or long with N table lookups and N-1 adds.
Put 2**16 entries in the table to cut the work in half.

repeat-as-needed<wink>-ly y'rs  - tim






More information about the Python-list mailing list