Snippet: bitcount()

Christian Tismer tismer at appliedbiometrics.com
Sat Jul 17 07:24:55 EDT 1999


Klaus Alexander Seistrup wrote:
> 
> Tim Peters <tim_one at email.msn.com> wrote:
> 
> > > -- unless I decide to count several bits at a time, but
> > > I wouldn't know how to construct such a thing.
> >
> > Oh, sure you do <wink>:
> >
> > count[0] = 0
> > count[1] = 1
> > count[2] = 1
> > count[3] = 2
> > ...
> > count[255] = 8
> >
> > Now you can get the popcount of any one-byte int with one table-lookup.
> > And the popcount of any N-byte int or long with N table lookups and N-1
> > adds.  Put 2**16 entries in the table to cut the work in half.
> 
> I get your point, thanks, but I will spare this group for a posting of such
> a 2**16 table, leave alone the full 32-bit version. ;-)

FYI, here is where Tim and I stopped a "race".
We both claimed to have the fastest version of bit counting.
Apparently, results were dependant from your CPU's cache size.
Mine had 512KB, Tims only 256KB, so on his machine he was faster,
and on mine, I was. We decided to stop there.

Here is Tim's most elegant last version which I like the best:

______________________________________________________________

_run2mask = {1: 0x5555555555555555L,
             2: 0x3333333333333333L,
             4: 0x0F0F0F0F0F0F0F0FL,
             8: 0x00FF00FF00FF00FFL}

def buildmask2(run, n):
    run2 = run + run
    k = (n + run2 - 1) / run2
    n = k * run2
    try:
        answer = _run2mask[run]
        k2 = 64 / run2
    except KeyError:
        answer = (1L << run) - 1
        k2 = 1
    while k > k2:
        k2 = k2 + k2
        if k >= k2:
            answer = answer | (answer << (run * k2))
        else:
            answer = answer | (answer << (run2 * (k - k2/2)))
            k2 = k
    if k2 > k:
        answer = answer >> (run2 * (k2 - k))
    return answer, n

def popcount4(x):
    lomask, n = buildmask2(1, nbits(x))
    x = x - ((x >> 1) & lomask)
    target = 2
    while n > target:
        lomask, n = buildmask2(target, n)
        x = (x & lomask) + ((x >> target) & lomask)
        target = target + target
        for i in range(1, target/2):
            if n <= target:
                break
            n = n >> 1
            n = (n + target - 1) / target * target
            x = (x & ((1L << n) - 1)) + (x >> n)
    return int(x)
______________________________________________________________

And here is my dirty, brutal hack which is at the "same" speed
for sufficiently inexact values of "same":
______________________________________________________________

import marshal, operator, array, string

def makeweighttrans() :
    weights = map(popcount, range(256))
    return string.join(map(chr, weights), '')

def popcount6(x, dumps = marshal.dumps,
                 add = operator.add,
                 trans = makeweighttrans(),
                 reduce = reduce,
                 translate = string.translate,
                 mkarray = array.array):
    digitcounts = mkarray("B", translate(dumps(x), trans))
    # overwrite type code & digit count
    digitcounts[0] = digitcounts[1] = digitcounts[2] = \
                     digitcounts[3] = digitcounts[4] = 0
    return reduce(add, digitcounts)
    
______________________________________________________________


So that' where you can get when trying to count bits.
I can't encourage you to try to outperform one of the
above and don't want to start another race (at least
I will not try). But if somebody comes up with a solution
which is, say, 30 percent faster for any distribution of
bits, and it uses standard Python only, then I will
give you a free website and pay your fee for the next
Python conference.

and-maybe-employ-you-in-my-company-ly y'rs - chris

p.s.: You can shorten the above by Mike Hudson's bytecodehacks,
but it will not get faster. Actually a tick slower under Win9X,
due to the way how VC++6.0 optimizes LOAD_FAST and LOAD_CONST
slightly different (they *are* technically equivalent).

-- 
Christian Tismer             :^)   <mailto:tismer at appliedbiometrics.com>
Applied Biometrics GmbH      :     Have a break! Take a ride on Python's
Kaiserin-Augusta-Allee 101   :    *Starship* http://starship.python.net
10553 Berlin                 :     PGP key -> http://wwwkeys.pgp.net
PGP Fingerprint       E182 71C7 1A9D 66E9 9D15  D3CC D4D7 93E2 1FAE F6DF
     we're tired of banana software - shipped green, ripens at home




More information about the Python-list mailing list