How to locate the bit in bits string?

Dave Angel davea at ieee.org
Wed Apr 29 22:35:20 EDT 2009


casevh wrote:
> On Apr 28, 5:39 pm, Li Wang <li.wan... at gmail.com> wrote:
>   
>> 2009/4/29 Tim Chase <python.l... at tim.thechases.com>:
>>
>>     
>>>> I want to concatenate two bits string together: say we have '1001' and
>>>> '111' which are represented in integer. I want to concatenate them to
>>>> '1001111' (also in integer form), my method is:
>>>> ('1001' << 3) | 111
>>>> which is very time consuming.
>>>>         
>>> You omit some key details -- namely how do you know that "1001" is 4 bits
>>> and not "00001001" (8-bits)?  If it's a string (as your current code shows),
>>> you can determine the length.  However, if they are actually ints, your code
>>> should work fine & be O(1).
>>>       
>> Actually, what I have is a list of integer numbers [3,55,99,44], and
>> by using Huffman coding or fixed length coding, I will know how the
>> bits-length for each number. When I try to concatenate them (say
>> 10,000 items in the list) all together, the speed is going down
>> quickly (because of the shifting operations of python long).
>>
>>
>>
>>     
>>> This can be abstracted if you need:
>>>       
>>>  def combine_bits(int_a, int_b, bit_len_b):
>>>    return (int_a << bit_len_b) | int_b
>>>       
>>>  a =x09
>>>  b =x07
>>>  print combine_bits(a, b, 3)
>>>       
>>> However, if you're using gargantuan ints (as discussed before), it's a lot
>>> messier.  You'd have to clarify the storage structure (a byte string?  a
>>> python long?)
>>>       
>> I am using a single python long to store all the items in the list
>> (say, 10,000 items), so the work does become messier...
>>     
>
> Using GMPY (http://code.google.com/p/gmpy/) may offer a performance
> improvement. When shifting multi-thousand bit numbers, GMPY is several
> times faster than Python longs. GMPY also support functions to scan
> for 0 or 1 bits.
>
>   
>>> -tkc
>>>       
>>> PS:  You may want to CC the mailing list so that others get a crack at
>>> answering your questions...I've been adding it back in, but you've been
>>> replying just to me.
>>>       
>> Sorry, this is the first time I am using mail-list....and always
>> forgot "reply to all"
>>
>> Thank you very much:D
>>
>>
>>
>> --
>> Li
>> ------
>> Time is all we have
>> and you may find one day
>> you have less than you think
>>     
Seems to me you'd do better with a byte array than a Python long.  At least while you're doing all those shifts.  That way you do the shifts in an int, then update a byte in the array whenever you have 8 bits accumulated.

You can then look up a bit in the array trivially.  And if you really need a Python long, I predict the single conversion will be much faster than doing all those shifts on the long as you go.





More information about the Python-list mailing list