Comparing sequences with range objects

Antoon Pardon antoon.pardon at vub.be
Tue Apr 12 02:49:50 EDT 2022



Op 11/04/2022 om 02:01 schreef duncan smith:
> On 10/04/2022 21:20, Antoon Pardon wrote:
>>
>>
>> Op 9/04/2022 om 02:01 schreef duncan smith:
>>> On 08/04/2022 22:08, Antoon Pardon wrote:
>>>>
>>>> Well my first thought is that a bitset makes it less obvious to 
>>>> calulate
>>>> the size of the set or to iterate over its elements. But it is an idea
>>>> worth exploring.
>>>>
>>>
>>>
>>>
>>> def popcount(n):
>>>     """
>>>     Returns the number of set bits in n
>>>     """
>>>     cnt = 0
>>>     while n:
>>>         n &= n - 1
>>>         cnt += 1
>>>     return cnt
>>>
>>> and not tested,
>>>
>>> def iterinds(n):
>>>     """
>>>     Returns a generator of the indices of the set bits of n
>>>     """
>>>     i = 0
>>>     while n:
>>>         if n & 1:
>>>             yield i
>>>         n = n >> 1
>>>         i += 1
>>>
>> Sure but these seem rather naive implementation with a time 
>> complexity of
>> O(n) where n is the maximum number of possible elements. Using these 
>> would
>> turn my O(n) algorithm in a O(n^2) algorithm.
>>
>
> I thought your main concern was memory. Of course, dependent on 
> various factors, you might be able to do much better than the above. 
> But I don't know what your O(n) algorithm is, how using a bitset would 
> make it O(n^2), or if the O(n^2) algorithm would actually be slower 
> for typical n. The overall thing sounds broadly like some of the 
> blocking and clustering methods I've come across in record linkage.

Using bitsets would make change my algorithm from O(n) into O(n^2) because
the bitset operations are O(n) instead of O(1). If I have a 20000 people
a bitset will take a vector of about 300, 64bit words. With 40000 people
the bitset will take a vector of about 600. So doubling the population
will also double the time for the bitset operations, meaning doubling
the population will increase execution time four times.



More information about the Python-list mailing list