List Count

Blind Anagram blindanagram at nowhere.org
Tue Apr 23 03:05:53 EDT 2013


On 23/04/2013 00:01, Steven D'Aprano wrote:
> On Mon, 22 Apr 2013 15:15:19 +0100, Blind Anagram wrote:
> 
>> But when using a sub-sequence, I do suffer a significant reduction in
>> speed for a count when compared with count on the full list.  When the
>> list is small enough not to cause memory allocation issues this is about
>> 30% on 100,000,000 items.  But when the list is 1,000,000,000 items, OS
>> memory allocation becomes an issue and the cost on my system rises to
>> over 600%.
> 
> Buy more memory :-)
> 
> 
>> I agree that this is not a big issue but it seems to me a high price to
>> pay for the lack of a sieve.count(value, limit), which I feel is a
>> useful function (given that memoryview operations are not available for
>> lists).
> 
> There is no need to complicate the count method for such a specialised 
> use-case. A more general solution would be to provide list views. 
> 
> Another solution might be to use arrays rather than lists. Since your 
> sieve list is homogeneous, you could possibly use an array of 1 or 0 
> bytes rather than a list of True or False bools. That would reduce the 
> memory overhead by a factor of four, and similarly reduce the overhead of 
> any copying:

I did a lot of work comparing the overall performance of the sieve when
using either lists or arrays and I found that lists were a lot faster
for the majority use case when the sieve is not large.

   Brian




More information about the Python-list mailing list