List Count

Blind Anagram blindanagram at nowhere.org
Tue Apr 23 03:00:50 EDT 2013


On 23/04/2013 00:28, Steven D'Aprano wrote:
> On Mon, 22 Apr 2013 22:25:50 +0100, Blind Anagram wrote:
> 
>> I have looked at solutions based on listing primes and here I have found
>> that they are very much slower than my existing solution when the sieve
>> is not large (which is the majority use case).
> 
> Yes. This is hardly surprising. Algorithms suitable for dealing with the 
> first million primes are not suitable for dealing with the first trillion 
> primes, and vice versa. We like to pretend that computer programming is 
> an abstraction, and for small enough data we often can get away with 
> that, but like all abstractions eventually it breaks and the cost of 
> dealing with real hardware becomes significant.
> 
> But I must ask, given that the primes are so widely distributed, why are 
> you storing them in a list instead of a sparse array (i.e. a dict)? There 
> are 50,847,534 primes less than or equal to 1,000,000,000, so you are 
> storing roughly 18 False values for every True value. That ratio will 
> only get bigger. With a billion entries, you are using 18 times more 
> memory than necessary.

Because the majority use case for my Prime class is a sieve that is not
large.  I am just pushing the envelope for a minority use case so that it
still works for huge sieves, albeit inefficiently.

I accept that it is inefficient, but the fact remains that I can produce
a sieve that can yield and count a billion primes in a reasonable time.
This fails, however, when I wish to count over only part of the sieve,
because the lack of a list.count(value, limit) function forces me to copy
a slice, which can double the memory requirement.
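(The best workaround I have found so far is to combine sum() with
itertools.islice, which iterates over the prefix lazily and never
materialises a slice.  It is slower than a C-level slice-and-count, but
it needs no extra memory.  A sketch, assuming the sieve is a list of
booleans; `count_up_to` is a hypothetical helper:)

```python
from itertools import islice

def count_up_to(sieve, limit):
    """Count True flags in sieve[:limit] without copying a slice.

    islice yields the first `limit` entries lazily, so no second copy
    of the sieve is created; sum() counts True as 1 and False as 0.
    """
    return sum(islice(sieve, limit))

# Flags for 0..29, with True marking a prime.
sieve = [i in (2, 3, 5, 7, 11, 13, 17, 19, 23, 29) for i in range(30)]
print(count_up_to(sieve, 10))   # primes below 10: 2, 3, 5, 7 -> 4
```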

I would not dream of doing this job by copying a slice in any other
language that I have used, so I was simply asking for advice to discover
whether this copy could be avoided whilst staying with the simple sieve
design.

Thank you for your input.

   Brian



