List Count

Tue Apr 23 15:16:51 EDT 2013

On 23 April 2013 19:30, Blind Anagram <blindanagram at nowhere.org> wrote:
> On 23/04/2013 18:45, Oscar Benjamin wrote:
>
> [snip]
>> You keep mentioning that you want it to work with a large sieve. I
>> would much rather compute the same quantities with a small sieve if
>> possible. If you were using the Lehmer/Meissel algorithm you would be
>> able to compute the same quantity (i.e. pi(1e9)) using a much smaller
>> sieve with 30k items instead of 1e9. that would fit *very* comfortably
>> in memory and you wouldn't even need to slice the list. Or to put it
>> another way, you could compute pi(~1e18) using your current sieve
>> without slicing or paging. If you want to lift the limit on computing
>> pi(x) this is clearly the way to go.
>
> If prime_pi for huge numbers was really important to me I wouldn't be
> using Python!

I would, at least to begin with. The advantage that Python has for
this sort of thing is that it takes a minimal amount of programmer
effort to implement these kind of algorithms. This is because you
don't have to worry about nuts and bolts problems like integer
overflow and memory allocation. You also have an abundance of
different data structures to fulfil the big-O requirements of most
algorithms. It's easy to make generators that can iterate over
infinite or arbitrarily large sequences while still using constant
memory. And so on...

To implement the algorithm I mentioned in e.g. C would be considerably
more work. C would, however, perform much better using your brute
force approach and would lift the memory constraints of your program
by a small (in asymptotic terms) amount.

So I actually find that I can often get a faster program in Python
simply because it's much easier to implement fancier algorithms with
the optimal asymptotic performance that will ultimately outperform a
brute force approach whichever language is used.

Although, in saying that, those particular algorithms are probably
most naturally implemented in a functional language like Haskell with
scalable recursion.

>>> In other words, the lack of a list.count(value, limit) function makes
>>> Python less effective than it would otherwise be.  I haven't looked at
>>> Python's C code base but I still wonder if there a good reason for NOT
>>> providing this?
>>
>> If you feel that this is a good suggestion for an improvement to
>> Python consider posting it on python-ideas. I wasn't aware of the
>> equivalent functionality on strings but I see that the tuple.count()
>> function is the same as list.count().
>
> To be honest, I am not really in a position to judge whether this is a
> 'good' suggestion.  It has turned up as potentially useful in two cases
> for me but one of these (the one discussed here) is a minority use case.
>  I would also be a bit worried about launching this into a group of
> Python experts who will almost certainly be even more demanding of
> explanations than you folk here!

I wouldn't worry about that. I've never felt the need for this but
then I would probably use numpy to do what you're doing. It's
certainly not an outlandish suggestion and I'd be surprised if no one
agreed with the idea.

Oscar