List Count

Peter Otten __peter__ at web.de
Mon Apr 22 09:22:20 EDT 2013


Blind Anagram wrote:

> I would be grateful for any advice people can offer on the fastest way
> to count items in a sub-sequence of a large list.
> 
> I have a list of boolean values that can contain many hundreds of
> millions of elements for which I want to count the number of True values
> in a sub-sequence, one from the start up to some value (say hi).
> 
> I am currently using:
> 
>    sieve[:hi].count(True)
> 
> but I believe this may be costly because it copies a possibly large part
> of the sieve.
> 
> Ideally I would like to be able to use:
> 
>    sieve.count(True, hi)
> 
> where 'hi' sets the end of the count but this function is, sadly, not
> available for lists.
> 
> The use of a bytearray with a memoryview object instead of a list solves
> this particular problem but it is not a solution for me as it creates
> more problems than it solves in other aspects of the program.
> 
> Can I assume that one possible solution would be to sub-class list and
> create a C based extension to provide list.count(value, limit)?
> 
> Are there any other solutions that will avoid copying a large part of
> the list?

If the list doesn't change often, you can convert it to a string and use
str.count(), which accepts optional start and end arguments:

>>> items = [True, False, False] * 10
>>> sitems = "".join("FT"[i] for i in items)
>>> sitems
'TFFTFFTFFTFFTFFTFFTFFTFFTFFTFF'
>>> sitems.count("T", 3, 10)
3
>>> sitems.count("F", 3, 10)
4
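
Applied to the original question (counting True values from the start up
to hi), this could be wrapped up roughly as follows. It is only a sketch:
the helper names to_flag_string and count_true are made up for
illustration, and it assumes the sieve is rebuilt into a string only
occasionally, since the conversion itself walks the whole list.

```python
def to_flag_string(flags):
    """Encode an iterable of booleans as a string of 'T'/'F' characters."""
    # bool is a subclass of int, so it can index into "FT" directly.
    return "".join("FT"[bool(flag)] for flag in flags)


def count_true(flag_string, hi):
    """Count the 'T' characters in positions 0 .. hi-1 without slicing."""
    # str.count(sub, start, end) scans in place; no substring is created.
    return flag_string.count("T", 0, hi)


sieve = [True, False, False] * 10
sitems = to_flag_string(sieve)
print(count_true(sitems, 10))  # prints 4, same as sieve[:10].count(True)
```

The string has to be regenerated whenever the list changes, so this only
pays off when there are many counts per modification.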

Or you can call a[3:10].sum() on a boolean numpy array, whose slices are
views rather than copies:

>>> import numpy
>>> a = numpy.array([True, False, False]*10)
>>> a[3:10].sum()
3
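
For the "from the start up to hi" case, a minimal sketch along the same
lines is shown below. The variable names are illustrative, and
numpy.count_nonzero is used here as an alternative to .sum(); checking
the .base attribute is one way to confirm that the slice refers to the
original array's memory instead of a copy.

```python
import numpy

sieve = numpy.array([True, False, False] * 10)
hi = 10

# Basic slicing returns a view onto the same buffer, so no data is copied.
head = sieve[:hi]
print(head.base is sieve)  # True: the slice shares memory with sieve

# Count the True entries in the first hi positions.
count = int(numpy.count_nonzero(head))
print(count)  # prints 4
```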




